linux-audio-dev: RE: [linux-audio-dev] Re: Speech analysis

From: Lee Revell <rlrevell@email-addr-hidden-job.com>
Date: Mon Jun 06 2005 - 21:10:59 EEST

On Mon, 2005-06-06 at 10:43 -0700, Brad Arant wrote:
> Many of the latest speech recognition innovations use neural network
> technology with back propagation for training and learning. They can be
> trained to recognize a wide range of voice types and can detect words strung
> together into normal speech. The input to the neural net is a formant
> analysis using FFT to create the harmonic pattern. With proper arrangement it
> will even accommodate variances in the speed of speech as well as whether the
> voice is male or female. It can also return a signal of the inflections made
> by the speaker.
>
> It is an item that has been studied for years in the computer science realm
> and there is no quick solution to do it well.
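
For anyone following along, here is a rough sketch of the kind of FFT front
end described above: take one short frame of audio, window it, and pick the
strongest peaks of the magnitude spectrum as a crude feature vector that
could be fed to a network. The function name spectral_peaks and its
parameters are made up for illustration, and plain peak picking is only a
stand-in for real formant analysis, which usually relies on LPC or cepstral
smoothing rather than the raw spectrum.

```python
import numpy as np

def spectral_peaks(frame, sample_rate, n_peaks=3):
    """Return the frequencies (Hz) of the n_peaks strongest local maxima
    in the magnitude spectrum of one audio frame, sorted ascending.

    Hypothetical helper: a crude stand-in for formant analysis.
    """
    # Taper the frame so the frame edges leak less energy across bins.
    windowed = frame * np.hanning(len(frame))
    magnitude = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # A bin is a local maximum if it is larger than both neighbours.
    interior = magnitude[1:-1]
    is_peak = (interior > magnitude[:-2]) & (interior > magnitude[2:])
    peak_bins = np.nonzero(is_peak)[0] + 1

    # Keep the n_peaks strongest local maxima.
    order = np.argsort(magnitude[peak_bins])[::-1][:n_peaks]
    return np.sort(freqs[peak_bins[order]])
```

Calling this on successive overlapping frames (say 512 samples at 16 kHz)
gives one small vector per frame. The frequency resolution is limited to
sample_rate / len(frame), so the returned values are only as precise as the
bin spacing.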

Actually, the interesting work is in the perception / cognitive
psychology area. Once you have this, the CS side is pretty simple.

Lee
Received on Tue Jun 7 00:15:10 2005
