Speech processing is one of the fastest-growing research areas in signal processing. Each year, billions of pounds are spent on supporting research in speech processing. The ultimate aim of this research is to provide interactive man-machine communication. Speech is a special communication medium: it conveys not only meaning but also the emotion of the speaker and individual information about the speaker.
During the past few years, the large volume of research and development in speech processing has brought changes to our everyday life. There are commercially available products based on Automatic Speech Recognition, Speaker Verification, Speaker Identification and Speech Synthesis. For example, the Compaq personal computer has a built-in speech processor which executes a restricted number of spoken voice commands. This advanced technology is based on the mechanisms involved in human speech production and perception. In this article, particular emphasis is given to speech production.
The vocal tract and vocal cords play a major role in speech production. The vocal tract consists of several organs and muscles which are regularly monitored and carefully controlled by the speech centres. This precise control is achieved by internal feedback in the brain. For example, auditory feedback helps us to ensure that we are producing the correct speech sounds and that they are of the correct intensity for the environment. Speech sounds are produced when air exhaled from the lungs causes either vibration of the vocal cords or turbulence at some point of constriction in the vocal tract. The shape of the vocal tract influences the sound harmonics. The way in which the vocal cords vibrate and the shape of the vocal tract are varied in order to produce the range of speech sounds with which we are familiar.
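The two excitation mechanisms described above can be illustrated with a minimal sketch. It is not taken from the article: it assumes that vocal cord vibration can be approximated by a periodic impulse train and that turbulence at a constriction can be approximated by white noise. The sample rate, the 100 Hz pitch and all function names are illustrative choices.

```python
import math
import random

SAMPLE_RATE = 8000  # samples per second (assumed for illustration)


def voiced_excitation(pitch_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Impulse train: a simplified stand-in for vocal cord vibration."""
    period = round(sample_rate / pitch_hz)
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def unvoiced_excitation(n_samples, seed=0):
    """White noise: a simplified stand-in for turbulence at a constriction."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def normalised_autocorrelation(signal, lag):
    """Correlation of the signal with a copy shifted by `lag`, relative to lag 0."""
    energy = sum(x * x for x in signal)
    shifted = sum(signal[n] * signal[n + lag] for n in range(len(signal) - lag))
    return shifted / energy


n = 800
voiced = voiced_excitation(100, n)   # 100 Hz pitch gives a period of 80 samples
unvoiced = unvoiced_excitation(n)

# A periodic source correlates strongly with itself one pitch period later;
# a noise source does not.
voiced_score = normalised_autocorrelation(voiced, 80)
unvoiced_score = normalised_autocorrelation(unvoiced, 80)
print(f"voiced: {voiced_score:.2f}, unvoiced: {unvoiced_score:.2f}")
```

The autocorrelation at one pitch period is high for the periodic source and close to zero for the noise source, which is one simple way of telling voiced sounds from unvoiced ones.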
From the linguistic point of view, the smallest unit of speech that signals a difference in meaning is the phoneme, normally written between slashes, for example /m/ in hum. In practice, the sound produced for an individual phoneme varies depending on where it appears in a word. Phoneme sets also differ between languages; for example, about 40 phonemes are sufficient to discriminate between all the sounds made in British English.
Phonemes are classified into six groups: vowels, diphthongs, semivowels, stop consonants, fricatives and affricates. The grouping is based on the way these sounds are produced. Each phoneme is characterised largely by a combination of its first three dominant formant frequencies, which arise when vibration of the vocal cords excites the resonances of the vocal tract. However, the formant frequencies vary considerably from speaker to speaker.
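The role of the first three formants can be sketched with a simple source-filter model. This is an illustration under stated assumptions, not a method described in the article: each formant is modelled by a standard second-order digital resonator, and the formant values (500, 1500 and 2500 Hz), the 100 Hz bandwidth, the sample rate and all function names are hypothetical choices made for the example.

```python
import cmath
import math

SAMPLE_RATE = 8000               # Hz (assumed)
FORMANTS_HZ = [500, 1500, 2500]  # illustrative neutral-vowel formants (assumed)
BANDWIDTH_HZ = 100               # resonance bandwidth (assumed)


def resonator_coefficients(freq_hz, bandwidth_hz, sample_rate=SAMPLE_RATE):
    """Feedback coefficients of y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, below 1
    theta = 2 * math.pi * freq_hz / sample_rate          # pole angle
    return 2 * r * math.cos(theta), -r * r


def apply_resonator(signal, a1, a2):
    """Run the signal through one second-order resonator."""
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesise_vowel(pitch_hz, n_samples):
    """Impulse-train source passed through three cascaded formant resonators."""
    period = round(SAMPLE_RATE / pitch_hz)
    signal = [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]
    for formant in FORMANTS_HZ:
        a1, a2 = resonator_coefficients(formant, BANDWIDTH_HZ)
        signal = apply_resonator(signal, a1, a2)
    return signal


def tract_gain(freq_hz):
    """Magnitude response of the resonator cascade at one frequency."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / SAMPLE_RATE)
    gain = 1.0
    for formant in FORMANTS_HZ:
        a1, a2 = resonator_coefficients(formant, BANDWIDTH_HZ)
        gain /= abs(1 - a1 * z_inv - a2 * z_inv * z_inv)
    return gain


vowel = synthesise_vowel(100, 800)  # 0.1 s of a vowel-like sound at 100 Hz pitch
for f in (500, 1000, 1500, 2000, 2500):
    print(f, "Hz gain:", round(tract_gain(f), 2))
```

The gain peaks at the three formant frequencies and dips between them. Changing the values in `FORMANTS_HZ` changes which vowel-like sound results, and shifting them slightly is a rough way to mimic the speaker-to-speaker variation noted above.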
Scientists and engineers understand the basic concepts behind the anatomy and physiology of speech production and perception. However, the lack of understanding of how the brain interacts with the vocal tract and the auditory apparatus prevents engineers from designing machines that can understand and speak like ordinary human beings.