Hands-free Speech Telecommunication

By Kalpesh Chauhan


The aim of hands-free speech telecommunication

Hands-free speech telecommunication provides what we are already used to with conventional telephones, but without any restriction upon one's physical location. Instead of bringing ourselves to a telephone handset, we bring the elements of the handset to ourselves, namely a microphone for our voice and a loudspeaker for our ears. Such a solution allows for a much more natural means of communication, and also allows free movement within one's environment while still holding a normal conversation. That is, we must provide full-duplex speech communication, in which both parties can talk and listen at the same time, in contrast to the half-duplex speech communication used in CB radio.

Why hands-free communication?

Apart from convenience, one of the main driving forces for continued research into hands-free telecommunication was the rapid growth of mobile radiotelephony. No longer was the telephone solely associated with the home or the office. One of the first environments to become widely populated by mobile radiotelephony was the car, allowing people to keep in touch while on the move, or while stuck in traffic. Due to the obvious dangers of holding a telephone in one hand and driving with the other, many of the developed countries either strongly recommended or legally enforced hands-free telephone operation in all moving vehicles. Thus, within research departments all over the world, the race to develop a hands-free telephone system providing the same quality of speech as conventional fixed telephones was further fuelled. One should note that, although hands-free operation is not mandatory there, other equally important uses exist for hands-free telecommunications, for example in offices, conference suites, and medical theatres.

The technological requirements

In terms of the technological requirements, the same basic problems need to be overcome regardless of the environment. Conventional telephones have the loudspeaker located very close to the ear, and thus the power output of the loudspeaker can be in milliwatts. Consequently the microphone picks up practically none of the loudspeaker output. Now consider fitting the cabin of a car with a loudspeaker to perform the same function as that in a telephone handset. Because of the increase in distance between the loudspeaker and the ear, we need sufficient power output to overcome any background noise, such as that generated by the tyres and the engine. Thus what is needed are watts rather than milliwatts. Now, as the microphone must also be fitted within the cabin of the car, we need to ensure that we alleviate the problem of feedback, as this makes communication very hard, if not impossible. If the microphones at both ends of a communication link detect their local loudspeaker output, the signal travels round the loop and is amplified on every pass, causing the howling associated with feedback. Use of highly directional microphones and sound-absorbing material greatly reduces the feedback problem; however, such microphones must be kept pointed at the speaker, which goes against the idea of hands-free communication. Thus the required solution necessitates removing the loudspeaker output from the microphone output.

At first thought, a simple solution would be to take the signal driving the loudspeaker and subtract it, at an appropriate amplitude, from the microphone output signal. However, this approach assumes that the microphone detects exactly the same signal as that output by the loudspeaker. This is fine if your environment happens to resemble an anechoic chamber, but problematic elsewhere.

Figure 1 - An initial proposal.
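
The weakness of this initial proposal can be illustrated with a minimal sketch. The echo paths, the gain of 0.5, and the use of random noise as a stand-in for far-end speech are all assumptions made for illustration; they are not taken from the source references.

```python
import random

def convolve(signal, impulse_response):
    """Direct-form FIR convolution, truncated to len(signal) samples."""
    out = [0.0] * len(signal)
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(impulse_response):
            if n - k < 0:
                break
            acc += h * signal[n - k]
        out[n] = acc
    return out

def residual_power(mic, loudspeaker, gain):
    """Mean power left after subtracting a scaled copy of the loudspeaker signal."""
    return sum((m - gain * x) ** 2 for m, x in zip(mic, loudspeaker)) / len(mic)

rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]  # stand-in for far-end speech

anechoic_path = [0.5]                     # direct path only (hypothetical)
echoic_path = [0.5, 0.0, 0.3, 0.0, -0.2]  # direct path plus two reflections (hypothetical)

# Subtract the loudspeaker signal at the "appropriate amplitude" of the direct path.
anechoic_residual = residual_power(convolve(far_end, anechoic_path), far_end, 0.5)
echoic_residual = residual_power(convolve(far_end, echoic_path), far_end, 0.5)

print("anechoic residual power:", anechoic_residual)
print("echoic residual power:  ", echoic_residual)
```

With only a direct path, the scaled subtraction removes the loudspeaker signal completely. As soon as reflections are present, the delayed copies remain in the microphone signal, however well the single gain is chosen.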

Instead, what is needed is a means of adjusting the microphone output signal, in relation to the acoustic echoes. However by considering the acoustic response of a simple room, one can soon realise the complexity of the problem. Considering a room as a system, with an input signal of speech, and an output signal consisting of acoustic echoes, we can think of the room as having an impulse response. Simply by drawing the curtains, or closing the door in the room, the impulse response is significantly altered.

Figure 2 - Car cabin responses with just one person. Above, with two hands on the wheel. Below, with one hand raised.

To complicate the problem even further, many different echo paths exist in a room, so a single impulse from the loudspeaker arrives at the microphone as several thousand separate echoes, each with its own delay and amplitude.

Figure 3 - The impulse response of an acoustic environment due to many echo paths
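
Treating the room as a system with an impulse response can be written down compactly. The notation below is an assumption introduced here for illustration: x(n) is the signal driving the loudspeaker, s(n) the local talker's speech, b(n) the background noise, and h_k(n) the k-th coefficient of the room impulse response at time n.

```latex
y(n) = s(n) + b(n) + \sum_{k=0}^{N-1} h_k(n)\, x(n-k)
```

Here y(n) is the microphone output and N is the number of coefficients needed to cover the echoes, which runs into the thousands. The dependence of h_k on n expresses the fact that the response changes whenever anything in the environment moves.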

Now as hands-free telecommunication imposes no restriction upon the location and direction of one's voice, the hands-free system is tremendously complicated by having an unknown and ever-changing location of the source signal. Town driving involves much head movement, and many objects can be placed within a car, illustrating that the acoustic characteristics can never be taken as constant. As if this wasn't enough, the acoustic characteristics of a room are also dependent upon the amplitude and frequency of the source signal, speech being highly variable in both aspects. Thus not only do we need acoustic-echo cancellation, we need it to be rapidly adaptive to an ever-changing acoustic environment, and signal source.
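
One common way of making an echo canceller adaptive is the normalised least-mean-squares (NLMS) filter. The sketch below is an illustration only: the source does not say which algorithm is used, and the echo path, step size, and filter length are hypothetical. The local talker is silent here, so the microphone contains echo alone.

```python
import random

def nlms_echo_canceller(far_end, mic, num_taps, step=0.5, eps=1e-8):
    """Return (error_signal, weights) from a normalised LMS adaptive filter."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        e = d - estimate  # echo-reduced signal sent back to the far end
        norm = sum(h * h for h in history) + eps
        scale = step * e / norm
        # Nudge every coefficient in the direction that reduces the error.
        weights = [w + scale * h for w, h in zip(weights, history)]
        errors.append(e)
    return errors, weights

rng = random.Random(1)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(3000)]  # stand-in for far-end speech
true_path = [0.5, 0.0, 0.3, 0.0, -0.2]                   # hypothetical echo path

# Microphone signal: the far-end signal filtered by the echo path.
mic = [
    sum(h * far_end[n - k] for k, h in enumerate(true_path) if n - k >= 0)
    for n in range(len(far_end))
]

errors, weights = nlms_echo_canceller(far_end, mic, num_taps=8)
early = sum(e * e for e in errors[:100]) / 100
late = sum(e * e for e in errors[-100:]) / 100
print("echo power, first 100 samples:", early)
print("echo power, last 100 samples: ", late)
```

The filter starts with no knowledge of the echo path and learns it from the signals themselves. If the path changes, the same update rule pulls the coefficients towards the new response, which is the property required of a hands-free system.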

Current hands-free technology

Existing telephones featuring hands-free operation usually offer only half-duplex speech communication, avoiding the feedback problem by disabling the microphone while the local loudspeaker is active. Such a solution is far from ideal, as it is significantly affected by background noise. The switching between the two telephones is performed using the amplitude of the microphone output signal, rather than by holding down a button as in CB radio. Thus, at any one time, the phone located in the louder environment is transmitting speech, while the other is reproducing it through its loudspeaker. Switching is supposed to be controlled by the voices of the speaking parties, but the many sources of background noise fool the telephones into switching. Such gain switching is therefore inadequate for use within an automotive environment.
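
The way background noise fools amplitude-based switching can be shown with a toy model. The function name and the per-frame levels are hypothetical, and a real telephone would add thresholds and hold times that are omitted here.

```python
def voice_switch(near_levels, far_levels):
    """Half-duplex switching: in each frame, the louder side gets to transmit."""
    return ["near" if near > far else "far" for near, far in zip(near_levels, far_levels)]

# Hypothetical per-frame microphone levels (arbitrary units).
far_speech = [0.6, 0.7, 0.6, 0.7, 0.6]  # far-end party is talking throughout
near_quiet = [0.1, 0.1, 0.1, 0.1, 0.1]  # near end: quiet office, nobody talking
near_car = [0.1, 0.9, 0.8, 0.1, 0.9]    # near end: car cabin with bursts of road noise

quiet_decisions = voice_switch(near_quiet, far_speech)
car_decisions = voice_switch(near_car, far_speech)

print(quiet_decisions)  # ['far', 'far', 'far', 'far', 'far']
print(car_decisions)    # ['far', 'near', 'near', 'far', 'near']
```

In the quiet office the far-end talker is heard without interruption. In the car, nobody at the near end says a word, yet the noise bursts take over the channel in three of the five frames and cut the far-end speech off.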

Looking to the future

One can thus deduce that an electronic rather than physical means of removing acoustic echoes is required. To adapt this rapidly to a changing environment, use is made of complex mathematical algorithms involving thousands of modelling coefficients. Considering that the adaptation must be performed in real time, a significant amount of processing power is required to keep the modelling coefficients of the algorithm matched to the environment. Tests in research labs around the world have shown very promising results using a range of algorithms, and with further refinement one can soon expect to see the availability of true hands-free communication.
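
A rough feel for the processing power involved can be had from a back-of-envelope calculation. The figures below are assumptions, not values from the source references: a telephone-band sampling rate of 8 kHz, 2000 coefficients, and an LMS-type algorithm costing about two multiply-accumulate operations per coefficient per sample (one to filter, one to update).

```python
def macs_per_second(num_coefficients, sample_rate_hz, macs_per_coefficient=2):
    """Multiply-accumulate operations needed each second to filter and adapt."""
    return num_coefficients * macs_per_coefficient * sample_rate_hz

def echo_tail_ms(num_coefficients, sample_rate_hz):
    """Length of echo, in milliseconds, that the coefficients can model."""
    return 1000.0 * num_coefficients / sample_rate_hz

load = macs_per_second(2000, 8000)
tail = echo_tail_ms(2000, 8000)

print(load, "multiply-accumulates per second")  # 32000000
print(tail, "ms of echo covered")               # 250.0
```

Under these assumptions the canceller needs 32 million multiply-accumulates every second to model a quarter of a second of echo. More elaborate algorithms that adapt faster generally cost more per coefficient.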

Source references

The hands-free telephone problem - An annotated bibliography, Eberhard Hansler, Signal Processing 27 (1992) 259-271, 1992 Elsevier Science Publishers B.V.

Acoustic echo canceller for hands-free mobile radiotelephony, Soren Holdt Jensen, Signal Processing VI : Theories and Applications, 1992 Elsevier Science Publishers B.V.

Enhancement of hands-free telecommunications, P.Naylor, J.Alcazar, J.Boudy, Y.Grenier, Freetel project (Esprit III), ANN. TELECOMMUN., 49 no 7-8, 1994.