Voice control of electronic systems is not a new invention; however, until about 10 years ago its development was hindered by the lack of sufficiently powerful, dedicated digital signal controllers. Things are changing, though: with enough money in the market to create demand for voice-controlled applications, several semiconductor suppliers have started to target complete solutions of this kind through their research efforts. One of these companies is Freescale, and one of the product ranges destined to be the heart of such a solution is the 56800E Core 16-bit Digital Signal Controller Family (software development is supported by the CodeWarrior Development Tools).
Improved control of a device with voice commands is now possible using a hybrid architecture that combines a microcontroller (MCU) with a digital signal processor (DSP). The digital processing capabilities of a hybrid MCU allow voice control to reach embedded systems: practically any new product containing a hybrid MCU can be controlled by voice. A system for controlling lights and heating can be generalized to control any device with a voice command set. The software algorithms may be developed using well-proven hidden Markov models.
The standard solution proposed by Freescale is centered on the DSP56F805, whose roughly 32K words of program flash (31.5K × 16-bit, as listed below) are sufficient for the recognition algorithms this application requires. Other features worth mentioning are:
• Up to 40 MIPS at 80MHz core frequency
• DSP and MCU functionality in a unified, C-efficient architecture
• Hardware DO and REP loops
• MCU-friendly instruction set supports both DSP and controller functions: MAC, bit manipulation unit, 14 addressing modes
• 31.5K × 16-bit words (64KB) Program Flash
• 512 × 16-bit words (1KB) Program RAM
• 4K × 16-bit words (8KB) Data Flash
• 2K × 16-bit words (4KB) Data RAM
• 2K × 16-bit words (4KB) Boot Flash
• Up to 64K × 16-bit words (128KB) each of external Program and Data memory
• Two 6-channel PWM Modules
• Two 4-channel, 12-bit ADCs
• Two Quadrature Decoders
• CAN 2.0 B Module
• Two Serial Communication Interfaces (SCIs)
• Serial Peripheral Interface (SPI)
• Up to four General Purpose Quad Timers
• JTAG/OnCE™ port for debugging
• 14 Dedicated and 18 Shared GPIO lines
• 144-pin LQFP Package
The declared aim of the application is to provide a reference architectural design for controlling home appliances via voice commands. Speaker dependence is one of the main aspects to consider in speech recognition. The Freescale system allows speaker dependence to be tuned: it can respond only to commands pre-registered by a single user (who would have to record his or her own pronunciation of words such as “light”, “dark”, “heat” and “cold”, plus digits, time and temperature, in a given format), or to general voice commands spoken by anybody in the room. The general view in consumer electronics, however, is that making a system speaker dependent has more advantages than disadvantages.
As mentioned before, the core of the system is a DSP56F805, around which other components are added so that all system-controlling functions are accessible by voice command, manual switch or keypad input (useful in a noisy environment). Each microphone used for voice detection has a corresponding switch and lamp unit, which creates the need for an arbitration process (easily implemented in software). A block diagram of the speech recognition system is available below.
The DSP continuously samples all analog-to-digital converter (ADC) inputs. The associated software process determines whether speech is present on an input, and the DSP then assigns the corresponding device. The ADC on the DSP56F805 has two modules, each multiplexed to four pins, for a total of eight analog signal sources. Optionally, one ADC pin can be connected to a phone through a subscriber line interface circuit (SLIC), and one pin is connected to a temperature sensor. The remaining six pins can be connected to microphones. The ADC resolution is 12 bits and the maximum sampling frequency is 800 kHz, with all eight ADC channels sampled in time-multiplexed fashion. To minimize memory requirements, a sampling frequency of 8 kHz is recommended for all speech channels.
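The detection and arbitration steps described above can be sketched in C. This is only an illustration under stated assumptions, not the code of the Freescale reference design: the 20 ms frame length, the mid-scale value of 2048, the right-justified 12-bit sample layout, the energy threshold and the function names `frame_energy` and `select_active_mic` are all hypothetical. The idea is to compute the mean squared amplitude of each microphone's frame and then serve the loudest microphone whose energy exceeds a threshold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_RATE_HZ 8000u  /* recommended rate for speech channels */
#define FRAME_MS       20u    /* assumed analysis frame length */
#define FRAME_LEN      (SAMPLE_RATE_HZ * FRAME_MS / 1000u)  /* 160 samples */
#define NUM_MICS       6      /* six ADC pins available for microphones */
#define ADC_MIDSCALE   2048   /* assumed zero level of a 12-bit sample */

/* Mean squared amplitude of one frame.
 * Assumes right-justified, unsigned 12-bit samples (hypothetical layout). */
uint32_t frame_energy(const uint16_t *samples, size_t n)
{
    assert(samples != NULL && n > 0);
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        /* Remove the DC offset so silence yields an energy near zero. */
        int32_t s = (int32_t)(samples[i] & 0x0FFF) - ADC_MIDSCALE;
        acc += (uint64_t)(s * s);
    }
    return (uint32_t)(acc / n);
}

/* Simple arbitration: returns the index of the loudest microphone whose
 * frame energy exceeds `threshold`, or -1 if none does.
 * Ties are resolved in favour of the lowest channel index. */
int select_active_mic(const uint32_t energy[NUM_MICS], uint32_t threshold)
{
    assert(energy != NULL);
    int winner = -1;
    uint32_t best = threshold;
    for (int ch = 0; ch < NUM_MICS; ch++) {
        if (energy[ch] > best) {
            best = energy[ch];
            winner = ch;
        }
    }
    return winner;
}
```

At 8 kHz a 20 ms frame holds 160 samples, so one frame per microphone fits comfortably in the 2K words of data RAM listed earlier. A real system would probably add smoothing over several frames before declaring that speech has started.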
As an option, heating can be controlled by phone. The recognition process is the same. If the speech reference is recognized, the DSP returns it as an audio signal to confirm the validity of the recognition. The DSP is connected to the phone through a SLIC. The internal ADC is used on the input side and an external digital-to-analog converter (DAC) is used on the output side. The DAC and the DSP are connected through a serial peripheral interface (SPI).
One of the special characteristics of the solution is that the parameters used for speech recognition can be changed in real time, through the facilities provided by the SDK.
As you can imagine, a speech recognition system (a voice control application is just a particular instance of speech recognition) has quite a complex software component. Running the voice command recognition algorithm is probably the most complex of its tasks. This is done mainly with algorithms based on hidden Markov models, which were initially developed in the late 1960s and have been used in speech recognition applications since the mid-1970s. Here is an illustrated summary of the activities performed by the software component of the application:
Although the reference design specifically targets home appliance control, the same generic principles may be applied to any device that needs to be controlled in this manner.
Freescale is one of the dedicated automotive suppliers, and voice-controlled vehicle features are the next big trend in premium cars. I came across Audi's advertising page by chance; it makes no secret of the powerful voice control system that it offers its customers.
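The following is not taken from the reference design; it is a minimal sketch of how a hidden Markov model can score an utterance, using the Viterbi algorithm on integer log-scores. The model size (3 states, 4 observation symbols), the type `hmm_t`, the constant `LOG_ZERO` and the function names are assumptions made for illustration. A recognizer built this way would typically hold one model per command word and pick the word whose model gives the highest score.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_STATES  3               /* hypothetical model size */
#define N_SYMBOLS 4               /* hypothetical quantized feature codes */
#define LOG_ZERO  (INT32_MIN / 2) /* stands in for log(0) */

/* All scores are log-probabilities in arbitrary integer units, so <= 0. */
typedef struct {
    int32_t log_init[N_STATES];
    int32_t log_trans[N_STATES][N_STATES];  /* [from][to] */
    int32_t log_emit[N_STATES][N_SYMBOLS];  /* [state][symbol] */
} hmm_t;

/* Adds two log-scores, clamping at LOG_ZERO so "impossible" stays impossible. */
static int32_t log_add(int32_t a, int32_t b)
{
    int64_t s = (int64_t)a + (int64_t)b;
    return (s < LOG_ZERO) ? LOG_ZERO : (int32_t)s;
}

/* Returns the log-score of the single best state path for obs[0..T-1]. */
int32_t viterbi_score(const hmm_t *m, const uint8_t *obs, size_t T)
{
    assert(m != NULL && obs != NULL && T > 0);
    int32_t prev[N_STATES], cur[N_STATES];

    assert(obs[0] < N_SYMBOLS);
    for (int j = 0; j < N_STATES; j++)
        prev[j] = log_add(m->log_init[j], m->log_emit[j][obs[0]]);

    for (size_t t = 1; t < T; t++) {
        assert(obs[t] < N_SYMBOLS);
        for (int j = 0; j < N_STATES; j++) {
            /* Best predecessor for state j at time t. */
            int32_t best = LOG_ZERO;
            for (int i = 0; i < N_STATES; i++) {
                int32_t cand = log_add(prev[i], m->log_trans[i][j]);
                if (cand > best)
                    best = cand;
            }
            cur[j] = log_add(best, m->log_emit[j][obs[t]]);
        }
        for (int j = 0; j < N_STATES; j++)
            prev[j] = cur[j];
    }

    int32_t best = prev[0];
    for (int j = 1; j < N_STATES; j++)
        if (prev[j] > best)
            best = prev[j];
    return best;
}
```

Working in the log domain turns the products of probabilities into additions, which suits a fixed-point controller and avoids numerical underflow on long utterances.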