How to process audio WAV files in embedded systems

Let's see how we can process audio WAV files in embedded systems.
A little while ago I was asked to build a small circuit based around a microcontroller and a Digital to Analog Converter (DAC) that would output various sounds in response to different stimuli. These stimuli would be commands received by the microcontroller via UART, and the output sounds would simulate a switch, a closing door, or other ambient sounds from a working environment. Building the hardware circuit was not that difficult (see Figure 1 for the block diagram), although I had to put serious effort into keeping the cost of the BOM as low as possible. But it was the first time I was faced with outputting sound, so I had to learn.

The first obstacle I encountered was deciding what the output sounds should actually sound like. The description “I need a sound that will simulate the pressing of a button” is good enough, but how do you really make it? Fortunately, there are many websites providing WAV files for use in PC applications. I came across these by accident, but I was happy to find various sounds that fitted the description: Soundjay, Pachd, Soundsnap.

Once I found the “digital” sounds that I needed, I realized that the major problem I was facing was the lack of memory. All the sounds on these websites are relatively good quality, meaning they are sampled at 44.1 kHz and each sample is coded on 16 or 32 bits. Of course, this does not pose any problem for a PC, where you have plenty of memory available. Unfortunately for my poor “embedded system”, a short good-quality mono sound lasting about 200 ms (44.1 kHz with 16 bits/sample) needed 17640 bytes of on-chip ROM. Due to cost constraints, I could not use any memory chip external to the microcontroller, but a single sound took more than the entire program memory available (16 KB)! Clearly, a trade-off had to be made. What I soon realized was that for such simple sounds, which are far from music or even speech, there was no need for such good quality. I could go lower than a 44.1 kHz sampling frequency, and even 16 bits/sample seemed too much: 8 bits/sample was my target.
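The storage figure above is simply sample rate × duration × bytes per sample. Here is a minimal C sketch of that arithmetic, assuming a mono, uncompressed clip; the function name pcm_clip_bytes is made up for illustration and is not part of my original program.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to store an uncompressed mono PCM clip.
   bits_per_sample is assumed to be a whole number of bytes (8, 16, 32). */
uint32_t pcm_clip_bytes(uint32_t sample_rate_hz, uint32_t duration_ms,
                        uint32_t bits_per_sample)
{
    assert(bits_per_sample % 8u == 0u);
    /* 64-bit intermediate so rate * duration cannot overflow */
    uint32_t samples =
        (uint32_t)(((uint64_t)sample_rate_hz * duration_ms) / 1000u);
    return samples * (bits_per_sample / 8u);
}
```

Calling pcm_clip_bytes(44100, 200, 16) reproduces the 17640 bytes quoted above, and the same function can be re-run with a lower rate and bit depth to see how much a trade-off saves.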
Performing the calculations again, I was happy to find that a 200 ms sound sampled at 20 kHz with 8 bits/sample could be stored in a mere 4000 bytes, which meant my 16 KB program memory was enough for three sounds and the program itself. But so far all this was theory. All I had were these computations and some WAV files. The real challenge (for me at least, being a hardware engineer rather than a software developer) was to turn a WAV file into a C array to be used in the program running on the micro, something like:

unsigned char sound_samples[4000]={ 55, 66, 81, 83, 76, 66, 63, 62, 66, 62, 62, 82, 91, 85, 77, 70, 67, 68, 75, 80, 77, 79, 76, 74, 70, 63, 67, 70, 67, 65, 73, 71, 71, 72, 70, 61, 64, 59, 54, 62, 71, 73, 67, 67, 76, 81, 81, 70, …55, 43};

I also needed to translate the WAV file from a dual polarity signal centered on 0V into a positive signal, with the smallest sample value corresponding to 0V. I will describe the processing that is required to do this, organized in two steps:

Step 1 – processing the WAV file itself

First I had to convert the WAV file from 44.1 kHz to 20 kHz and for this I used one of the tools available online called Cool Edit. There are several trial versions available out there but the newer ones do not allow saving the modified file, so I had to turn to an older version of this tool, called Cool Edit 2000. An open source tool that might allow you to perform the same processing, in case you need it, is Audacity, available at: Audacity.Sourceforge

I will not detail the usage of these tools, as it is pretty straightforward: you launch the tool, open the WAV file, cut it to 200 ms, convert it from stereo to mono, change the sampling frequency (from the top menu choose Tracks > Resample and in the text box that opens enter the new sampling frequency that you need, 20 kHz in my case) and then save the new WAV file. If you play the new WAV file on the PC, you will be able to spot a small degradation in the quality of the sound. Once this step was completed, I had the WAV file just as I wanted it. Now I had to extract the samples from it!

Step 2 – Extracting the audio samples from the WAV file

This was a real problem, mainly because there is no available tool to do this (not any that I know of, anyway). So I had to write my own software that would do that and I had to do some reading about the WAV file format. Fortunately, none of these tasks is rocket science. The WAV file format is pretty simple:

The first 44 bytes in the WAV file contain a lot of information, most of which is of no use to us right now. Basically, these first 44 bytes form a header indicating (amongst other things): the sampling frequency (but we already know that: 20 kHz), the number of bits per sample (we know that too: 16 bits/sample), and the length of the “Data” section, which tells us how many samples are in the file (we know that too: 4000 samples, because I cut the file to 200 ms in the audio editing tool at step 1). So what I really needed from this file was the data (all 4000 samples), which starts at byte offset 44, right after the header. For this, I had to write my own code in the C# programming language, using a free (and open source too!) development environment known as SharpDevelop and available under: Codeplex
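For readers who would rather check the header than trust it, here is a small C sketch that reads the fields mentioned above. It assumes the canonical 44-byte PCM layout ("RIFF", "WAVE", "fmt " and "data" tags at offsets 0, 8, 12 and 36); files that carry extra chunks have a longer header and are rejected here. The names wav_info_t and wav_parse_header are made up for illustration and do not come from my C# program.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WAV_HEADER_SIZE 44u   /* canonical PCM header only (assumption) */

typedef struct {
    uint16_t channels;
    uint32_t sample_rate_hz;
    uint16_t bits_per_sample;
    uint32_t data_bytes;      /* length of the "data" section in bytes */
} wav_info_t;

/* WAV fields are stored little-endian */
static uint16_t rd_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is not a canonical 44-byte header */
int wav_parse_header(const uint8_t *hdr, size_t len, wav_info_t *out)
{
    assert(hdr != NULL && out != NULL);
    if (len < WAV_HEADER_SIZE) return -1;
    if (memcmp(hdr, "RIFF", 4) != 0 || memcmp(hdr + 8, "WAVE", 4) != 0)
        return -1;
    if (memcmp(hdr + 12, "fmt ", 4) != 0 || memcmp(hdr + 36, "data", 4) != 0)
        return -1;

    out->channels        = rd_le16(hdr + 22);
    out->sample_rate_hz  = rd_le32(hdr + 24);
    out->bits_per_sample = rd_le16(hdr + 34);
    out->data_bytes      = rd_le32(hdr + 40);
    return 0;
}
```

For the file prepared in step 1, this should report 1 channel, 20000 Hz, 16 bits/sample and 8000 data bytes, that is, 4000 samples of two bytes each.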

Even if you are not a professional programmer, C# allows you to easily manipulate text and binary files. The program I wrote [see the attachment] takes a WAV file as input and generates the C array that I need. It reads each sample from the WAV file, adds a fixed amount to each sample (so that all samples become positive numbers), scales it down from 16 bits to 8 bits (a simple division by 256) and then creates a text file which contains the array in the standard C form:

unsigned char sound_samples[4000]={ 55, 66, 81, 83, 76, 66, 63, 62, 66, 62, 62, 82, 91, 85, 77, 70, 67, 68, 75, 80, 77, 79, 76, 74, 70, 63, 67, 70, 67, 65, 73, 71, 71, 72, 70, 61, 64, 59, 54, 62, 71, 73, 67, 67, 76, 81, 81, 70, …55, 43};
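The per-sample processing described above (add a fixed amount, then divide by 256) can be sketched in C as follows. This is not a translation of the attached C# program: the offset of 32768 (half of the 16-bit range) is my assumption for the "fixed amount", and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shift a signed 16-bit sample into 0..65535, then keep the top 8 bits.
   The 32768 offset is an assumption; dividing by 256 is a shift by 8. */
uint8_t pcm16_to_u8(int16_t s)
{
    uint16_t shifted = (uint16_t)((int32_t)s + 32768);
    return (uint8_t)(shifted >> 8);
}

/* Convert the raw little-endian bytes of a 16-bit "data" section into
   8-bit unsigned samples. Returns the number of samples written. */
size_t wav_data_to_u8(const uint8_t *data, size_t data_bytes,
                      uint8_t *out, size_t out_cap)
{
    assert(data != NULL && out != NULL);
    size_t n = data_bytes / 2u;
    if (n > out_cap) n = out_cap;

    for (size_t i = 0; i < n; i++) {
        /* rebuild the signed value portably from two bytes */
        int32_t v = data[2u * i] | (data[2u * i + 1u] << 8);
        if (v >= 32768) v -= 65536;
        out[i] = pcm16_to_u8((int16_t)v);
    }
    return n;
}
```

With this offset, digital silence (0) maps to 128, the most negative sample maps to 0 and the most positive one to 255, so the output is always a positive number that fits in an unsigned char.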

The effect of this processing can easily be illustrated with an Excel graph showing the sound signal before and after processing. The graph shows in blue the raw samples (centered on 0), and in yellow the samples translated to a positive interval (by adding a fixed amount).

So, after dividing the translated samples by 256 (to convert them to 8 bit/sample) the C# program simply generates a text file containing the C array.
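The last step, turning the converted samples into text, can be sketched in C as well. The attached program is written in C# and writes a file; this illustration formats the array into a character buffer instead, which the caller can then write out with fputs. The function name format_c_array and the exact spacing are my own choices for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format `count` samples as a C array definition into `buf`.
   Returns the text length, or -1 if the buffer is too small
   or the array name is empty. */
int format_c_array(char *buf, size_t cap, const char *name,
                   const uint8_t *samples, size_t count)
{
    assert(buf != NULL && name != NULL && samples != NULL);
    if (strlen(name) == 0 || cap == 0) return -1;

    int n = snprintf(buf, cap, "unsigned char %s[%zu] = {", name, count);
    if (n < 0 || (size_t)n >= cap) return -1;
    size_t used = (size_t)n;

    for (size_t i = 0; i < count; i++) {
        /* first value gets a leading space, the rest a comma */
        n = snprintf(buf + used, cap - used, "%s%u",
                     (i == 0) ? " " : ", ", (unsigned)samples[i]);
        if (n < 0 || (size_t)n >= cap - used) return -1;
        used += (size_t)n;
    }

    n = snprintf(buf + used, cap - used, " };\n");
    if (n < 0 || (size_t)n >= cap - used) return -1;
    used += (size_t)n;
    return (int)used;
}
```

For 4000 samples the buffer needs roughly five characters per value, so about 20 KB plus the header line is a safe size on the PC side.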

Outputting the sound to the speaker

Once this processing was completed, I was ready to embed this array in the C program that I wrote for the microcontroller. So far I had only theory and computer processing, and I had yet to hear a sound from the speaker of the embedded system. The microcontroller program was not so difficult to write, as I was somewhat familiar with this device. A 20 kHz sampling frequency means a sample has to be sent to the DAC every 50 µs. This is more than enough time to write one or two bytes via SPI (8 MHz clock frequency), and after a short effort I was delighted to hear from the speaker the sound that simulates the “pressing of a button”!
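To make the timing concrete, here is a hardware-independent C sketch of the playback logic. It is not my actual firmware: the timer and SPI setup depend on the microcontroller, so the DAC write is passed in as a function pointer, and dac_write_stub is only a stand-in that records the value so the logic can be tried on a PC. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*dac_write_fn)(uint8_t sample);  /* hypothetical driver hook */

typedef struct {
    const uint8_t *samples;
    size_t length;
    size_t pos;
    dac_write_fn write;
} player_t;

/* Stand-in for a real SPI DAC driver: just remembers the last value */
uint8_t stub_last_sample;
void dac_write_stub(uint8_t sample) { stub_last_sample = sample; }

/* Sample period in microseconds for a given sampling frequency */
uint32_t sample_period_us(uint32_t rate_hz)
{
    assert(rate_hz != 0u);
    return 1000000u / rate_hz;
}

/* Time in nanoseconds to clock `bits` out over SPI */
uint32_t spi_transfer_ns(uint32_t bits, uint32_t clock_hz)
{
    assert(clock_hz != 0u);
    return (uint32_t)(((uint64_t)bits * 1000000000u) / clock_hz);
}

void player_start(player_t *p, const uint8_t *samples, size_t length,
                  dac_write_fn write)
{
    assert(p != NULL && samples != NULL && write != NULL);
    p->samples = samples;
    p->length = length;
    p->pos = 0;
    p->write = write;
}

/* Call once per sample period, e.g. from a timer interrupt.
   Returns 1 while a sample was sent, 0 once the clip is finished. */
int player_tick(player_t *p)
{
    assert(p != NULL);
    if (p->pos >= p->length) return 0;
    p->write(p->samples[p->pos++]);
    return 1;
}
```

With these helpers, sample_period_us(20000) gives the 50 µs period and spi_transfer_ns(16, 8000000) gives 2000 ns for two bytes, which shows how much margin there is in each period.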

Repost: Oct 1st, 2008

Attachment: WAV_Extract.zip (16.09 KB)

Audio wave files

I had the same kind of problem some time ago.
While working on it I found that converting the WAV to RAW helped a little: it removes the header, leaving only the audio data, thus saving a bit of memory and simplifying your filtering/conversion program.
I also used an 8-bit, 22 kHz format. The conversion was done with the usual program provided with Sound Blaster audio boards (was it Wave Studio?).
So the only conversion was from raw data to a C array declaration (as in your example).
I was lucky enough to work with a Renesas (formerly Hitachi) H3048F equipped with an eight-bit DAC and DMA, so I could play messages while running the main program loop without blocking the system.
I still love this micro, it is very versatile, swiss army knife like. (Bit expensive though)
Anyway, it is fun to hear your microcontroller talk....

WAV or WAVE..

WAV is an IBM and Microsoft audio file format standard for storing an audio bitstream on your PC.

WAV is uncompressed audio

WAV is an uncompressed audio format that requires more storage space than popular compressed audio formats like MP3, AAC and WMA. The equivalent uncompressed video format is Sony's D-1, while examples of compressed video formats are MPEG-2, MPEG-4, AVI and WMV.

Cheapest way to playback sound effects

Once I worked on a similar project, this article would have been very useful!!!!

One thing that I did learn while trying to play back audio is that the cheapest way to do this is to convert the wave file to 1-bit format. This might not give the best quality, but you can play it back on the cheapest microcontrollers just by controlling an I/O pin. There is a free program made by Roman Black that will convert a wave file for you (http://www.romanblack.com/picsound.htm). I thought that this technique would have horrible sound quality, but it was actually reasonable and sufficient for my application.

Cheapest way to playback sound effects

Hey, thank you for the suggestion!
Unfortunately, I was not really aware of the 1-bit format at the time of making this project.

But I have had a more thorough look at it, now that I am. I think, though, that I might have needed a faster micro. My sampling frequency would have had to increase anyway: right now it is determined by how fast I can send a byte via SPI, but with the 1-bit format it would have been determined by how fast I could toggle a micro pin.

And I would have been able to ditch the DAC, which would have saved me cost on the BOM (the cost of components was a big thing in this project).

So, thanks again John!

Cristian

RE: Cool project, Going to give it a try....

Rich,
The application available is only to be used with a 16-bit WAV file. I believe at the time I opted for this input format because I did not like the way my audio editor scaled samples down from 16 to 8 bits. If you use it with an 8-bit WAV file, I would say the behavior of the application is... unpredictable. I am actually surprised that it did not crash...

Regarding the way you output to a 12-bit DAC, it is not complicated: you perform a 4-bit left shift inside the microcontroller, so the 4 LSBs that you send to the DAC will always be 0000.

As an example: if the sample stored in the memory is 0xB5, you will send to the DAC two 8-bit values: the first one would be 0x0B and the second would be 0x50. The first 4 bits of 0x0B are ignored by the DAC.

Hope this helps,
Cristian

RE: thanks for the shareware bud

Hey man,
Glad you will find it useful; please let me know how you get along with it. Please note my comment above and take care to use the application with a 16-bit wav file input, otherwise the output will not be as expected.

Best regards,
Cristian

Great information

Hi Cristian,

Thanks for sharing this. It does help in clearing some of my doubts with my similar project.

However, I failed to use your C# program to convert the wave file to a C array, even though I followed your instructions. I found that some of the samples are actually negative numbers (with a '-' sign) after doing the conversion with your program. Take C:\Windows\Media\tada.wav or http://www.nch.com.au/acm/11k16bitpcm.wav as an example.

Also, there is no scaling down from 16 bit to 8 bit as far as I understand from your program source code.

I would appreciate it if you could explain this.

Thanks.

Eric
