WAVE Audio File Format

WAV (Waveform Audio File Format, also known as WAVE) is a commonly used audio file format for storing raw audio samples, especially on Windows. It is built on the generic RIFF (Resource Interchange File Format) container.

0. The Format

A wave file is a RIFF file whose type is “WAVE”. In its canonical form, the WAVE chunk consists of two subchunks, namely “fmt ” and “data”. (Note the trailing space in “fmt ”, which pads the ID to four bytes.) The table below lists the meaning of each field in a WAVE file.

| Section | Field | Length (bytes) | Endian | Note |
|---|---|---|---|---|
| RIFF chunk descriptor | Chunk Tag | 4 | big | “RIFF” |
| | Chunk Size | 4 | little | File size minus 8 (everything after this field) |
| | type | 4 | big | “WAVE” |
| fmt subchunk: format information about the audio data | subchunk id (fmt ) | 4 | big | “fmt ” |
| | subchunk size | 4 | little | Size of the rest of this subchunk (16 for PCM) |
| | audio format | 2 | little | 1 => PCM; other values => a different encoding, usually compressed |
| | num of channels | 2 | little | 1 => mono, 2 => stereo |
| | sample rate | 4 | little | 8000, 16000, 22050, 44100 etc. |
| | byte rate | 4 | little | sample rate * num of channels * bits per sample / 8 |
| | block align | 2 | little | num of channels * bits per sample / 8 |
| | bits per sample | 2 | little | 8 => 8 bits, 16 => 16 bits |
| Data subchunk: contains the raw audio data | subchunk id (data) | 4 | big | “data” |
| | subchunk size | 4 | little | num of samples * num of channels * bits per sample / 8 |
| | data | variable | little | Audio data |

Note that in this canonical PCM layout there are 44 bytes in total before the actual audio data: 12 (RIFF chunk descriptor) + 24 (fmt subchunk) + 8 (data subchunk id and size).
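To make the layout concrete, here is a minimal Python sketch that decodes those 44 bytes with the standard `struct` module. The function name `parse_wav_header` and the dictionary keys are my own choices for illustration, and the sketch assumes the canonical layout described above (a 16-byte PCM “fmt ” subchunk immediately followed by “data”); files with extra chunks would need a more general chunk walker.

```python
import struct

# Canonical 44-byte PCM header. "<" selects little-endian with no padding;
# the 4-byte tags are read as raw bytes ("4s") so their order is preserved.
HEADER_FORMAT = "<4sI4s4sIHHIIHH4sI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 12 + 24 + 8 = 44


def parse_wav_header(header: bytes) -> dict:
    """Decode a canonical 44-byte PCM WAVE header into a dict of fields."""
    if len(header) < HEADER_SIZE:
        raise ValueError("need at least 44 bytes of header")
    (chunk_tag, chunk_size, riff_type,
     fmt_id, fmt_size, audio_format, num_channels,
     sample_rate, byte_rate, block_align, bits_per_sample,
     data_id, data_size) = struct.unpack(HEADER_FORMAT, header[:HEADER_SIZE])

    if chunk_tag != b"RIFF" or riff_type != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    if fmt_id != b"fmt " or data_id != b"data":
        # Assumption of this sketch: only the plain fmt + data layout is handled.
        raise ValueError("not the canonical fmt + data layout")

    return {
        "chunk_size": chunk_size,
        "fmt_size": fmt_size,
        "audio_format": audio_format,
        "num_channels": num_channels,
        "sample_rate": sample_rate,
        "byte_rate": byte_rate,
        "block_align": block_align,
        "bits_per_sample": bits_per_sample,
        "data_size": data_size,
    }
```

To try it on a real file, read the first 44 bytes in binary mode (for example `open("test.wav", "rb").read(44)`) and pass them to `parse_wav_header`.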

1. Byte-by-Byte example

Below is a screenshot of a wave file shown in vim hex mode. We’ll go through the bytes one by one.

Figure 1. Bytes of a wave file recorded on Android 

5249 4646: RIFF

74d8 0400: read as little-endian, 0004 d874 = 317556 bytes. This is the size of everything after this field. I used “ls -l test.wav” to get the file size as 317564 bytes, which is equal to 317556 + 4 (size field) + 4 (RIFF field).

5741 5645: WAVE

666d 7420: “fmt ” (the final byte, 20, is the space)

1000 0000: 0000 0010 = 16 bytes, the size of the rest of the fmt subchunk.

0100: 0001, the audio format. 1 corresponds to PCM; values other than 1 indicate a different encoding, usually a compressed one.

0100: 0001, only one channel.

44ac 0000: 0000 ac44 = 44100, the sample rate is 44100Hz.

8858 0100: 0001 5888 = 88200, the byte rate = sample rate * number of channels * bits per sample / 8 = 44100 * 1 * 16 / 8 = 44100 * 2

0200: 0002, block align. 2 = number of channels * bits per sample / 8 = 1 * 16 / 8

1000: 0010 = 16, bits per sample

6461 7461: data

50d8 0400: 0004 d850 = 317520, the data size. 317520 + 44 (total header bytes) = 317564, which matches the total file size obtained using “ls -l”.
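The arithmetic in this walk-through can be double-checked with a few lines of Python. This is only a sketch: the byte groups are copied from the list above, the names `FIELDS` and `le` are hypothetical helpers, and the duration at the end is something I derive from the data size and byte rate rather than a value stored in the file.

```python
# Little-endian numeric fields from the walk-through, as shown in the hex dump.
FIELDS = {
    "chunk_size": "74d8 0400",
    "fmt_size": "1000 0000",
    "audio_format": "0100",
    "num_channels": "0100",
    "sample_rate": "44ac 0000",
    "byte_rate": "8858 0100",
    "block_align": "0200",
    "bits_per_sample": "1000",
    "data_size": "50d8 0400",
}


def le(hex_bytes: str) -> int:
    """Interpret a hex dump fragment as a little-endian unsigned integer."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "little")


values = {name: le(raw) for name, raw in FIELDS.items()}

# Derived quantities, using the formulas from the table in section 0.
bytes_per_frame = values["num_channels"] * values["bits_per_sample"] // 8
expected_byte_rate = values["sample_rate"] * bytes_per_frame
file_size_from_chunk = values["chunk_size"] + 8   # RIFF tag + size field
file_size_from_data = values["data_size"] + 44    # canonical header length
duration_seconds = values["data_size"] / values["byte_rate"]

for name, value in values.items():
    print(f"{name:16s} {value}")
print("block align matches:", bytes_per_frame == values["block_align"])
print("byte rate matches:  ", expected_byte_rate == values["byte_rate"])
print("file sizes agree:   ", file_size_from_chunk == file_size_from_data)
print("duration (s):       ", duration_seconds)
```

Both size calculations give 317564 bytes, the figure reported by “ls -l”, so the header fields are consistent with each other and with the file on disk.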
