Record WAVE Audio on Android

This post discusses how to record raw audio (PCM) and save it to a WAVE file on Android. If you’re not familiar with the WAVE audio file format, please refer to a previous post, WAVE Audio File Format.

This post is a follow-up to Record PCM Audio on Android. The code and working principle are similar, so it is strongly suggested you read that post first.

A WAVE file stores PCM data with a 44-byte header in front. Recording WAVE audio is therefore equivalent to recording PCM audio and prepending the 44-byte header.

We use a RandomAccessFile to write the data. We first write the 44-byte header. Because some fields (the chunk sizes) are not known until the recording finishes, we simply write zeros for them. This is shown below.

randomAccessWriter = new RandomAccessFile(filePath, "rw");

randomAccessWriter.setLength(0); // Set file length to 0, to prevent unexpected behavior in case the file already existed

randomAccessWriter.writeBytes("RIFF");
randomAccessWriter.writeInt(0); // Final file size not known yet, write 0
randomAccessWriter.writeBytes("WAVE");

randomAccessWriter.writeBytes("fmt ");
randomAccessWriter.writeInt(Integer.reverseBytes(16)); // Sub-chunk size, 16 for PCM
randomAccessWriter.writeShort(Short.reverseBytes((short) 1)); // AudioFormat, 1 for PCM
randomAccessWriter.writeShort(Short.reverseBytes(nChannels)); // Number of channels, 1 for mono, 2 for stereo
randomAccessWriter.writeInt(Integer.reverseBytes(sRate)); // Sample rate
randomAccessWriter.writeInt(Integer.reverseBytes(sRate*nChannels*mBitsPersample/8)); // Byte rate, SampleRate*NumberOfChannels*mBitsPersample/8
randomAccessWriter.writeShort(Short.reverseBytes((short)(nChannels*mBitsPersample/8))); // Block align, NumberOfChannels*mBitsPersample/8
randomAccessWriter.writeShort(Short.reverseBytes(mBitsPersample)); // Bits per sample

randomAccessWriter.writeBytes("data");
randomAccessWriter.writeInt(0); // Data chunk size not known yet, write 0

We then write the PCM data. This is discussed in detail in post Record PCM Audio on Android.
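As a reminder of the byte order involved when writing the samples, here is a hypothetical plain-Java sketch (the class and method names are ours, not from the post) of turning 16-bit samples into the little-endian bytes a WAVE file stores:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PcmBytes {
    // Convert 16-bit samples to the little-endian byte layout a WAVE file expects:
    // for each short, the low byte is written before the high byte.
    static byte[] toLittleEndian(short[] samples) {
        ByteBuffer buf = ByteBuffer.allocate(samples.length * 2)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (short s : samples) {
            buf.putShort(s);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] bytes = toLittleEndian(new short[]{0x1234, 0x0001});
        // 0x1234 -> 0x34, 0x12 ; 0x0001 -> 0x01, 0x00
        System.out.println(bytes[0] == 0x34 && bytes[1] == 0x12
                        && bytes[2] == 0x01 && bytes[3] == 0x00);
    }
}
```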

After the recording is done, we seek back into the header and update the size fields. This is shown below.


try {

    randomAccessWriter.seek(4); // Write size to RIFF header

    randomAccessWriter.writeInt(Integer.reverseBytes(36+payloadSize));

    randomAccessWriter.seek(40); // Write size to Subchunk2Size field

    randomAccessWriter.writeInt(Integer.reverseBytes(payloadSize));

    randomAccessWriter.close();

} catch(IOException e) {

    Log.e(WavAudioRecorder.class.getName(), "I/O exception occurred while closing output file");

    state = State.ERROR;

}

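The whole write-zeros-then-patch flow can be exercised outside Android with plain Java. Below is a hypothetical desktop sketch (the class name, sample rate, and payload size are illustrative, not taken from the recorder):

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class WavPatchDemo {
    // Write a header with zeroed size fields, append the payload, then
    // seek back and patch the RIFF chunk size (offset 4) and the data
    // chunk size (offset 40). Returns {riffSize, dataSize, fileLength}.
    static long[] writeAndPatch(File f, byte[] pcm) {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.setLength(0);
            raf.writeBytes("RIFF");
            raf.writeInt(0);                               // placeholder: chunk size
            raf.writeBytes("WAVE");
            raf.writeBytes("fmt ");
            raf.writeInt(Integer.reverseBytes(16));        // sub-chunk size
            raf.writeShort(Short.reverseBytes((short) 1)); // PCM
            raf.writeShort(Short.reverseBytes((short) 1)); // mono
            raf.writeInt(Integer.reverseBytes(8000));      // sample rate
            raf.writeInt(Integer.reverseBytes(8000 * 2));  // byte rate
            raf.writeShort(Short.reverseBytes((short) 2)); // block align
            raf.writeShort(Short.reverseBytes((short) 16));// bits per sample
            raf.writeBytes("data");
            raf.writeInt(0);                               // placeholder: data size
            raf.write(pcm);

            int payloadSize = pcm.length;
            raf.seek(4);                                   // patch RIFF chunk size
            raf.writeInt(Integer.reverseBytes(36 + payloadSize));
            raf.seek(40);                                  // patch data chunk size
            raf.writeInt(Integer.reverseBytes(payloadSize));

            // Read the patched fields back to confirm.
            raf.seek(4);
            int riffSize = Integer.reverseBytes(raf.readInt());
            raf.seek(40);
            int dataSize = Integer.reverseBytes(raf.readInt());
            return new long[]{riffSize, dataSize, raf.length()};
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("demo", ".wav");
        long[] r = writeAndPatch(f, new byte[1000]);
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // 1036 1000 1044
        f.delete();
    }
}
```

With a 1000-byte payload, the RIFF size is 36 + 1000 = 1036 and the total file length is 44 + 1000 = 1044, matching the 44-byte header.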

For the complete source code, please refer to my GitHub Android tutorial project.

Record PCM Audio on Android

This post discusses how to record PCM audio on Android with the PcmAudioRecorder class. If you’re not familiar with PCM, please read a previous post, PCM Audio Format.

For the source code, please refer to AndroidPCMRecorder.

1. State Transition

The source code records the PCM audio with PcmAudioRecorder, which uses the Android AudioRecord class internally. Similar to the Android MediaRecorder class, PcmAudioRecorder follows a simple state machine, as shown below.


Figure 1. State Transition of PcmAudioRecorder Class

As indicated in the diagram, we initialize a PcmAudioRecorder object by calling either the getInstance static method or the constructor, which puts it into the INITIALIZING state. We then set the output file path and call prepare to get into the PREPARED state. We start recording by calling the start method and finally call stop to stop the recording. From any state except ERROR, we can call reset to get back to the INITIALIZING state. When we’re done with recording, we call release to discard the PcmAudioRecorder object.

2. Filling the Buffer with Data and Writing It to the File

One part of the code that requires a bit of attention is the updateListener. We register the listener with the AudioRecord object using the setRecordPositionUpdateListener method. The listener is an interface with two abstract methods, namely onMarkerReached and onPeriodicNotification. We implement the onPeriodicNotification method to pull the audio data from the AudioRecord object and save it to the output file.

In order for the listener to work, we need to call AudioRecord.setPositionNotificationPeriod(int) to specify how frequently the listener should be triggered to pull data. The method accepts a single argument, the update period in number of frames. This leads us to the next section.

3. Frame vs Sample

For PCM audio, a frame consists of the set of samples from all channels at a given point in time. In other words, the number of frames per second is equal to the sample rate.

However, when the audio is compressed (encoded further to MP3, AAC, etc.), a frame consists of compressed data for a whole series of samples plus additional, non-sample data. For such audio formats, the sample rate and sample size refer to the data after it is decoded to PCM, and they are completely different from the frame rate and frame size.

In our sample code, we set the update period for setPositionNotificationPeriod to the number of frames in 100 milliseconds. The listener is therefore triggered every 100 milliseconds, and we can pull data and update the recording file at that interval.
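As a concrete check of the frame arithmetic, here is a hypothetical sketch (the class and method names are ours) computing the notification period and the matching buffer size:

```java
public class FramePeriod {
    // One frame per sample instant, regardless of channel count,
    // so frames per second == sample rate.
    static int framesPerInterval(int sampleRateHz, int intervalMs) {
        return sampleRateHz * intervalMs / 1000;
    }

    // Bytes needed to hold one interval's worth of audio.
    static int bufferBytes(int frames, int channels, int bitsPerSample) {
        return frames * channels * bitsPerSample / 8;
    }

    public static void main(String[] args) {
        int frames = framesPerInterval(44100, 100);       // 100 ms at 44.1 kHz
        System.out.println(frames);                       // 4410
        System.out.println(bufferBytes(frames, 1, 16));   // 8820
    }
}
```

So for mono 16-bit audio at 44100 Hz, the listener fires every 4410 frames, and each pull reads at most 8820 bytes.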

Note that source code is modified based on

WAVE Audio File Format

WAV (Waveform Audio File Format, also known as WAVE) is a commonly used audio file format for storing raw audio samples on Windows. It follows the RIFF (Resource Interchange File Format) generic format.

0. The Format

A wave file is a RIFF file with a “WAVE” chunk. The WAVE chunk consists of two subchunks, namely “fmt ” and “data”. (Note that there’s a space in “fmt ”). Below we list out the meanings of each field in a WAVE file.





RIFF chunk descriptor (12 bytes):

  • chunk id: “RIFF”
  • chunk size: 36 + data subchunk size (the file size minus 8 bytes)
  • format: “WAVE”

fmt subchunk (24 bytes): format information about the audio data

  • subchunk id: “fmt ” (note the trailing space)
  • subchunk size: 16 for PCM
  • audio format: 1 => PCM, other values => data compressed
  • num of channels: 1 => mono, 2 => stereo
  • sample rate: 8000, 16000, 22050, 44100 etc.
  • byte rate: sample rate * num of channels * bits per sample / 8
  • block align: num of channels * bits per sample / 8
  • bits per sample: 8 => 8 bits, 16 => 16 bits

data subchunk (8-byte header followed by the audio data): contains the raw audio data

  • subchunk id: “data”
  • subchunk size: num of samples * num of channels * bits per sample / 8
  • audio data: the raw PCM samples

Note that there are 44 bytes in total (12 + 24 + 8) before the actual audio data.
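The 44-byte layout above can also be assembled in one pass with a little-endian ByteBuffer. Below is a hypothetical sketch (the class name and parameter values are ours, not from the post):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class WavHeader {
    // Build the 44-byte header: 12-byte RIFF descriptor, 24-byte fmt
    // subchunk, and 8-byte data subchunk header.
    static byte[] header(int sampleRate, short channels, short bits, int dataSize) {
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes());                    // RIFF chunk descriptor
        b.putInt(36 + dataSize);
        b.put("WAVE".getBytes());
        b.put("fmt ".getBytes());                    // fmt subchunk
        b.putInt(16);
        b.putShort((short) 1);                       // audio format: PCM
        b.putShort(channels);
        b.putInt(sampleRate);
        b.putInt(sampleRate * channels * bits / 8);  // byte rate
        b.putShort((short) (channels * bits / 8));   // block align
        b.putShort(bits);
        b.put("data".getBytes());                    // data subchunk header
        b.putInt(dataSize);
        return b.array();
    }

    public static void main(String[] args) {
        byte[] h = header(44100, (short) 1, (short) 16, 317520);
        System.out.println(h.length); // 44
    }
}
```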

1. Byte-by-Byte example

Below is a screenshot of a wave file shown in vim hex mode. We’ll go through the bytes one by one.


Figure 1. Bytes of a wave file recorded on Android 

5249 4646: RIFF

74d8 0400: 04d874 = 317556 bytes. I used “ls -l test.wav” to get the file size as 317564 bytes, which is equal to 317556 + 4 (size field) + 4 (RIFF field).

5741 5645: WAVE

666d 7420: “fmt ” (the fourth character is a space)

1000 0000: 00000010 = 16 bytes.

0100: 0001, which corresponds to PCM; values other than 1 indicate the data is compressed.

0100: 0001, only one channel.

44ac 0000: 0000 ac44 = 44100, the sample rate is 44100Hz.

8858 0100: 0001 5888 = 88200, the byte rate = sample rate * number of channels * bits per sample / 8 = 44100 * 1 * 16 / 8 = 44100 * 2

0200: 0002, block align = number of channels * bits per sample / 8 = 1 * 16 / 8 = 2

1000: 0010 = 16, bits per sample

6461 7461: data

50d8 0400: 0004 d850 = 317520, the data size. 317520 + 44 (total header bytes) = 317564, which matches the total file size obtained using “ls -l”.
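The little-endian decoding used throughout this walkthrough can be verified mechanically. Here is a hypothetical sketch (the class and method names are ours) that decodes the byte runs above:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class LittleEndianDemo {
    // Interpret four bytes (given in file order) as a little-endian 32-bit int.
    static int le32(int b0, int b1, int b2, int b3) {
        return ByteBuffer.wrap(new byte[]{(byte) b0, (byte) b1, (byte) b2, (byte) b3})
                         .order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        System.out.println(le32(0x74, 0xd8, 0x04, 0x00)); // 317556, the RIFF chunk size
        System.out.println(le32(0x44, 0xac, 0x00, 0x00)); // 44100, the sample rate
        System.out.println(le32(0x50, 0xd8, 0x04, 0x00)); // 317520, the data chunk size
    }
}
```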



PCM Audio Format

Pulse Code Modulation (PCM) is a method to represent sampled analog signals in digital form, which is the standard form for digital audio representation in computers. In order to convert an analog signal to PCM, two steps are required.

  • sampling: the magnitude of the analog signal is sampled regularly at uniform intervals.
  • quantization: the value of each sample is rounded to the nearest value expressible with the bits allowed for each sample.
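The quantization step can be sketched in a few lines. Below is a hypothetical example (the class and method names are ours) that rounds an analog amplitude in [-1, 1] to the nearest 16-bit sample value:

```java
public class Quantize {
    // 16 bits give 2^16 = 65536 levels; scale the amplitude to the signed
    // short range and round to the nearest representable value.
    static short quantize16(double amplitude) {
        long v = Math.round(amplitude * 32767.0);
        if (v > Short.MAX_VALUE) v = Short.MAX_VALUE; // clip out-of-range input
        if (v < Short.MIN_VALUE) v = Short.MIN_VALUE;
        return (short) v;
    }

    public static void main(String[] args) {
        System.out.println(quantize16(0.5));   // 16384 (0.5 * 32767, rounded)
        System.out.println(quantize16(1.0));   // 32767
        System.out.println(quantize16(-1.0));  // -32767
    }
}
```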

Two Basic Properties

Two basic properties determine how well a PCM sequence can represent the original signal.

  • sampling rate: the number of samples taken in a second
  • bit depth: the number of bits used to represent each sample, which determines the number of values each sample can take (e.g. 8 bits => 2^8 = 256 values)

PCM Types

  • Linear PCM: The straightforward method of PCM. The samples are taken linearly and represented on a linear scale (as opposed to Logarithmic PCM, etc.). It is an uncompressed format, which can be compressed by different audio codecs. When we talk about PCM, we’re generally referring to Linear PCM.
  • Logarithmic PCM: the amplitudes of samples are represented in logarithmic form. There are two major variants of log PCM, mu-law (u-law) and A-law.
  • Differential PCM (DPCM): each sample value is encoded as the difference from the previous sample value. This can reduce the number of bits required per audio sample.
  • Adaptive DPCM (ADPCM): the size of the quantization step is varied so that the required bandwidth can be further reduced for a given signal-to-noise ratio.
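The DPCM idea is simple enough to show end to end. Below is a hypothetical sketch (the class and method names are ours, and the differences are kept as plain ints to sidestep overflow): encoding stores each sample as the difference from its predecessor, and decoding is a running sum.

```java
import java.util.Arrays;

public class DpcmDemo {
    // Encode: first sample is stored as-is (difference from an implicit 0),
    // every later value is the delta from the previous sample.
    static int[] encode(short[] samples) {
        int[] diffs = new int[samples.length];
        short prev = 0;
        for (int i = 0; i < samples.length; i++) {
            diffs[i] = samples[i] - prev; // small for smooth signals
            prev = samples[i];
        }
        return diffs;
    }

    // Decode: accumulate the differences back into absolute samples.
    static short[] decode(int[] diffs) {
        short[] samples = new short[diffs.length];
        int acc = 0;
        for (int i = 0; i < diffs.length; i++) {
            acc += diffs[i];
            samples[i] = (short) acc;
        }
        return samples;
    }

    public static void main(String[] args) {
        short[] in = {100, 105, 103, 110};
        int[] d = encode(in);
        System.out.println(Arrays.toString(d));                    // [100, 5, -2, 7]
        System.out.println(Arrays.equals(in, decode(d)));          // true
    }
}
```

This variant is lossless; real DPCM codecs save bits by quantizing the differences, which is where the adaptive step size of ADPCM comes in.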

Audio File Formats Supporting LPCM

LPCM audio is usually stored in aiff (.aiff, .aif, .aifc), wav (.wav, .wave), au (.au, .snd), and raw (.raw, .pcm) audio files.