This is a follow-up to the previous post: JPEG Standard, A Tutorial Based on Analysis of Sample Picture – Part 1. Coding of a 8X8 Block.

The sample JPEG image used for analysis in this tutorial is shown below.

Figure 1. Sample JPEG Image for Analysis of JPEG File

The most common extensions for a JPEG image file are .jpg and .jpeg. A JPEG file consists of many segments, each beginning with a marker. A marker is two bytes: the first byte is 0xFF, and the second byte indicates which marker it is. Most markers are followed by two length bytes, which give the size of the marker’s payload data (including the length bytes themselves, excluding the two marker bytes). When the encoded data doesn’t end on a byte boundary, the remaining bits of the last byte are set to 1.
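The segment layout described above can be sketched in a few lines of Python. This is only an illustrative walker, not a full parser: the name `iter_segments`, the `STANDALONE` set and the miniature `demo` stream are made up for this example, and the code assumes a well-formed file with no fill bytes between segments.

```python
import struct

# Markers with no length field and no payload: SOI, EOI, TEM, RST0..RST7.
STANDALONE = {0xD8, 0xD9, 0x01} | set(range(0xD0, 0xD8))

def iter_segments(data: bytes):
    """Yield (marker, payload) pairs, stopping after the SOS header."""
    pos = 0
    while pos + 1 < len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected 0xFF at offset {pos}")
        marker = data[pos + 1]
        pos += 2
        if marker in STANDALONE:
            yield marker, b""
            continue
        # Big-endian length: counts its own two bytes, not the marker bytes.
        (length,) = struct.unpack(">H", data[pos:pos + 2])
        yield marker, data[pos + 2:pos + length]
        pos += length
        if marker == 0xDA:
            return  # entropy-coded data follows; it has no length field

# Hypothetical miniature stream assembled from bytes quoted in this post:
# SOI, the APP0 segment, and the SOS header.
demo = bytes.fromhex(
    "ffd8"
    "ffe00010" "4a46494600010101006000600000"
    "ffda000c" "03010002110311003f00"
)
for marker, payload in iter_segments(demo):
    print(f"ff {marker:02x}: {len(payload)} payload bytes")
```

To try it on the real sample, read the saved file with `open(path, "rb").read()` and pass the bytes to `iter_segments`.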

SOI

The first two bytes of the sample image are as follows. (You can save the sample image and view it in a hex editor.)

ff d8: it’s the SOI (Start Of Image) marker. As its name suggests, it indicates the start of a JPEG image file. This marker has no payload data.

APPn

The next two bytes are:

ff e0: every marker of the form ff En is called an APPn marker and introduces an application-specific segment. It means some metadata follows.

The next two bytes indicate the length of the payload for the marker:

00 10: the segment is 16 bytes long, including 00 10 itself. The remaining 14 bytes are:

4a 46 49 46 00 01 01 01 00 60 00 60 00 00
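The post does not decode these 14 bytes. As a rough sketch, they can be unpacked with the field layout of a JFIF APP0 segment (identifier, version, density units, X and Y density, thumbnail size); that layout comes from the JFIF specification rather than from this post, and the function name `parse_jfif_app0` is invented for the example.

```python
import struct

def parse_jfif_app0(payload: bytes) -> dict:
    """Decode the fixed 14-byte header of a JFIF APP0 payload."""
    ident, major, minor, units, xd, yd, tw, th = struct.unpack(
        ">5sBBBHHBB", payload[:14]
    )
    if ident != b"JFIF\x00":
        raise ValueError("not a JFIF APP0 segment")
    unit_names = {0: "aspect ratio only", 1: "dots per inch", 2: "dots per cm"}
    return {
        "version": (major, minor),
        "units": unit_names.get(units, "unknown"),
        "density": (xd, yd),
        "thumbnail": (tw, th),
    }

# The 14 payload bytes from the sample image.
app0 = bytes.fromhex("4a 46 49 46 00 01 01 01 00 60 00 60 00 00")
info = parse_jfif_app0(app0)
print(info)
```

Read this way, the sample bytes spell "JFIF", followed by version 1.01, a density of 96 x 96 dots per inch, and no embedded thumbnail.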

DQT

The next two bytes start a new marker:

ff db: it’s the DQT (Define Quantization Table) marker. It is followed by one or more quantization tables.

In the sample image, the following bytes are:

00 43: it indicates the payload data is 67 bytes (including 00 43).

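01: quantization table info. Bits 0..3: QT number. Bits 4..7: QT precision (0 means 8-bit entries, 1 means 16-bit entries). Here the QT number is 1 and the precision is 0, so each entry takes one byte.

Then come the 64 bytes of the quantization table, which together with the two length bytes and the info byte make up the 67 bytes: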

02 02 02 03 03 03 06 03

03 06 0c 08 07 08 0c 0c

0c 0c 0c 0c 0c 0c 0c 0c

0c 0c 0c 0c 0c 0c 0c 0c

0c 0c 0c 0c 0c 0c 0c 0c

0c 0c 0c 0c 0c 0c 0c 0c

0c 0c 0c 0c 0c 0c 0c 0c

0c 0c 0c 0c 0c 0c 0c 0c
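A minimal sketch of reading this payload follows. It relies on one fact from the JPEG standard that this post does not spell out: the 64 entries are stored in zigzag order, so they must be rearranged to get the 8 x 8 table by row and column. The names `ZIGZAG`, `parse_dqt` and `dqt_payload` are hypothetical.

```python
# (row, column) of each coefficient in zigzag order: walk the anti-diagonals,
# going down-left on odd diagonals and up-right on even ones.
ZIGZAG = sorted(
    ((r, c) for r in range(8) for c in range(8)),
    key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
)

def parse_dqt(payload: bytes) -> dict:
    """Return {table number: 8x8 list of lists} for every table in the payload."""
    tables = {}
    pos = 0
    while pos < len(payload):
        info = payload[pos]
        pos += 1
        precision, number = info >> 4, info & 0x0F
        size = 2 if precision else 1  # bytes per entry
        raw = payload[pos:pos + 64 * size]
        pos += 64 * size
        grid = [[0] * 8 for _ in range(8)]
        for i, (r, c) in enumerate(ZIGZAG):
            grid[r][c] = int.from_bytes(raw[i * size:(i + 1) * size], "big")
        tables[number] = grid
    return tables

# Info byte plus the 64 table bytes from the sample image.
dqt_payload = bytes.fromhex("01" "0202020303030603" "03060c0807080c0c") + b"\x0c" * 48
for row in parse_dqt(dqt_payload)[1]:
    print(row)
```

After rearranging, the small values end up in the top-left corner of the table (the low frequencies), and everything else is 12.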

 

SOF0

The next two bytes:

ff c0: SOF0 (Start of Frame, Baseline DCT). Indicates the image is a baseline DCT-based JPEG image.

The bytes that follow:

00 11: 17 bytes of data. The remaining 15 bytes are:

08 01 20 01 ba 03 01 22 00 02 11 01 03 11 01

08: 8 bits per sample. The JPEG standard also allows 12 bits per sample (extended DCT mode) and up to 16 bits per sample (lossless mode), but most JPEG images use 8 bits per sample.

01 20: 288. The height of the image.

01 ba: 442. The width of the image.

03: number of components. A grayscale image has 1. An RGB or YCbCr image has 3.

Each component is described by 3 bytes.

01 22 00: 01, component id; 22, sampling factors, bits 0..3 (2) for vertical, bits 4..7 (2) for horizontal; 00, quantization table number.

02 11 01: 02, component id; 11, sampling factors, 1 for vertical, 1 for horizontal; 01, quantization table number.

03 11 01: 03, component id; 11, sampling factors, 1 for vertical, 1 for horizontal; 01, quantization table number.

Note: with Y sampled at 2 x 2 and Cb and Cr at 1 x 1, every group of 4 (2 x 2) Y blocks shares 1 Cb block and 1 Cr block. The chroma components are therefore stored at half the resolution of Y in both directions, which is commonly called 4:2:0 subsampling.
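The SOF0 fields walked through above can be decoded with a short sketch like the one below. The function name `parse_sof0` and the dictionary keys are made up for illustration; the MCU arithmetic at the end assumes the usual rule that one MCU covers 8 x (largest sampling factor) pixels in each direction.

```python
import struct
from math import ceil

def parse_sof0(payload: bytes) -> dict:
    """Decode a SOF0 payload (everything after the two length bytes)."""
    precision, height, width, ncomp = struct.unpack(">BHHB", payload[:6])
    components = []
    for i in range(ncomp):
        cid, sampling, qt = payload[6 + 3 * i:9 + 3 * i]
        components.append({
            "id": cid,
            "h": sampling >> 4,    # bits 4..7: horizontal sampling factor
            "v": sampling & 0x0F,  # bits 0..3: vertical sampling factor
            "qt": qt,              # quantization table number
        })
    return {"precision": precision, "height": height, "width": width,
            "components": components}

# The 15 payload bytes from the sample image.
sof0 = parse_sof0(bytes.fromhex("08 0120 01ba 03 012200 021101 031101"))

# Size of one MCU in pixels and the number of MCUs across and down.
mcu_w = 8 * max(c["h"] for c in sof0["components"])
mcu_h = 8 * max(c["v"] for c in sof0["components"])
mcus = (ceil(sof0["width"] / mcu_w), ceil(sof0["height"] / mcu_h))
print(sof0["width"], "x", sof0["height"], "pixels,", mcus, "MCUs")
```

For the sample image this gives 16 x 16 pixel MCUs, so the 442 x 288 picture is covered by 28 MCUs across and 18 down.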

DHT

The next marker:

ff c4: DHT (Define Huffman Table). It specifies one or more Huffman tables.

The bytes that follow:

00 1f: 31 bytes.

00: HT info. Bits 0..3: HT number. Bit 4: HT type, 0 for DC table, 1 for AC table. Bits 5..7: must be 0. Here it is DC table number 0.

Then there’re 16 bytes:

00 01 05 01 01 01 01 01 01 00 00 00 00 00 00 00: each byte gives the number of Huffman codes of a particular length, from length 1 to length 16. For example, the first 00 means there’s no Huffman code of length 1, and 05 means there’re five codes of length 3.

As 00 + 01 + 05 + 01 + … + 00 = 12, there are 12 codes in total.

The 12 bytes that follow are the symbols assigned to those codes, ordered by increasing code length:

00 01 02 03 04 05 06 07 08 09 0a 0b

The next 3 markers are all DHT, with lengths of 181 bytes, 31 bytes and 181 bytes respectively.
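To make the counts and symbols of the first DHT segment concrete, here is a sketch that turns them into actual bit patterns. It assumes the canonical code assignment used by the JPEG standard (codes of the same length are consecutive, and the running code gains one zero bit each time the length increases), which this post does not describe. The name `parse_dht` and the returned dictionary layout are invented for the example.

```python
def parse_dht(payload: bytes) -> dict:
    """Return {(table type, table number): {bit string: symbol}}."""
    tables = {}
    pos = 0
    while pos < len(payload):
        info = payload[pos]
        table_type, number = (info >> 4) & 1, info & 0x0F  # 0 = DC, 1 = AC
        counts = payload[pos + 1:pos + 17]  # number of codes of length 1..16
        total = sum(counts)
        symbols = payload[pos + 17:pos + 17 + total]
        pos += 17 + total

        codes = {}
        code = 0  # next unused code at the current length
        k = 0     # index of the next symbol
        for length, n in enumerate(counts, start=1):
            for _ in range(n):
                codes[format(code, f"0{length}b")] = symbols[k]
                code += 1
                k += 1
            code <<= 1  # moving to the next length appends a zero bit
        tables[(table_type, number)] = codes
    return tables

# Info byte, 16 counts and 12 symbols from the first DHT segment.
dht_payload = bytes.fromhex(
    "00" "0001050101010101" "0100000000000000" "000102030405060708090a0b"
)
for bits, symbol in parse_dht(dht_payload)[(0, 0)].items():
    print(f"{bits:>9} -> {symbol:#04x}")
```

The one 2-bit code is 00, the five 3-bit codes run from 010 to 110, and each longer code is a run of ones ending in a zero.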

SOS

After these three DHT markers, the next marker is:

ff da: SOS (Start of Scan). Begins a top-to-bottom scan of the image. In baseline JPEG, there’s usually a single scan.

The bytes that follow:

00 0c: 12 bytes. The remaining 10 bytes are as below:

03 01 00 02 11 03 11 00 3f 00

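03: number of components in the scan. It is normally 3 for a color image.

01 00: component id, 01; Huffman tables used: bits 0..3: AC table, here 0; bits 4..7: DC table, here 0.

02 11: component id, 02; Huffman tables used: bits 0..3: AC table, here 1; bits 4..7: DC table, here 1.

03 11: component id, 03; Huffman tables used: AC table 1, DC table 1.

00 3f 00: start of spectral selection (0), end of spectral selection (63) and successive approximation (0). These fields only matter for progressive JPEG. In baseline JPEG they always have these values and can be ignored.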
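The SOS header can be read the same way as the other segments. The sketch below is only an illustration of the fields listed above; `parse_sos` and its dictionary keys are hypothetical names.

```python
def parse_sos(payload: bytes) -> dict:
    """Decode an SOS header payload (everything after the two length bytes)."""
    ncomp = payload[0]
    components = []
    for i in range(ncomp):
        cid, tables = payload[1 + 2 * i:3 + 2 * i]
        components.append({
            "id": cid,
            "dc": tables >> 4,    # bits 4..7: DC Huffman table number
            "ac": tables & 0x0F,  # bits 0..3: AC Huffman table number
        })
    # The last three bytes are fixed at 0, 63, 0 in baseline JPEG.
    ss, se, approx = payload[1 + 2 * ncomp:4 + 2 * ncomp]
    return {"components": components, "ss": ss, "se": se, "approx": approx}

# The 10 payload bytes from the sample image.
sos = parse_sos(bytes.fromhex("03 0100 0211 0311 003f00"))
print(sos)
```

For the sample image, component 1 (Y) uses DC and AC tables 0, while components 2 and 3 (Cb and Cr) share DC and AC tables 1.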

Entropy-encoded Data

The entropy-encoded JPEG image data (check out part 1 for more detail) follows. One thing to note is that within the entropy-encoded data, a 0x00 byte is inserted after every 0xff byte. This avoids confusion with marker bytes. The technique is called byte stuffing.
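Byte stuffing and its reverse are simple enough to show directly. This sketch assumes the scan contains no restart markers (ff d0 to ff d7), which would otherwise have to be handled separately; the function names `stuff` and `unstuff` are made up for the example.

```python
def stuff(raw: bytes) -> bytes:
    """Encoder side: insert 0x00 after every 0xff in the entropy-coded data."""
    return raw.replace(b"\xff", b"\xff\x00")

def unstuff(coded: bytes) -> bytes:
    """Decoder side: drop the 0x00 that follows each 0xff."""
    return coded.replace(b"\xff\x00", b"\xff")

raw = bytes.fromhex("12 ff 34 ff ff")
coded = stuff(raw)
print(coded.hex(" "))
print(unstuff(coded) == raw)
```

A decoder would run `unstuff` on the bytes between the SOS header and the next real marker before feeding them to the Huffman decoder.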

EOI

The next marker is at the end of the JPEG file:

ff d9: EOI (End Of Image). These are also the last two bytes of the JPEG file. As its name suggests, it indicates the end of the JPEG image.

There’re some other markers that are not used in the sample image. One can refer to the reference below for more information.

Reference:

Wikipedia JPEG: http://en.wikipedia.org/wiki/JPEG

 
