Slide #1.

Image Compression Roger Cheng (JPEG slides courtesy of Brian Bailey) Spring 2007
More slides like this


Slide #2.

Ubiquitous use of digital images
More slides like this


Slide #3.

Goal  Store image data as efficiently as possible  Ideally, want to – – Maximize image quality Minimize storage space and processing resources  Can’t have best of both worlds  What are some good compromises?
More slides like this


Slide #4.

Digital vs. Analog  Modern image processing is done in digital domain  If we have an analog source image, convert it to digital format before doing any processing
More slides like this


Slide #5.

Two main schools of image compression  Lossless – – – – Stored image data can reproduce original image exactly Takes more storage space Uses entropy coding only (or none at all) Examples: BMP, TIFF, GIF  Lossy – – – – Stored image data can reproduce something that looks “close” to the original image Uses both quantization and entropy coding Usually involves transform into frequency or other domain Examples: JPEG, JPEG2000
More slides like this


Slide #6.

BMP (Bitmap)       Use 3 bytes per pixel, one each for R, G, and B Can represent up to 224 = 16.7 million colors No entropy coding File size in bytes = 3*length*height, which can be very large Can use fewer than 8 bits per color, but you need to store the color palette Performs well with ZIP, RAR, etc.
More slides like this


Slide #7.

GIF (Graphics Interchange Format)  Can use up to 256 colors from 24-bit RGB color space –   If source image contains more than 256 colors, need to reprocess image to fewer colors Suitable for simpler images such as logos and textual graphics, not so much for photographs Uses LZW lossless data compression
More slides like this


Slide #8.

JPEG (Joint Photographic Experts Group)    Most dominant image format today Typical file size is about 10% of that of BMP (can vary depending on quality settings) Unlike GIF, JPEG is suitable for photographs, not so much for logos and textual graphics
More slides like this


Slide #9.

JPEG Encoding Steps  Preprocess  Apply 2D forward DCT  Quantize  Apply image DCT coefficients RLE, then entropy encoding
More slides like this


Slide #10.

JPEG Block Diagram 8x8 blocks Source Image B G R DCT-based encoding FDCT Quantizer Entropy Encoder Table Table Compressed image data
More slides like this


Slide #11.

Preprocess  Shift – – values [0, 2P - 1] to [-2P-1, 2P-1 - 1] e.g. if (P=8), shift [0, 255] to [-127, 127] DCT requires range be centered around 0  Segment each component into 8x8 blocks  Interleave – components (or not) may be sampled at different rates
More slides like this


Slide #12.

Interleaving  Non-interleaved: scan from left to right, top to bottom for each color component  Interleaved: compute one “unit” from each color component, then repeat – – full color pixels after each step of decoding but components may have different resolution
More slides like this


Slide #13.

Color Transformation (optional)  Down-sample – – compress without loss of quality (color space) e.g., YUV 4:2:2 or 4:1:1  Example: – – – chrominance components 640 x 480 RGB to YUV 4:1:1 Y is 640x480 U is 160x120 V is 160x120
More slides like this


Slide #14.

Interleaving  ith color component has dimension (xi, yi) – –  maximum dimension value is 216 [X, Y] where X=max(xi) and Y=max(yi) Sampling among components must be integral – Hi and Vi; must be within range [1, 4] – [Hmax, Vmax] where Hmax=max(Hi) and Vmax=max(Vi)  xi = X * Hi / Hmax  yi = Y * Vi / Vmax
More slides like this


Slide #15.

Example [Wallace, 1991]
More slides like this


Slide #16.

Forward DCT  Convert – – from spatial to frequency domain convert intensity function into weighted sum of periodic basis (cosine) functions identify bands of spectral information that can be thrown away without loss of quality  Intensity values in each color plane often change slowly
More slides like this


Slide #17.

Understanding DCT  For example, in R3, we can write (5, 2, 9) as the sum of a set of basis vectors – we know that [(1,0,0), (0,1,0), (0,0,1)] provides one set of basis functions in R3 (5,2,9) = 5*(1,0,0) + 2*(0,1,0) + 9*(0,0,1)  DCT is same process in function domain
More slides like this


Slide #18.

DCT Basic Functions  Decompose the intensity function into a weighted sum of cosine basis functions
More slides like this


Slide #19.

Alternative Visualization
More slides like this


Slide #20.

1D Forward DCT  Given a list of n intensity values I(x), where x = 0, …, N-1  Compute the N DCT coefficients: 2 F (u )  n where n 1 C (u ) I ( x ) cos x 0  1  C (u )  2  1  for (2 x 1)  , u 0...n  1 2n u 0, otherwise
More slides like this


Slide #21.

1D Inverse DCT  Given a list of n DCT coefficients F(u), where u = 0, …, n-1  Compute the n intensity values: 2 I ( x)  n where n 1  F (u )C (u ) cos u 0  1  C (u )  2  1  for ( 2 x 1)  , x 0...n  1 2n u 0, otherwise
More slides like this


Slide #22.

Extend DCT from 1D to 2D  Perform 1D DCT on each row of the block  Again for each column of Y 1D coefficients – alternatively, transpose the matrix and perform DCT on the rows X
More slides like this


Slide #23.

Equations for 2D DCT  Forward DCT: m 1 n 1 2  (2 x  1)u F (u, v)  C (u )C (v) I ( x, y ) * cos 2n nm  y 0 x 0    (2 y  1)v  * cos    2 m    Inverse DCT: 2 m 1 n 1  (2 x  1)u   (2 y  1)v  I ( y, x)  F (v, u )C (u )C (v) cos  * cos   2n 2m nm v 0 u 0    
More slides like this


Slide #24.

Visualization of Basis Functions Increasing frequency Increasing frequency
More slides like this


Slide #25.

Quantization  Divide each coefficient by integer [1, 255] – –  In the decoding process, multiply the quantized coefficients by the inverse of the table – –  comes in the form of a table, same size as a block multiply the block of coefficients by the table, round result to nearest integer get back a number close to original error is less than 1/2 of the quantization number Larger quantization numbers cause more loss
More slides like this


Slide #26.

De facto Quantization Table Eye becomes less sensitive 16 11 10 16 24 40 51 61 12 12 14 19 26 58 60 55 14 13 16 24 40 57 69 56 14 17 22 29 51 87 80 62 18 22 37 56 68 109 103 77 24 35 55 64 81 104 113 92 49 64 78 87 103 121 120 101 72 92 95 98 112 100 103 99 Eye becomes less sensitive
More slides like this


Slide #27.

Entropy Encoding  Compress sequence of quantized DC and AC coefficients from quantization step – further increase compression, without loss  Separate – DC from AC components DC components change slowly, thus will be encoded using difference encoding
More slides like this


Slide #28.

DC Encoding  DC – – represents average intensity of a block encode using difference encoding scheme use 3x3 pattern of blocks  Because difference tends to be near zero, can use less bits in the encoding – – categorize difference into difference classes send the index of the difference class, followed by bits representing the difference
More slides like this


Slide #29.

AC Encoding  Use – – zig-zag ordering of coefficients orders frequency components from low->high produce maximal series of 0s at the end  Apply RLE to ordering
More slides like this


Slide #30.

Huffman Encoding  Sequence of DC difference indices and values along with RLE of AC coefficients  Apply Huffman encoding to sequence – Exploits sequence’s statistics by assigning frequently used symbols fewer bits than rare symbols  Attach appropriate headers  Finally have the JPEG image!
More slides like this


Slide #31.

JPEG Decoding Steps  Basically reverse steps performed in encoding process – – – – Parse compressed image data and perform Huffman decoding to get RLE symbols Undo RLE to get DCT coefficient matrix Multiply by the quantization matrix Take 2-D inverse DCT of this matrix to get reconstructed image data
More slides like this


Slide #32.

Reconstruction Error  Resulting image is “close” to original image  Usually measure “closeness” with MSE (Mean Squared Error) and PSNR (Peak Signal to Noise Ratio) – Want low MSE and high PSNR
More slides like this


Slide #33.

Example - One everyday photo with file size of 2.76 MB
More slides like this


Slide #34.

Example - One everyday photo with file size of 600 KB
More slides like this


Slide #35.

Example - One everyday photo with file size of 350 KB
More slides like this


Slide #36.

Example - One everyday photo with file size of 240 KB
More slides like this


Slide #37.

Example - One everyday photo with file size of 144 KB
More slides like this


Slide #38.

Example - One everyday photo with file size of 88 KB
More slides like this


Slide #39.

Analysis    Near perfect image at 2.76M, so-so image at 88K Sharpness decreases as file size decreases False contours visible at 144K and 88K –  Can be fixed by dithering image before quantization Which file size is the best? – – No correct answer to this question Answer depends upon how strict we are about image quality, what purpose image is to be used for, and the resources available
More slides like this


Slide #40.

Conclusion  Image compression is important  Image compression has come a long way  Image compression is nearly mature, but there is always room for improvement
More slides like this