Introduction to H.264/AVC


For information on implementing an H.264/AVC encoder on the TMS320DM642, please click here.

H.264/AVC is the latest international standard for video coding, issued in May 2003. It was developed jointly by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). Its official name is Advanced Video Coding (AVC); it is also known as H.264 or MPEG-4 Part 10. The standard defines only the video bitstream and the decoding method, which leaves design flexibility for the encoding process. Figure 1 briefly summarizes the history of the H.26x and MPEG-x series of video coding standards.


Figure 1. The history of video coding standards.

Compared with the earlier standards, H.264/AVC contains a number of new features that not only offer more efficient compression at lower bit rates, but also provide more flexibility for use across a wide variety of network environments. As shown in Figure 2, H.264 consists of two layers: the Video Coding Layer (VCL) and the Network Abstraction Layer (NAL).


Figure 2. Two layers in H.264/AVC.

The goal of the VCL is to encode the video independently of the network layer. Its syntax supports a hierarchy of video data partitions, from slice to macroblock to sub-block to pixel, as shown in Figure 3. The NAL formats the VCL representation of the video and provides header information in a manner appropriate for conveyance by particular transport layers (such as the Real-time Transport Protocol, RTP) or storage media. All data are contained in NAL units, each of which contains an integer number of bytes. The NAL unit specifies a generic format for use in both packet-oriented and bitstream-oriented systems.


Figure 3. Data hierarchy in Video Coding Layer.
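To make the NAL unit idea concrete, here is a minimal Python sketch that splits the first byte of a NAL unit into its header fields. The field layout (a 1-bit forbidden_zero_bit, a 2-bit nal_ref_idc, and a 5-bit nal_unit_type) comes from the H.264 standard rather than from this page, and the names NalHeader and parse_nal_header are made up for illustration.

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    """Fields of the one-byte H.264 NAL unit header (illustrative container)."""
    forbidden_zero_bit: int  # expected to be 0 in a conforming stream
    nal_ref_idc: int         # 0 means the unit is not used as a reference
    nal_unit_type: int       # e.g. 1 = non-IDR slice, 5 = IDR slice, 7 = SPS, 8 = PPS


def parse_nal_header(first_byte: int) -> NalHeader:
    """Split the first byte of a NAL unit into its three header fields."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("a NAL header is exactly one byte")
    return NalHeader(
        forbidden_zero_bit=(first_byte >> 7) & 0x01,  # top bit
        nal_ref_idc=(first_byte >> 5) & 0x03,         # next two bits
        nal_unit_type=first_byte & 0x1F,              # low five bits
    )


# Hypothetical usage: 0x67 is a byte commonly seen at the start of a
# sequence parameter set (SPS) NAL unit.
header = parse_nal_header(0x67)
print(header)
```

Because every NAL unit starts with this byte, a transport or storage layer can decide how to handle a unit (for example, whether it can be dropped) without decoding the video payload.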

Like the other video coding standards, H.264/AVC defines different profiles and levels. Profiles define the sets of bitstream features an H.264 stream can use. Levels define restrictions on the video resolution, the frame rate, and the decoder buffering model, known in the MPEG standards as the VBV (Video Buffering Verifier). There are up to 16 profiles and 16 levels in the current version. The three most commonly used profiles are the Baseline Profile (BP), the Main Profile (MP), and the Extended Profile (EP), as shown in Figure 4.


Figure 4. Baseline, main and extended profiles in H.264.

The basic macroblock encoding structure is given in Figure 5. The main idea is to predict the current frame and encode only the error between the original frame and the predicted one. The predicted frame is obtained by motion estimation and motion compensation. For each block in the current frame, the encoder searches for the best matching block within a predetermined window in the previous frame by computing the sum of absolute differences (SAD). After finding the closest match (the minimal SAD value), the encoder calculates the offset between the current block and the reference block, known as the motion vector (MV). Following these MVs, the encoder builds a predicted frame by copying the reference blocks to their new positions. It then calculates the residual error between the predicted frame and the current frame, which is transformed, quantized, and entropy coded into the bitstream.

The H.264 encoder contains an embedded decoder so that it stays in agreement with the decoder side on the reference frames. Because quantization is lossy, the frame the decoder reconstructs may differ from the original frame the encoder started with. Therefore, after quantization the encoder applies dequantization and an inverse transform to rebuild the same reference frames the decoder will have, which guarantees that the encoder and decoder predict from identical data.
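The block-matching step described above can be sketched in a few lines of Python. This is a minimal exhaustive (full) search at integer-pixel positions, assuming frames are plain lists of rows of luma values; the function names, the 4x4 block size, and the search range of 2 are illustrative choices, and a real H.264 encoder would typically add sub-pixel refinement, several block sizes, and faster search patterns.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, cx, cy, size=4, search_range=2):
    """Return ((mvx, mvy), best_sad) for the block at (cx, cy) in cur."""
    height, width = len(ref), len(ref[0])
    best_mv, best_sad = (0, 0), None
    for mvy in range(-search_range, search_range + 1):
        for mvx in range(-search_range, search_range + 1):
            rx, ry = cx + mvx, cy + mvy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, size)
            if best_sad is None or cost < best_sad:
                best_mv, best_sad = (mvx, mvy), cost
    return best_mv, best_sad


def residual_block(cur, ref, cx, cy, mv, size=4):
    """Current block minus the motion-compensated reference block."""
    mvx, mvy = mv
    return [
        [cur[cy + dy][cx + dx] - ref[cy + mvy + dy][cx + mvx + dx]
         for dx in range(size)]
        for dy in range(size)
    ]


# Hypothetical 8x8 frames: cur is ref shifted by one pixel (with wrap-around).
ref = [[x * x + 10 * y * y for x in range(8)] for y in range(8)]
cur = [[ref[(y - 1) % 8][(x + 1) % 8] for x in range(8)] for y in range(8)]
mv, cost = full_search(cur, ref, cx=2, cy=2)
print(mv, cost, residual_block(cur, ref, 2, 2, mv))
```

When the match is perfect, as in this synthetic example, the residual block is all zeros, which is exactly the case that costs the fewest bits after entropy coding.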


Figure 5. H.264 encoding structure.
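The role of the embedded decoder can be illustrated with a toy Python sketch. It assumes a plain uniform scalar quantizer with a hypothetical step size and omits the transform entirely, so it is a stand-in for the idea rather than H.264's actual integer transform and quantization; all function names are illustrative.

```python
def quantize(residual, step):
    """Uniform scalar quantization (the lossy step in this toy model)."""
    return [round(r / step) for r in residual]


def dequantize(levels, step):
    """Map quantized levels back to approximate residual values."""
    return [q * step for q in levels]


def reconstruct(predicted, levels, step):
    """What the decoder computes: prediction plus dequantized residual."""
    return [p + d for p, d in zip(predicted, dequantize(levels, step))]


def encode_block(current, predicted, step):
    """Quantize the residual and run the embedded decoder on the result."""
    residual = [c - p for c, p in zip(current, predicted)]
    levels = quantize(residual, step)
    # The encoder keeps this reconstruction, not `current`, as its reference,
    # because the decoder never sees the original pixels.
    encoder_reference = reconstruct(predicted, levels, step)
    return levels, encoder_reference


# Hypothetical one-dimensional "block" of four pixels.
current = [101, 104, 109, 115]
predicted = [98, 105, 105, 110]
levels, encoder_reference = encode_block(current, predicted, step=4)
decoder_output = reconstruct(predicted, levels, step=4)
print(levels, encoder_reference, decoder_output == encoder_reference)
```

The reconstruction differs from the original pixels because of quantization, but it is identical on both sides, so predictions for later frames do not drift apart.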

Figure 6 and Figure 7 compare H.264 with H.263, MPEG-2, and MPEG-4. The two sample videos tested are foreman (QCIF) and tempete (CIF). The curves show that H.264 achieves higher quality than the other three standards at an even lower bit rate.

     
Figure 6. Comparison with MPEG-2, H.263, and MPEG-4 (QCIF).


Figure 7. Comparison with MPEG-2, H.263, and MPEG-4 (CIF).
                                                                        
                        
References

1. Standard: H.264/AVC video coding standard, by ITU-T and ISO/IEC.
2. Paper: Overview of the H.264/AVC video coding standard, by Thomas Wiegand, Gary J. Sullivan, Gisle Bjontegaard, and Ajay Luthra, IEEE Transactions on Circuits and Systems for Video Technology, July 2003.
3. All the figures above are taken from my earlier slides; click here to download. The file contains Chinese text. Without proper rendering support, you may see question marks, boxes, or other symbols instead of Chinese characters, so be sure to install a Chinese language support package.






Last updated by Yang Song
August 26, 2009