Department of Computing and Department of Electrical and Electronic Engineering
Shanawaz Basith
Email: sab@doc.ic.ac.uk
2nd June 1996
This article looks at the MPEG-1 and MPEG-2 standards. The compression technology is explained and the data structures involved are described. The many applications of this technology are discussed.
1. Introduction
2. MPEG Compression
3. Applications of MPEG
4. Conclusions
5. References and Related Articles
1. Introduction
At the end of the eighties the audio and video industry faced the prospect
of saturated markets and overcapacity. What was required were new products
and services that would capture consumers' imagination. However, the limited
capacity of existing digital storage media and transmission links restricted
the potential for further exploitation without a generic coding scheme for
video and its associated audio.
The MPEG (Moving Picture Experts Group) committee was founded in late 1988
by Leonardo Chiariglione and Hiroshi Yasuda with the immediate
goal of standardising video and audio for compact discs.
A meeting between the International Organization for Standardization (ISO) and the
International Electrotechnical Commission (IEC) in 1992 resulted in a standard
for audio and video coding, known as MPEG-1.
MPEG-2 became a bona fide standard in 1994 after a five-day meeting of
ISO and IEC in Singapore. The original application for MPEG-2 was all-digital
transmission of broadcast TV quality video, but it now also covers High Definition
Television (HDTV). HDTV applications were to be covered by the MPEG-3 standard, but
it was discovered that with some fine-tuning MPEG-2 could be used for this
purpose, and MPEG-3 was subsequently dropped.
This document aims to discuss MPEG-1 and 2. The companion document looks at the
emerging MHEG standard.
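The MPEG system consists of two layers: a system layer, which carries the timing
information needed to synchronise the audio and video, and a compression layer,
which contains the audio and video streams themselves.
I shall be focussing on the compression layer. The technology behind MPEG-1
and MPEG-2 is essentially the same; some differences are identified later.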
Decoding
A generalised decoding system for the audio and video streams is shown in figure 2.1.
The system decoder extracts the timing information from the MPEG system stream and sends it to
the other system components. It also demultiplexes the video and audio
streams from the system stream and passes each to the appropriate decoder. The video and
audio decoders decompress the information as specified in parts 2
and 3 of the MPEG standard respectively[2].
Figure 2.1 - General MPEG Decoding System
Figure 2.2 - MPEG Data Hierarchy
Data Hierarchy
Video Hierarchy
The standard defines a hierarchy of data structures in the video stream (see figure 2.2).
Video Sequence
A video sequence begins with a sequence header (and optionally contains other sequence headers), includes one or more
groups of pictures and ends with an end-of-sequence code.
Group Of Pictures (GOP)
This consists of a header and a series of one or more pictures intended to allow random access
into the sequence.
Picture
This is the primary coding unit of a video sequence. A picture consists of three rectangular
matrices: one representing luminance or brightness (Y) and two representing chrominance (Cb and Cr, or U and V).
YCbCr can be thought of as an alternative to the RGB representation; the conversion
between the two is a linear transformation.
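The equation from the original page is not recoverable here, so the following is only a sketch of one such linear transformation. It assumes the widely used full-range BT.601 coefficients (as in JFIF); the function names are illustrative, and the exact matrix and value ranges used by a particular MPEG encoder may differ.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one RGB pixel (0..255) to YCbCr, assuming full-range BT.601."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr under the same assumed coefficients."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return r, g, b
```

Because both directions are linear, a grey pixel (equal R, G and B) maps to Cb = Cr = 128, which is why the chrominance matrices carry no information for monochrome material.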
Slice
A slice is one or more adjacent macroblocks. The order of the macroblocks within a slice is from
left-to-right and top-to-bottom. Slices are important for handling errors. If the
bitstream contains an error, the decoder can skip to the start of the next slice.
Having more slices in the bitstream allows better error concealment, but uses space
that could otherwise be used to improve picture quality.
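As a rough illustration of skipping to the next slice after an error, the sketch below scans a byte buffer for the next slice start code. It assumes that start codes are the byte prefix 00 00 01 followed by a code byte, and that slice start codes use code bytes 0x01 to 0xAF; the function name `next_slice_start` is hypothetical, and a real decoder would also need to handle errors inside the start code itself.

```python
def next_slice_start(data: bytes, pos: int) -> int:
    """Return the index of the next slice start code at or after pos, or -1.

    Assumes start codes are 00 00 01 xx and that xx in 0x01..0xAF marks a slice.
    """
    i = data.find(b"\x00\x00\x01", pos)
    while i != -1 and i + 3 < len(data):
        if 0x01 <= data[i + 3] <= 0xAF:
            return i
        # Some other start code (e.g. a header): keep searching.
        i = data.find(b"\x00\x00\x01", i + 1)
    return -1
```

A decoder that detects corrupt data at some offset could call this function with that offset and resume decoding at the returned position, discarding the damaged remainder of the current slice.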
Macroblock
A macroblock covers a 16-pixel by 16-line area, built from a 2 by 2 matrix of luminance blocks together with the corresponding chrominance blocks.
Block
A block is an 8-pixel by 8-line set. More technical information may be found in
one of the references.
Audio Hierarchy
The MPEG standard defines a hierarchy of data structures that accept, decode
and produce digital audio output. The audio stream, like the video stream,
consists of a series of packets (see figure 2.3). More information can
be found in one of the references given below.
Figure 2.3 - Audio Stream Structure
Inter-Picture Coding
Much of the information in a picture within a video sequence is similar to
information in a previous or subsequent picture. The MPEG standard takes
advantage of this temporal redundancy by representing some pictures in
terms of their differences from other (reference) pictures. This is called
inter-picture coding.
Picture Types
The MPEG standard specifically defines three types of pictures (frames): intra (I) pictures, predicted (P) pictures and bidirectional (B) pictures.
Let's consider these in turn.
Intra Pictures
Intra (or I) pictures are coded using only information found in the picture
itself. I-frames provide potential random access points into the compressed
video data. I-frames use only transform coding and provide moderate
compression, typically about 2 bits per coded pixel.
Predicted Pictures
Predicted (or P) pictures are coded with respect to the nearest previous
I or P-frame. This technique is called forward prediction (see figure
2.4). Like I-frames, P-frames serve as a prediction reference for B-frames
and future P-frames. However, P-frames use motion compensation to get a
higher compression ratio than is possible for I-frames. Unlike I-frames,
P-frames can propagate coding errors, because P-frames are predicted from
previous reference (P or I) frames.
Figure 2.4 - Forward Prediction
Bidirectional Pictures
Bidirectional (or B) pictures use both a past and future picture as a
reference. This technique is called bidirectional prediction as shown
in figure 2.5. B-frames provide the most compression and do not
propagate errors because they are never used as a reference. Bidirectional
prediction also reduces the effects of noise by averaging two pictures.
Figure 2.5 - Bidirectional Prediction
Video Stream Composition
The MPEG algorithm allows the encoder to choose the frequency and location
of I-frames. This choice depends on the application's need for random
access and the location of scene cuts in the video sequence. In applications
where random access is important, I-frames are typically used twice a second.
The encoder also chooses the number of B-frames between any pair of
reference (I or P) frames. This choice is based on factors such as the
amount of memory in the encoder and the characteristics of the material
being encoded. A typical display order of frames is shown in figure 2.6.
The MPEG encoder re-orders pictures in the video stream to present the pictures
to the decoder in the most efficient sequence. In particular, the reference
pictures needed to reconstruct B-frames are sent before the associated
B-frames. Figure 2.7 shows this ordering for the first section of the sequence
in the example above.
Figure 2.6 - Typical Ordering of Frames
Figure 2.7 - Video Stream and Display Ordering
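The re-ordering described above can be sketched in a few lines. This is a minimal illustration rather than an encoder's actual logic: frames are represented as hypothetical labels such as "I1" or "B2", where the first letter gives the picture type, and each run of B-frames is simply held back until the reference frame that follows it in display order has been emitted.

```python
def display_to_coded_order(frames):
    """Re-order frame labels from display order to video stream order.

    Each label starts with 'I', 'P' or 'B'. B-frames are delayed until the
    next reference (I or P) frame has been placed in the stream.
    """
    coded, pending_b = [], []
    for frame in frames:
        if frame[0] == "B":
            pending_b.append(frame)
        else:
            coded.append(frame)       # reference frame goes first
            coded.extend(pending_b)   # then the B-frames that depend on it
            pending_b = []
    # Trailing B-frames have no future reference in this input; keep them last.
    coded.extend(pending_b)
    return coded
```

For a display order of I1 B2 B3 P4 B5 B6 P7 this yields the stream order I1 P4 B2 B3 P7 B5 B6, so the decoder always holds both references before it meets a B-frame.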
Motion Compensation
Motion compensation is a technique for enhancing the compression of P and
B-frames by eliminating temporal redundancy. Motion compensation typically
improves compression by a factor of three compared to intra-picture
coding. Motion compensation algorithms work at a macroblock level. When a
macroblock is compressed by motion compensation, the compressed file contains
the spatial vector between the reference macroblock and the macroblock being
coded (motion vectors) and the content difference between the reference and
coded macroblocks (error terms).
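The standard does not prescribe how an encoder finds its motion vectors, so the following is only one possible approach: an exhaustive block-matching search that minimises the sum of absolute differences (SAD). The function names, the `search` range and the use of plain nested lists for frames are all assumptions made for illustration.

```python
def sad(ref, cur, rx, ry, cx, cy, n):
    """Sum of absolute differences between an n-by-n block of cur and of ref."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def estimate_motion(ref, cur, cx, cy, n=16, search=7):
    """Find the motion vector and error terms for the block of cur at (cx, cy).

    Returns ((dx, dy), residual), where the best match in ref lies at
    (cx + dx, cy + dy) and residual holds the per-pixel differences.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx <= width - n and 0 <= ry <= height - n:
                cost = sad(ref, cur, rx, ry, cx, cy, n)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    _, dx, dy = best
    residual = [
        [cur[cy + j][cx + i] - ref[cy + dy + j][cx + dx + i] for i in range(n)]
        for j in range(n)
    ]
    return (dx, dy), residual
```

When the match is good the residual is close to zero everywhere, which is what makes the subsequent transform coding of the error terms cheap.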
Not all information in a picture can be predicted from a previous picture.
For example, in a scene where a door opens, the visual contents of the newly revealed room cannot
be predicted from the previous picture. When this situation arises,
the macroblock is coded using the transform technique.
The difference between B and P-frame motion compensation is that macroblocks
in a P-frame use the previous reference only, while macroblocks in a B-frame
are coded using any combination of previous and future reference pictures.
This implies that four codings are possible for each macroblock in a B-frame:
intra coding (no motion compensation), forward prediction from the previous reference,
backward prediction from the future reference, and bidirectional prediction from both.
Backward prediction can be used to predict uncovered areas that do not
appear in previous pictures.
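To make the choice between the B-frame codings concrete (intra, forward, backward or bidirectional), here is a hedged sketch of how an encoder might pick one for a single block. The selection rule (lowest sum of absolute differences, with a hypothetical `intra_threshold` fallback) is an assumption for illustration; real encoders use their own, usually more elaborate, decision criteria.

```python
def block_sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def choose_b_mode(cur, fwd, bwd, intra_threshold=None):
    """Pick a coding mode for one B-frame block.

    cur is the block being coded; fwd and bwd are the motion-compensated
    predictions from the previous and future reference pictures.
    """
    # Bidirectional prediction averages the two references (rounded).
    avg = [[(f + b + 1) // 2 for f, b in zip(row_f, row_b)] for row_f, row_b in zip(fwd, bwd)]
    costs = {
        "forward": block_sad(cur, fwd),
        "backward": block_sad(cur, bwd),
        "bidirectional": block_sad(cur, avg),
    }
    mode = min(costs, key=costs.get)
    # Hypothetical rule: if even the best prediction is poor, code the block intra.
    if intra_threshold is not None and costs[mode] > intra_threshold:
        return "intra", costs
    return mode, costs
```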
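Intra-Picture (Transform) Coding
The MPEG transform coding algorithm includes these steps: the Discrete Cosine Transform (DCT), quantisation and run-length encoding.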
Both image blocks and prediction-error blocks have high spatial redundancy
(e.g. a frame of a single colour). To reduce this, the MPEG algorithm
transforms 8 by 8 blocks of pixels or 8 by 8 blocks of error terms
from the spatial domain to the frequency domain with the aid of the
Discrete Cosine Transform (DCT). A two-dimensional DCT can be
performed by applying a one-dimensional DCT to the columns and
then a one-dimensional DCT to the rows of the result.
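The separable form of the two-dimensional DCT can be sketched directly. This is a straightforward, unoptimised illustration using the orthonormal DCT-II; the function names are my own, and production codecs use fast integer approximations rather than floating-point cosines.

```python
import math


def dct_1d(values):
    """Orthonormal one-dimensional DCT-II of a sequence of numbers."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(
            values[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for i in range(n)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def dct_2d(block):
    """Two-dimensional DCT: transform the columns, then the rows of the result."""
    cols = [dct_1d(col) for col in zip(*block)]   # cols[x][v]
    tmp = [list(row) for row in zip(*cols)]       # back to tmp[v][x]
    return [dct_1d(row) for row in tmp]           # out[v][u]
```

For a block of a single colour, such as an 8 by 8 block filled with the value 10, all the energy ends up in the top-left (DC) coefficient and every other coefficient is zero to within rounding error, which is the redundancy the text refers to.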
Next the algorithm quantises the frequency coefficients. Quantisation is
the process of approximating each frequency coefficient as one of a
limited number of allowed values. The encoder chooses a quantisation
matrix that determines how each frequency coefficient in the 8 by 8 block
is quantised. Human perception of quantisation error is lower for high
spatial frequencies (colour), so high frequencies are typically quantised
more coarsely (with fewer allowed values).
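As a simplified sketch of this step, the code below divides each coefficient by the matching entry of a quantisation matrix and rounds to the nearest integer. The `scale` parameter stands in for a per-macroblock adjustment of the quantisation; the function names and any matrix values passed in are illustrative assumptions, and the exact arithmetic in the standard differs in detail.

```python
def quantise(coeffs, qmatrix, scale=1):
    """Map each frequency coefficient to an integer level.

    Larger qmatrix entries (typically at high frequencies) give coarser steps.
    """
    return [
        [int(round(c / (q * scale))) for c, q in zip(row_c, row_q)]
        for row_c, row_q in zip(coeffs, qmatrix)
    ]


def dequantise(levels, qmatrix, scale=1):
    """Reconstruct approximate coefficients from the integer levels."""
    return [
        [level * q * scale for level, q in zip(row_l, row_q)]
        for row_l, row_q in zip(levels, qmatrix)
    ]
```

The difference between the original coefficients and the dequantised ones is the quantisation error; small coefficients with large step sizes collapse to zero, which is where most of the compression comes from.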
Figure 2.8 - Transform Coding Operations
The combination of the DCT and quantisation results in many of the frequency
coefficients being zero, especially those at high spatial frequencies. To
take maximum advantage of this, the coefficients are organised in a zig-zag
order to produce long runs of zeros (see figure 2.8). The
coefficients are then converted to a series of run-amplitude pairs, each
pair indicating a number of zero coefficients and the amplitude of a
non-zero coefficient. These run-amplitude pairs are then coded with a variable-length
code, which uses shorter codes for commonly occurring pairs and
longer codes for less common pairs.
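The zig-zag scan and the conversion to run-amplitude pairs can be sketched as follows. This is a minimal illustration under some assumptions: it uses the conventional zig-zag pattern starting at the top-left coefficient, treats every coefficient alike (in practice the DC term of an intra block is coded separately), and leaves trailing zeros to be implied by an end-of-block marker. The variable-length coding of the pairs is not shown.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n-by-n block in zig-zag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk anti-diagonals; alternate direction on odd and even diagonals.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )


def run_amplitude_pairs(block):
    """Convert quantised coefficients to (zero_run, amplitude) pairs."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs  # trailing zeros are dropped (end of block)
```

Because quantisation leaves the non-zero values clustered near the top-left corner, the scan visits them first and the long tail of zeros costs almost nothing to represent.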
Some blocks need to be coded more accurately than others, e.g. blocks
with a smooth intensity gradient. To deal with this inequality between
blocks, the MPEG algorithm allows the amount of quantisation to be
modified for each macroblock of pixels.
3. Applications of MPEG
MPEG-1
Video Kiosk
Video kiosks, or information kiosks, are a new opportunity for the use of
video. Shops, car dealerships and banks are finding that automated
information kiosks are a way to increase sales. These came about due to the
professional-quality video made possible by MPEG-1. Information
that was once laboriously displayed as slides can be brought to life with
video. Using MPEG-1 and a standard hard disc or CD-ROM, the developer can
easily update their kiosk information on a regular basis. Advanced
kiosk features become possible due to the advent of friendly, personal help
video tailored to the needs of each user. New applications are only
limited by the imagination of the developer.
Video on Demand
Video on Demand (VOD) encompasses nearly all video-based applications. However,
the most common application of VOD is movies on demand, initially in hotels
and hospitals and eventually in our homes. All of us will have an interactive
television set from which we can order movies on demand, at any given time.
The missing ingredient for home use is low-priced interactive decoders
(CD-I was one attempt). Given that this application is also considering the
MPEG-2 standard, VOD to the home appears years away from a large-scale
implementation.
Video Dial Tone
The telephone and cable companies are preparing systems that will allow us to
order our movies through the existing telephone infrastructure. Given the
limited bandwidth of today's telephone lines, MPEG-1 becomes the ideal
choice. Numerous pilot schemes were set up in early 1994 and are still
being trialled. This application also has ramifications for the
telecommunications industry and the corporate presentation market, since
very high quality presentations can now be produced and distributed
affordably across standard telephone lines.
Training
The training market has historically used laser disc players to deliver high
quality video. MPEG-1 is an ideal replacement for the analogue laser disc
player. The advantages of MPEG are lower costs, ease of delivery, ease
of updating and networking capability. The training market is a large
user of video equipment and MPEG is considered a mainstream product for
this application.
Corporate Presentations
The presentation market evolved from 35mm slides to overheads to computer
generated slide shows. As presentation software packages evolve, they
are now beginning to support video. MPEG is a natural choice due to small
file size, extremely high quality and ease of integration into existing
presentation programs. Also, almost all conference rooms now include a VCR and
a television as well as a computer. Another presentation medium that
is growing in the U.S. as well as Europe is Video-CD. This allows you to create
a presentation of graphics with hot-spots or buttons, and you can also
include MPEG video.
Video Library
Organisations storing massive quantities of video cassettes for occasional
playback can benefit by encoding their existing and new material. Storing
the MPEG files on a digital library video server allows long-term storage
and multiple playback without any quality degradation, fast random access
retrieval and multi-point playback.
Museums, large libraries, government agencies and news agencies using video
footage are now converting to digital video.
DBS (Direct Broadcast Satellite)
This will use MPEG-2 audio and video for direct broadcast. DBS is a scheme
of anywhere, anytime broadcasting. More information may be found at
Alta Vista.
HDTV (High Definition Television)
A U.S consortium has already agreed to use MPEG-2. Refer to the related articles for
further information.
Other Applications
Other applications include digital video tape, high-density CD, video conferencing and digital camcorders.
2. MPEG Compression
The job of MPEG is to take analogue or digital video signals and convert
them to packets of digital data that are more efficiently transported over
a network. Being digital it has the following advantages :
MPEG is derived from the original work by the Joint Photographic Experts Group
(JPEG). The JPEG standard is for still images and is a lossy technique. It
takes advantage of the nature of the human eye and removes redundant
information that we just do not see.
Differences between MPEG-1 and MPEG-2
The above list is by no means complete and you should refer to one of the
references given below for a complete list.
MPEG-2
CATV (Cable Television)
CATV will use MPEG as the standard for compressing and decompressing video
for distribution and for broadcasting. The need is perfect-quality video and
the bandwidth is available to handle high bit rates. Because of this the
industry has settled on MPEG-2 video although some are still using MPEG-1
on the interior.
4. Conclusions
A lot of detail has had to be left out, due to the size and nature of this report.
However, this document should
still serve as a good foundation on which to base further work.
5. References and Related Articles
| Reference | Author | Score(/10) |
|---|---|---|
| Tristan's MPEG Pages | Tristan Savatier (1996) | 9 |
| MPEG-2 Digital Video Technology and Testing BSTS Solution Note 5963-7511E | Dragos Ruiu (Hewlett Packard - 1995) | 8 |
| Net Search | Netscape (1996) | 5 |
| Unleashing a Broadcasting Revolution | Roy Rubenstein (New Electronics on Campus Autumn 1995) | 8 |
| Multimedia Design Reaches a Higher Level | David Thon (New Electronics on Campus Autumn 1995) | 8 |
| Never Mind the Quality, Look at the Quantity | David Boothroyd (New Electronics on Campus Autumn 1995) | 7 |
Related Articles
| Article | Author(s) |
|---|---|
| Interactive Television | Keval Pindoria (khp1) and Gerald Wong Ping Hung (phgw) |
| Technology and Clinical Applications | Amere Oakman (ao2) and Constantine Prouskas (cbp) |
| MPEG Image Compression and ATM Networks | Arran Derbyshire (arad) and Chandrarath Kulanthai (ck4) |
| MHEG - A Multimedia Presentation Standard | Stephen Done (srd2) |
Further Reading
Due to the limitations of this article a vast amount of information has had to be left out. You may want to consult these references for further information.
| 1. | J. R. Allen et al., ``VCTV: A Video-on-Demand Market Test,'' AT&T Technical Journal, Vol. 72, No. 1, January/February 1993, pp. 7-14. |
| 2. | ISO Committee Draft 11544, Coded representation of picture and audio information -- Progressive bi-level image compression, ISO/IEC IS, 11544 |
| 3. | Horst Hampel et al., ``Technical features of the JBIG standard for progressive bi-level image compression,'' Signal Processing: Image Communication, Vol. 4, No. 2, April 1992, pp. 103-110. |
| 4. | R. Hunter and A. H. Robinson, ``International digital facsimile coding standards,'' Proceedings of the IEEE, Vol. 68, No. 7, July 1980, pp. 854-867. |
| 5. | CCITT Recommendation T.4, Standardisation of Group 3 facsimile apparatus for document transmission, Geneva, 1980. |
| 6. | CCITT Recommendation T.6, Facsimile coding schemes and coding control functions for Group 4 facsimile apparatus, Malaga-Torremolinos, 1984. |
| 7. | ISO Committee Draft 10918-1, Digital compression and coding of continuous-tone still images -- Part 1: Requirements and guidelines, ISO/IEC DIS 10918-1, 1991. |
| 8. | R. W. Hamming, Coding and Information Theory, Prentice-Hall, Englewood Cliffs, New Jersey, 1980, pp. 96-98. |
| 9. | A. S. Tanenbaum, Computer Networks, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1981. |
| 10. | N. Abramson, Information Theory and Coding, McGraw-Hill, New York, N. Y., 1963, pp. 61-62. |
| 11. | Digital Compression and Coding of Continuous-Tone Still Images, Part 2: Compliance Testing, ISO/IEC CD 10918-2, 1991. |
| 12. | A. N. Netravali and B. G. Haskell, Digital Pictures: Representation and Compression, Plenum Press, New York, 1988. |
| 13. | D. A. Huffman, ``A Method for the Construction of Minimum-Redundancy Codes,'' Proc. IRE, No. 40, September 1952, pp. 1098-1101. |
| 14. | J. Amsterdam, ``Data Compression with Huffman Coding,'' BYTE, Vol. 11, No. 5, May 1986, pp. 99-108. |
| 15. | G. G. Langdon, Jr., ``An Introduction to Arithmetic Coding,'' IBM J. Res. Develop., Vol. 28, No. 2, March 1984, pp. 135-149. |
| 16. | I. H. Witten, R. M. Neal, and J. G. Cleary, ``Arithmetic Coding for Data Compression,'' Communications of the ACM, Vol. 30, No. 6, June 1987, pp. 520-540. |
| 17. | N. Ahmed, T. Natarajan, and K. R. Rao, ``Discrete Cosine Transform,'' IEEE Transactions on Computers, Vol. C-23, No. 1, January 1974, pp. 90-93. |
| 18. | R. J. Clarke, Transform Coding of Images, Academic Press, Orlando, Florida, 1985. |
| 19. | H. Lohscheller, ``A subjectively adapted image communication system,'' IEEE Transactions on Communications, Vol. COM-32, December 1984, pp. 1316-1322. |
| 20. | M. Bierling and R. Thoma, ``Motion Compensating Field Interpolation Using a Hierarchically Structured Displacement Estimator,'' Signal Processing, Vol. 11, No. 4, Dec. 1986, pp. 387-404. |
| 21. | ISO Committee Draft 11172, Information Technology-Coding of moving pictures and associated audio for digital storage media up to about 1.5 Mbit/s, 1993 |
| 22. | D. J. LeGall, ``MPEG: A Video Compression Standard for Multimedia Applications,'' Communications of the ACM, Vol. 34, No. 4, April 1991, pp. 47-58. |
| 23. | R. K. Jurgen, ``Digital Video,'' IEEE Spectrum, Vol. 29, No. 3, March 1992, pp. 24-30. |
| 24. | A. Puri, ``Video Coding Using the MPEG-1 Compression Standard,'' Proc. International Symposium: Society for Information Display, Boston, Massachusetts, May 1992, pp. 123-126. |
| 25. | A. Puri and R. Aravind, ``Motion-Compensated Video Coding with Adaptive Perceptual Quantisation,'' IEEE Transactions on Circuits and Systems for Video Technology, Vol. CSVT-1, December 1991, pp. 351-361. |
| 26. | W. Lee, R.J. Gove, C.J. Read, and Y. Kim, ``UWGSP5: A Highly-Integrated Multimedia System,'' accepted for IEEE Multimedia Magazine, April 1994. |
| 27. | W. Lee, J. Golston, R.J. Gove, and Y. Kim, ``Real-time MPEG Video Codec on a Single-chip Multiprocessor,'' Digital Video Compression on Personal Computers: Algorithms and Technologies, Proc. SPIE, Vol. 2187, Feb 1994 |
| 28. | K. Guttag, R.J. Gove, and J.R. Van Aken, ``A Single-Chip Multiprocessor For Multimedia: The MVP,'' IEEE Computer Graphics & Applications, Nov 1992, pp. 53--64 |
| 29. | R.J. Gove, ``MVP: A Highly-Integrated Video Compression Chip,'' Proc. of Data Compression Conference, March, 1994. |
| 30. | Didier J. Le Gall, "The MPEG video compression algorithm," Signal Processing: Image Communication, Vol. 4, No. 2, April 1992, pp.129-140. |