Department of Computing and
Department of Electrical and Electronic Engineering
Email : firstname.lastname@example.org
24th May 1996
Starting from analogue video, the problems are described. Digital video is defined. The major factors affecting digital video are described. It is shown that compression is a vital requirement. The factors to be considered when choosing a compression method are discussed. It is shown that the discussion presented had to lead to a standard.
3. Defining Digital Video
4. Four Factors of Digital Video
5. Need for Compression
6. Factors Associated with Compression
7. Selecting a Compression Technique
9. References and Related Articles
This article presents an introduction to the fundamentals of digital video. The Moving Picture Experts Group (MPEG) standard came about due to the problems and ideas presented here.
To understand digital video, we must first understand that there is a difference between video for broadcast television and video for personal computers. Broadcast professionals have demanded, and will continue to demand, high quality video. Their efforts and requirements are responsible for many advancements in the technology of digital video. The definition of digital video for this group differs from the one that is meaningful to computer professionals.
Several methods exist for the transmission of video signals. The earliest of these was analogue. In an analogue video signal, each frame is represented by a fluctuating voltage signal. This is known as an analogue waveform. One of the earliest formats for this was composite video.
Composite analogue video has all its components (brightness, colour, synchronisation information, etc.) combined into one signal. Due to the compositing (or combining) of the video components, the quality of composite video is marginal at best. The results are colour bleeding, low clarity and high generational loss.
Composite video quickly gave way to component video, which takes the different components of the video and breaks them into separate signals. Improvements to component video have led to many video formats, including S-Video, RGB etc.
All of these are still analogue formats and are susceptible to loss due to transmission noise effects. Quality loss is also possible from one generation to another. This type of loss is like photocopying, in which a copy of a copy is never as good as the original.
These limitations led to the birth of digital video. Digital video is just a digital representation of the analogue video signal. Analogue video degrades in quality from one generation to the next; digital video does not. Each generation of digital video is identical to the parent.
Even though the data is digital, virtually all digital formats are still stored on sequential tapes. Although tape holds considerably more data than a computer hard drive, there are two significant advantages to using computers for digital video : random access to the stored video and the ability to compress it. There is also the problem of transferring video from tape to computer.
Considering these issues, digital video for computers requires a different definition than for traditional digital formats. Computer-based digital video is defined as a series of individual images and associated audio. These elements are stored in a format in which both elements (pixel and sound sample) are represented as a series of binary digits, or bits.
Previous attempts were made to find the best procedure for capturing, storing, transmitting and playing back video from the computer desktop. Unfortunately these attempts were of a proprietary nature and resulted in various formats and incompatibilities.
As a result, the International Organization for Standardization (ISO) worked to define the internationally accepted formats for digital video capture, storage, and playback.
With digital video, four factors have to be kept in mind. These are :
4.1 Frame Rate
The standard for displaying non-film video in NTSC countries is 30 frames per second (PAL television uses 25 frames per second, and film is 24 frames per second). This means that the video is made up of 30 (or 24) pictures or frames for every second of video. Additionally these frames are split in half (odd lines and even lines), to form what are called fields.
Here again, there is a major difference between the way computers and television handle video. When a television set displays its analogue video signal, it displays the odd lines (the odd field) first. Then it displays the even lines (the even field). Each pair forms a frame and there are 60 of these fields displayed every second (or 30 frames per second). This is referred to as interlaced video.
A computer monitor, however, uses a process called "progressive scan" to update the screen. With this method, the screen is not broken into fields. Instead, the computer displays each line in sequence, from top to bottom. This entire frame is displayed 30 times every second. This is often called non-interlaced video.
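The relationship between a frame and its two fields can be sketched in a few lines of Python. This is only an illustration, assuming a frame is held as a simple list of scan lines; the function names `split_into_fields` and `weave_fields` are hypothetical and do not come from any video library.

```python
def split_into_fields(frame):
    """Split a frame (a list of scan lines) into odd and even fields.

    Lines are numbered from 1, as in the text, so the odd field holds
    lines 1, 3, 5, ... and the even field holds lines 2, 4, 6, ...
    """
    odd_field = frame[0::2]   # list indices 0, 2, 4 are lines 1, 3, 5
    even_field = frame[1::2]  # list indices 1, 3, 5 are lines 2, 4, 6
    return odd_field, even_field


def weave_fields(odd_field, even_field):
    """Rebuild a progressive (non-interlaced) frame from two fields."""
    frame = []
    for i in range(len(odd_field) + len(even_field)):
        source = odd_field if i % 2 == 0 else even_field
        frame.append(source[i // 2])
    return frame


frame = [f"line {n}" for n in range(1, 7)]
odd, even = split_into_fields(frame)
print(odd)    # ['line 1', 'line 3', 'line 5']
print(even)   # ['line 2', 'line 4', 'line 6']
print(weave_fields(odd, even) == frame)  # True
```

A television shows the two lists one after the other, while a progressive display shows the woven frame in a single pass from top to bottom.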
4.2 Colour Resolution
This second factor is a bit more complex. Colour resolution refers to the number of colours displayed on the screen at one time. Computers deal with colour in an RGB (red-green-blue) format, while video uses a variety of formats. One of the most common video formats is called YUV. Although there is no direct correlation between RGB and YUV, they are similar in that they both have varying levels of colour depth (maximum number of colours).
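As a small worked example of colour depth, the maximum number of colours follows directly from the number of bits stored per pixel. The helper name `colours_for_depth` is hypothetical, and the three depths shown are simply common illustrative values.

```python
def colours_for_depth(bits_per_pixel):
    """Maximum number of distinct colours for a given colour depth."""
    return 2 ** bits_per_pixel


for bits in (8, 16, 24):
    print(bits, "bits per pixel:", colours_for_depth(bits), "colours")
# 8 bits per pixel: 256 colours
# 16 bits per pixel: 65536 colours
# 24 bits per pixel: 16777216 colours
```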
4.3 Spatial Resolution
The third factor is spatial resolution - or in other words, "How big is the picture?". Since PC and Macintosh computers generally have resolutions in excess of 640 by 480, most people assume that this resolution (VGA) is the video standard. It is not. As with RGB and YUV, there is no direct correlation between analogue video resolutions and computer display resolutions.
A standard analogue video signal displays a full, overscanned image without the borders common to computer screens. The National Television System Committee (NTSC) standard used for North American and Japanese television uses a 768 by 484 display. The Phase Alternating Line (PAL) standard for European television is slightly larger at 768 by 576. Most countries endorse one or the other, but never both.
Since the resolutions of analogue video and computers are different, conversion of analogue video to digital video must at times take this into account. This can often result in the down-sizing of the video and the loss of some resolution.
4.4 Image Quality
The last, and most important, factor is video quality. The final objective is video that looks acceptable for your application. For some this may be 1/4 screen, 15 frames per second (fps), at 8 bits per pixel. Others require full screen (768 by 484), full frame rate video, at 24 bits per pixel (16.7 million colours).
Determining your compression needs is not difficult but does require an understanding of how the four factors mentioned above (frame rate, colour resolution, spatial resolution and image quality) affect your selection. As with most things, there is a price to pay for quality. With more colours, higher resolution, faster frame rates and better quality, you will need more computer power and will require more storage space for your video. By adjusting these parameters, you can dramatically change your digital video compression requirements.
Some simple calculations (see below) show that 24-bit colour video, at 640 by 480 resolution and 30 fps, requires an astonishing 26 megabytes of data per second! Not only does this surpass the capabilities of many home computer systems, but it also overburdens existing storage systems.
| |640 horizontal resolution|
|X|480 vertical resolution|
|=|307,200 total pixels per frame|
|X|3 bytes per pixel|
|=|921,600 total bytes per frame|
|X|30 frames per second|
|=|27,648,000 total bytes per second|
|/|1,048,576 to convert to megabytes|
|=|26.37 megabytes per second!|
Calculation to show space required for video is excessive
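The same arithmetic can be written as a short Python sketch, which makes it easy to try other frame sizes, colour depths and frame rates. The function name `raw_bytes_per_second` is hypothetical; a megabyte is taken as 1,048,576 bytes, as in the calculation above.

```python
BYTES_PER_MEGABYTE = 1_048_576


def raw_bytes_per_second(width, height, bits_per_pixel, fps):
    """Uncompressed video data rate in bytes per second."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * fps


rate = raw_bytes_per_second(640, 480, 24, 30)
print(rate)                                  # 27648000
print(round(rate / BYTES_PER_MEGABYTE, 2))   # 26.37
```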
For some users, the way to reduce this amount of data down to a manageable level is to compromise on one of the four factors described above. Certain applications like games may not need a full 30 fps frame rate. By compromising these factors, it is possible to reduce the storage space to a reasonable amount, but the requirements may still be too big if the storage of large films is needed.
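To illustrate the effect of compromising, the sketch below compares full-quality video with a reduced setting of 1/4 screen (taken here to mean 320 by 240), 15 fps and 8 bits per pixel. The two-hour film length is an assumption chosen for the example, and `raw_bytes_per_second` is a hypothetical helper.

```python
def raw_bytes_per_second(width, height, bits_per_pixel, fps):
    """Uncompressed video data rate in bytes per second."""
    return width * height * bits_per_pixel // 8 * fps


full = raw_bytes_per_second(640, 480, 24, 30)
reduced = raw_bytes_per_second(320, 240, 8, 15)

film_seconds = 2 * 60 * 60  # assumed two-hour film

print(reduced)                 # 1152000 bytes per second
print(full // reduced)         # 24 (times less data than full quality)
print(reduced * film_seconds)  # 8294400000 bytes for the whole film
```

Even at 24 times less data, the film still needs roughly 8.3 thousand million bytes without compression.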
The goal of video compression is to massively reduce the amount of data required to store the digital video file, while retaining the quality of the original video. With this in mind, there are several factors that need to be taken into account when discussing digital video compression :
6.1 Real-Time versus Non-Real-Time
The term real-time has been badly abused. In the compression world it means exactly what it says. Some compression systems capture, compress to disk, decompress and play back video (30 frames per second) all in real time; there are no delays. Other systems are only capable of capturing some of the 30 frames per second and/or are only capable of playing back some of the frames.
Insufficient frame rate is one of the most noticeable video deficiencies. Without a minimum of 24 frames per second, the video will be noticeably jerky. In addition, the missing frames will contain extremely important lip synchronisation data. In other words, if the movement of a person's lips is missing due to dropped frames during capture or playback, it is impossible to match the audio correctly with the video.
6.2 Symmetrical Versus Asymmetrical
This refers to how video images are compressed and decompressed. Symmetrical compression means that if you can play back a sequence of 640 by 480 video at 30 frames per second, then you can also capture, compress and store it at that rate. Asymmetrical compression means just the opposite. The degree of asymmetry is usually expressed as a ratio. A ratio of 150:1 means it takes approximately 150 minutes to compress one minute of video.
Asymmetrical compression can sometimes be more elaborate and more efficient for quality and speed at playback because it uses so much more time to compress the video. The two big drawbacks to asymmetrical compression are that it takes a lot longer, and often you must send the source material out to a dedicated compression company for encoding (adding time and money to the project).
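A quick calculation shows what an asymmetry ratio means in practice. The 90-minute programme length is an assumption for illustration, and `encode_minutes` is a hypothetical helper name.

```python
def encode_minutes(video_minutes, asymmetry_ratio):
    """Minutes needed to compress a video at a given asymmetry ratio."""
    return video_minutes * asymmetry_ratio


print(encode_minutes(1, 150))        # 150 minutes for one minute of video
print(encode_minutes(90, 150) / 60)  # 225.0 hours for a 90-minute programme
```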
6.3 Compression Ratios
A second ratio is often referred to when working with compressed video. This is the compression ratio and should not be confused with the asymmetry ratio.
The compression ratio compares the size of the original video with the size of the compressed video. For example, a 200:1 compression ratio means that the original video is 200 times the size of the compressed video: every 200 units of original data are represented by 1 unit of compressed data. The more the video is compressed, the higher the compression ratio.
Generally, the higher the compression ratio is, the poorer the video quality will be. With MPEG, compression ratios of 100:1 are common, with good image quality. Motion JPEG provides ratios ranging from 15:1 to 80:1, although 20:1 is about the maximum for maintaining a good quality image. Not only do compression ratios vary from one compression method to another, but hardware and software that perform well on a PC or Mac may be less efficient on a different machine.
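As a rough illustration, the ratios quoted above can be applied to the uncompressed rate for 640 by 480, 24-bit, 30 fps video (27,648,000 bytes per second). The helper name `compressed_bytes_per_second` is hypothetical, and actual results depend on the material and the encoder.

```python
RAW_BYTES_PER_SECOND = 640 * 480 * 3 * 30  # 27,648,000


def compressed_bytes_per_second(raw_rate, ratio):
    """Data rate after compression at a ratio of `ratio`:1."""
    return raw_rate / ratio


for label, ratio in (("Motion JPEG at 20:1", 20), ("MPEG at 100:1", 100)):
    rate = compressed_bytes_per_second(RAW_BYTES_PER_SECOND, ratio)
    print(label, "->", rate, "bytes per second")
# Motion JPEG at 20:1 -> 1382400.0 bytes per second
# MPEG at 100:1 -> 276480.0 bytes per second
```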
6.4 Lossless Versus Lossy
The loss factor determines whether there is a loss of quality between the original image and the image after it has been compressed and played back (decompressed). The more compression, the more likely that quality will be affected. Virtually all compression methods lose some quality when you compress the data. Even if the quality difference is not noticeable, these are considered lossy compression methods. At this time, the only lossless algorithms are for still image compression. Lossless compression can usually only compress a photo-realistic image by a factor of 2:1.
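The difference between the two can be seen in a minimal sketch. It uses Python's general-purpose `zlib` module as a stand-in for a lossless method, and discards the low four bits of each sample as a crude stand-in for a lossy one; neither is a real video codec, and the sample data is invented for the example.

```python
import zlib

# Invented sample data: a short, repetitive run of 8-bit pixel values.
original = bytes([10, 10, 10, 10, 200, 200, 10, 10] * 64)

# Lossless: the decompressed data is identical to the original.
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
print(restored == original)            # True
print(len(compressed) < len(original)) # True

# Lossy (crude illustration): discarding the low four bits of every
# sample cannot be undone, so the original values are gone for good.
lossy = bytes(value & 0xF0 for value in original)
print(lossy == original)               # False
```

Highly repetitive data like this compresses far better than a photo-realistic image would.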
6.5 Interframe Versus Intraframe
This is probably the most widely discussed and debated compression issue. The intraframe method compresses and stores each video frame as a discrete picture.
Interframe compression, on the other hand, is based on the idea that although action is happening, the backgrounds in most video scenes remain stable - a great deal of the scene is redundant. Compression is started by creating a reference frame. Each subsequent frame of the video is compared to the previous frame and the next frame, and only the difference between the frames is stored. The amount of data saved is substantially reduced.
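The idea of storing only the differences can be sketched as follows. This is a deliberate simplification under stated assumptions: frames are flat lists of pixel values, each frame is compared only with the previous one (not the next), and the helper names `encode_delta` and `decode_delta` are hypothetical. It is not how MPEG itself encodes frames.

```python
def encode_delta(reference, frame):
    """Return (index, new_value) pairs for pixels that differ from reference."""
    return [(i, new) for i, (old, new) in enumerate(zip(reference, frame))
            if old != new]


def decode_delta(reference, delta):
    """Rebuild a frame from the reference frame and the stored differences."""
    frame = list(reference)
    for index, value in delta:
        frame[index] = value
    return frame


reference = [0] * 16          # a stable, all-black background
next_frame = list(reference)
next_frame[5] = 255           # only two pixels change
next_frame[6] = 255

delta = encode_delta(reference, next_frame)
print(delta)                                         # [(5, 255), (6, 255)]
print(decode_delta(reference, delta) == next_frame)  # True
```

Two stored differences replace sixteen pixel values, which is where the saving comes from when most of the scene is unchanged.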
6.6 Bit Rate Control
The final factor to be aware of regarding video compression is bit-rate control, which is especially important if your system has a limited bandwidth. A good compression system should allow the user to instruct the compression hardware and software which parameters are most important. In some applications, frame rate may be of paramount importance, while frame size is not. In other applications, you may not care if the frame rate drops below 15 frames per second, but the quality of those frames must be very good.
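One way to picture the trade-off is to ask what compression ratio is needed to fit a fixed bandwidth. The sketch below assumes an illustrative budget of 153,600 bytes per second (roughly a single-speed CD-ROM drive); the budget and the helper name `required_ratio` are assumptions for the example, not part of any standard.

```python
def required_ratio(width, height, bits_per_pixel, fps, budget_bytes_per_second):
    """Compression ratio needed to fit raw video into a bandwidth budget."""
    raw_rate = width * height * bits_per_pixel // 8 * fps
    return raw_rate / budget_bytes_per_second


BUDGET = 153_600  # assumed bandwidth budget in bytes per second

print(required_ratio(640, 480, 24, 30, BUDGET))  # 180.0
print(required_ratio(640, 480, 24, 15, BUDGET))  # 90.0
```

Halving the frame rate halves the ratio the compressor must achieve, leaving more of the budget for the quality of each frame.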
Compression methods use mathematical algorithms to reduce (or compress) video data by eliminating, grouping and/or averaging similar data found in the video signal. Different algorithms are suited to different purposes. Although there are various compression methods, including Motion JPEG, only MPEG-1 and MPEG-2 are internationally recognised standards for the compression of moving pictures.
With so many factors to consider and so many companies trying to sell the best (sometimes conflicting) solutions, the decision process can be intimidating. A good rule of thumb is to stick to the standards. Standards do not guarantee that the solution is the best, but they are there for a reason. A classic example of this is when two large companies fought over video cassette standards : the Betamax versus VHS video tape formats. Betamax was widely regarded as the technically better format, but millions of pounds were lost when VHS was adopted as the standard.
The MPEG formats are not a de facto standard. They are the internationally accepted ISO standard. Some of the most brilliant academics in the video and computer industries have spent years looking at every possible compression solution for full-motion video and are responsible for this specification and standard. This is also not an attempt by one company to push a proprietary format on the computer and video industry. This is an open format available to all.
The second thing to consider is the acceptance of the standard. There are so many standards to choose from that many real standards are clouded by self-conflicting pseudo-standards. Real standards may not even be followed. There is also the problem of many companies pushing their own standard onto the market.
In conclusion :
In the next article I shall be discussing the MPEG standards, the technical details about MPEG and some applications.
|Tristan's MPEG Pages||Tristan Savatier (1996)||9|
|MPEG-2 Digital Video Technology and Testing|
BSTS Solution Note 5963-7511E
|Dragos Ruiu (Hewlett Packard - 1995)||8|
|Net Search||Netscape (1996)||5|
|MPEG Image Compression and ATM Networks||arad and ck4 (1996)|
|Compression of Audio and Video Information||Stephen Done (srd2|
All trademarks acknowledged.