THESIS
1998
84 leaves : ill. ; 30 cm
Abstract
Living in an era of information, we encounter a great deal of multimedia content in our daily lives. In particular, image and video data are the most common kinds of information transmitted over the World Wide Web, the Internet, and most other networks. Because we want real-time transmission with minimum delay, the transfer has to be as fast as possible. Increasing the bandwidth is a direct and easy way to meet the speed requirement. However, bandwidth cannot be increased indefinitely, and doing so is not cost-efficient. Bandwidth should also be left available for other kinds of communication so that this valuable resource is fully utilized. Reducing the amount of video and image data to be transmitted therefore appears to be the only option. This has led to today's state-of-the-art video and image compression standards.
Video compression standards use a sophisticated prediction scheme, called Motion Estimation, to exploit the large temporal redundancy between successive frames of a video sequence. Motion Estimation is one of the most computationally intensive parts of the entire video encoding process. For high-quality video compression, Full Search (FS) is the usual choice of motion estimation algorithm. FS always gives the most accurate motion prediction and the best picture quality, but its computational requirement is very heavy. Even though advanced and sophisticated hardware chips are available to address this problem, a less expensive software-based video encoding solution is still desirable.
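To make the cost of FS concrete, the following is a minimal sketch of exhaustive block matching, assuming frames are stored as lists of rows of integer luminance values and that the matching criterion is the mean squared error. The function names `block_mse` and `full_search`, the block size, and the search radius are illustrative assumptions, not details taken from the thesis.

```python
def block_mse(cur, ref, bx, by, dx, dy, block):
    """Mean squared error between a block of `cur` and a displaced block of `ref`."""
    total = 0
    for y in range(block):
        for x in range(block):
            diff = cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
            total += diff * diff
    return total / (block * block)


def full_search(cur, ref, bx, by, block=4, radius=2):
    """Test every displacement in the window; return (dx, dy, mse) of the best one."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidates whose block would fall outside the reference frame.
            if not (0 <= bx + dx <= width - block and 0 <= by + dy <= height - block):
                continue
            cost = block_mse(cur, ref, bx, by, dx, dy, block)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Synthetic frames (hypothetical data): `cur` shows the content of `ref`
# displaced by 2 pixels horizontally and 1 pixel vertically.
ref = [[x * x + 100 * y * y for x in range(12)] for y in range(12)]
cur = [[(x + 2) ** 2 + 100 * (y + 1) ** 2 for x in range(12)] for y in range(12)]
print(full_search(cur, ref, 4, 4))
```

With a search radius of r, the loop evaluates up to (2r + 1)² candidate positions per block, each costing one comparison per pixel of the block, which is why the exhaustive search is so expensive in software.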
In this thesis, a new fast motion estimation algorithm is introduced. It is based on (1) Search Window Sub-sampling (SWS) and (2) Key Feature Pixels (KFP). SWS achieves a speed-up factor of about 4 compared with FS. KFP is a more efficient and accurate way to extract the features of a macroblock for block matching, and it achieves a further 4 or 8 times speed-up on top of SWS. The algorithm combines both techniques to achieve comparable results at 16 or even 32 times the speed of FS (4 × 4 or 4 × 8). The performance of the proposed algorithm, in terms of mean squared error (MSE), lies between that of Three-Step Search (TSS) and FS. Thus, it gains both higher speed and higher accuracy than conventional fast motion estimation algorithms.
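The abstract does not say how the search window is sub-sampled, so the sketch below rests on one plausible assumption: only candidate displacements on a 2:1 grid in each direction are evaluated, which cuts the number of block comparisons to roughly a quarter. The names `window_search` and `step` are hypothetical, and KFP is not sketched at all because the abstract does not describe how the key pixels are chosen.

```python
def block_mse(cur, ref, bx, by, dx, dy, block):
    """Mean squared error between a block of `cur` and a displaced block of `ref`."""
    total = 0
    for y in range(block):
        for x in range(block):
            diff = cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
            total += diff * diff
    return total / (block * block)


def window_search(cur, ref, bx, by, block=16, radius=7, step=1):
    """Search displacements on a grid of spacing `step` (step=1 is a full search).

    Returns ((dx, dy, mse), number_of_candidates_evaluated).
    """
    height, width = len(ref), len(ref[0])
    best, evaluated = None, 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Assumed sub-sampling rule: keep only multiples of `step`.
            if dx % step or dy % step:
                continue
            if not (0 <= bx + dx <= width - block and 0 <= by + dy <= height - block):
                continue
            evaluated += 1
            cost = block_mse(cur, ref, bx, by, dx, dy, block)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best, evaluated


# Synthetic frames (hypothetical data) with a true displacement of (4, 2).
ref = [[x * x + 100 * y * y for x in range(48)] for y in range(48)]
cur = [[(x + 4) ** 2 + 100 * (y + 2) ** 2 for x in range(48)] for y in range(48)]
print(window_search(cur, ref, 16, 16, step=1))
print(window_search(cur, ref, 16, 16, step=2))
```

In this sketch a ±7 window holds 225 candidates for the full search and 49 on the sub-sampled grid. A true displacement with an odd component would not lie on the sub-sampled grid, so a practical scheme would need some way to recover it; how the thesis handles this is not stated in the abstract.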
In the second part of this thesis, a new color image compression algorithm is introduced. The original idea was an unconventional one: encode an image as a video sequence and compress it with a low bit-rate video compression standard such as H.263. The image is sub-sampled into 16 smaller frames, and these 16 frames are treated as a small-motion sequence. With the new motion estimation algorithm introduced in the first half of the thesis, a more accurate temporal prediction across these frames might be expected. However, experimental results showed that motion estimation plays only a very insignificant role in encoding this kind of 'sequence'. As a result, the low bit-rate video coding was replaced with a better prediction scheme that makes use of a difference pyramid structure. Combined with the DCT, the algorithm exploits the redundancy within the image more efficiently. Because it is DCT-based, it can provide a scalable compression ratio to suit different requirements. Representing the image as a difference pyramid also supports progressive transmission of the image, which is especially important and practical in today's Internet applications.
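The link between a difference pyramid and progressive transmission can be illustrated with a small sketch. It assumes a simple construction, 2×2 integer averaging to form each coarser level and pixel replication to predict the finer one, which may differ from the prediction scheme actually used in the thesis; the DCT coding of the difference layers is omitted. All function names are hypothetical, and the image dimensions are assumed to be divisible by 2 raised to the number of levels.

```python
def downsample(img):
    """Halve each dimension by averaging 2x2 blocks (integer floor average)."""
    return [
        [
            (img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) // 4
            for x in range(0, len(img[0]), 2)
        ]
        for y in range(0, len(img), 2)
    ]


def upsample(img):
    """Double each dimension by pixel replication."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def combine(a, b, sign):
    """Element-wise a + sign * b for two images of equal size."""
    return [[p + sign * q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def build_difference_pyramid(img, levels):
    """Return (coarsest image, difference layers ordered from coarse to fine)."""
    diffs, current = [], img
    for _ in range(levels):
        coarse = downsample(current)
        # Each layer stores what the prediction from the coarser level misses.
        diffs.append(combine(current, upsample(coarse), -1))
        current = coarse
    diffs.reverse()
    return current, diffs


def reconstruct(coarsest, diffs):
    """Yield successively finer images; the last one equals the original."""
    current = coarsest
    yield current
    for layer in diffs:
        current = combine(upsample(current), layer, +1)
        yield current


image = [[(31 * x + 17 * y) % 256 for x in range(8)] for y in range(8)]
coarsest, diffs = build_difference_pyramid(image, levels=2)
stages = list(reconstruct(coarsest, diffs))
print([len(stage) for stage in stages], stages[-1] == image)
```

A receiver can display the coarsest level as soon as it arrives and refine it as each difference layer is received, which is the sense in which the pyramid representation supports progressive transmission.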