THESIS
2012
ii, 1 unnumbered page, viii, 58 pages : illustrations ; 30 cm
Abstract
Intra-prediction and transform coding are two important components of current image and video coding standards, H.264 in particular. The first part of this thesis examines some important features of intra-predicted residual signals. First, we show that the 2-D residual block in the vertical and horizontal modes (Modes 0 and 1) is separable, so the optimal coding performance can be achieved by a cascade of two 1-D transforms. Second, from a structural point of view, we show that the directionality in all six directional modes (Modes 3-8) remains strong. Third, from a statistical point of view, we show that the residual signal in these six directional modes has a column-wise (or row-wise) covariance matrix that is far from Toeplitz. The second feature implies that such 2-D residual blocks are non-separable, while the third implies that the DCT applied along the horizontal or vertical direction will not work as efficiently as expected. Based on these results, we derive the non-separable Karhunen-Loève transform (KLT) to obtain an upper bound on the R-D performance for each directional mode. Simulation results in terms of coding gain and energy packing efficiency (EPE) exhibit substantial gaps between these non-separable KLTs and the traditional 2-D DCT.
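To make the coding-gain comparison concrete, the following is a minimal sketch, not the thesis's experiment. It assumes a 1-D simplification (a single column covariance instead of the full non-separable 2-D case) and a hypothetical non-Toeplitz covariance model in which the residual variance grows with distance from the prediction boundary. The function names and the model parameters are illustrative assumptions.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; rows are the basis vectors."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    t = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    t[0, :] /= np.sqrt(2.0)
    return t

def klt_matrix(cov):
    """KLT: rows are eigenvectors of cov, ordered by decreasing eigenvalue."""
    _, vecs = np.linalg.eigh(cov)
    return vecs.T[::-1]

def coding_gain_db(transform, cov):
    """Ratio of arithmetic to geometric mean of the coefficient variances, in dB."""
    variances = np.diag(transform @ cov @ transform.T)
    geometric_mean = np.exp(np.mean(np.log(variances)))
    return 10.0 * np.log10(np.mean(variances) / geometric_mean)

def example_covariance(n, rho=0.95):
    """Hypothetical non-Toeplitz model (an assumption, not from the thesis):
    AR(1) correlation scaled so the variance grows along the column."""
    idx = np.arange(n)
    corr = rho ** np.abs(idx[:, None] - idx[None, :])
    scale = np.sqrt(1.0 + idx)
    return corr * np.outer(scale, scale)

cov = example_covariance(4)
print("DCT coding gain (dB):", coding_gain_db(dct_matrix(4), cov))
print("KLT coding gain (dB):", coding_gain_db(klt_matrix(cov), cov))
```

For any orthonormal transform the KLT gain is an upper bound, so the second printed value is never smaller than the first; the size of the gap depends entirely on the assumed covariance.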
In the second part of this thesis, we attempt to design new transforms with two goals: (i) approaching the KLT's R-D performance and (ii) keeping the implementation cost no higher than that of the DCT. To this end, we adopt a cascade structure of multiple butterflies and develop an iterative algorithm: at each stage, two out of N nodes are selected to form a Givens rotation (which is equivalent to a butterfly), and the best rotation angle is then determined by maximizing the resulting coding gain. We give closed-form solutions for both the node selection and the angle determination, together with some design examples to demonstrate their superiority. Tests on still images and real video sequences show that a remarkable coding performance gain can be achieved by our proposed framework.
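The iterative design can be illustrated with a short sketch. The closed-form solutions derived in the thesis are not reproduced in this abstract, so the rules below are an assumption based on the classical Jacobi argument: a rotation of nodes i and j preserves the sum of their variances, and their product is minimized when the rotated pair is decorrelated. This suggests choosing the pair with the largest squared correlation coefficient and the angle that zeroes their covariance. The function names are hypothetical.

```python
import numpy as np

def coding_gain_db(variances):
    """Ratio of arithmetic to geometric mean of the variances, in dB."""
    variances = np.asarray(variances, dtype=float)
    geometric_mean = np.exp(np.mean(np.log(variances)))
    return 10.0 * np.log10(np.mean(variances) / geometric_mean)

def greedy_givens(cov, num_stages):
    """Cascade of Givens rotations chosen greedily to raise the coding gain.

    Returns the accumulated orthonormal transform and the covariance of the
    transformed signal. The selection and angle rules are Jacobi-style
    assumptions, not necessarily the closed forms derived in the thesis.
    """
    n = cov.shape[0]
    c = np.array(cov, dtype=float)
    transform = np.eye(n)
    for _ in range(num_stages):
        d = np.diag(c)
        # Squared correlation coefficient of every node pair.
        corr2 = c ** 2 / np.outer(d, d)
        np.fill_diagonal(corr2, 0.0)
        i, j = np.unravel_index(np.argmax(corr2), corr2.shape)
        if corr2[i, j] == 0.0:
            break  # already diagonal: no rotation can improve the gain
        # Angle that zeroes the covariance between nodes i and j.
        theta = 0.5 * np.arctan2(2.0 * c[i, j], c[i, i] - c[j, j])
        g = np.eye(n)
        g[i, i] = g[j, j] = np.cos(theta)
        g[i, j] = np.sin(theta)
        g[j, i] = -np.sin(theta)
        c = g @ c @ g.T
        transform = g @ transform
    return transform, c

# Hypothetical 4-point covariance, used only to exercise the sketch.
idx = np.arange(4)
scale = np.sqrt(1.0 + idx)
cov = 0.9 ** np.abs(idx[:, None] - idx[None, :]) * np.outer(scale, scale)
for stages in (0, 2, 4, 8):
    _, c_out = greedy_givens(cov, stages)
    print(stages, "stages:", coding_gain_db(np.diag(c_out)), "dB")
```

Each stage costs one butterfly, so the number of stages trades implementation cost against how closely the cascade approaches the KLT's coding gain.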