Fast mode decision algorithms for high efficiency video coding (HEVC) and image matting

HKUST Electronic Theses

by Yongfang Shi

THESIS 2013

M.Phil. Electronic and Computer Engineering

xiii, 65 p. : ill. ; 30 cm

Abstract

High Efficiency Video Coding (HEVC) is the next-generation video coding standard beyond H.264/AVC. As a newly established standard mainly targeted at the efficient compression of high-definition (HD) and ultra-HD video content, HEVC adopts many new techniques and more elaborate coding tools, such as a nested quad-tree structure, advanced motion vector prediction (AMVP), and asymmetric motion partition (AMP). In HEVC, the basic processing unit, the coding tree unit (CTU), has a larger size of 64 × 64 luma samples and adopts a nested quad-tree based partitioning scheme that supports a hierarchy of up to 4 levels, compared to the mere 16 × 16 macroblock (MB) and limited choice of block sizes in H.264/AVC. Also, HEVC has 35 intra prediction modes (IPM), while H.264/AVC has at most 9 modes for intra prediction. Although these new techniques contribute substantially to the coding efficiency gain of HEVC, they add significant complexity to the encoder. It is therefore of great importance to study fast algorithms that make the encoder more practical in real applications.
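To give a rough sense of why the quad-tree search is costly, the following sketch counts how many distinct ways a single CTU can be split. It assumes that the "hierarchy of up to 4" means four depth levels (64 × 64 down to 8 × 8); the function names are illustrative and do not come from the thesis or from HM.

```python
from typing import List


def block_sizes(ctu_size: int = 64, levels: int = 4) -> List[int]:
    """Block edge lengths at each quad-tree depth (assumed: 4 levels)."""
    return [ctu_size >> depth for depth in range(levels)]


def count_quadtree_partitions(splits_remaining: int) -> int:
    """Number of distinct quad-tree partitions of one block.

    A block is either left unsplit (1 way) or split into four
    sub-blocks, each of which is partitioned independently.
    """
    if splits_remaining == 0:
        return 1
    return 1 + count_quadtree_partitions(splits_remaining - 1) ** 4


if __name__ == "__main__":
    print(block_sizes())                 # edge lengths per depth
    print(count_quadtree_partitions(3))  # partitions of one 64x64 CTU
```

Under these assumptions a single CTU already admits tens of thousands of partitions, before prediction modes are even considered, which is the kind of search space that fast decision algorithms aim to prune.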

To speed up the HEVC encoder, we propose two fast algorithms in this thesis. The first, the fast PU quad-tree depth decision (FPDD) algorithm, speeds up the encoding process by skipping less probable PU sizes. It does so by exploiting the inherent correlation of the PU quad-tree structure between the current CTU and its spatial and temporal neighbors. To reduce error propagation, we also propose a confidence grading scheme that prevents CTUs with poor predictions from being referenced by others. Results show that the FPDD algorithm provides an average encoding time reduction of 20.0% (up to 39.3%) while causing negligible RD performance loss (0.2% BD-Rate increase on average) compared with HM 7.0.
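The idea of predicting a depth range from neighbors while ignoring unreliable ones can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the actual prediction rule or the confidence grades, so the `NeighborCTU` record, the boolean `confident` flag, and the min/max rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

MAX_DEPTH = 3  # assumed: depths 0..3 cover 64x64 down to 8x8


@dataclass
class NeighborCTU:
    """Hypothetical summary of an already coded spatial/temporal neighbor."""
    min_depth: int
    max_depth: int
    confident: bool  # stand-in for the thesis's confidence grade


def candidate_depth_range(neighbors: Iterable[NeighborCTU]) -> Tuple[int, int]:
    """Return the (lowest, highest) depth the encoder would still test."""
    trusted = [n for n in neighbors if n.confident]
    if not trusted:
        # No reliable reference: fall back to the full search.
        return (0, MAX_DEPTH)
    lowest = min(n.min_depth for n in trusted)
    highest = max(n.max_depth for n in trusted)
    return (max(0, lowest), min(MAX_DEPTH, highest))


if __name__ == "__main__":
    neighbors = [
        NeighborCTU(1, 2, True),
        NeighborCTU(0, 3, False),  # poorly predicted, so not referenced
        NeighborCTU(2, 3, True),
    ]
    print(candidate_depth_range(neighbors))
```

In this sketch the untrusted neighbor is excluded, so depth 0 is skipped for the current CTU; with no trusted neighbors the full depth range is kept, which limits how far a bad prediction can propagate.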

The second, the fast rough mode decision (FRMD) algorithm, further reduces the number of IPM candidates passed to the full RDO process. In FRMD, we analyze the costs generated by rough mode decision (RMD), which is already incorporated in the HM software. We find that the RMD costs, listed by mode number, generally follow the same trend as the rate-distortion optimization (RDO) costs. Further, the local salient modes, whose RMD costs drop significantly compared with those of adjacent modes, tend to be promising competitors for the optimal mode. Based on these observations, we further reduce the number of candidates for the RDO process. Experimental results show that the FRMD algorithm achieves an average encoding time saving of 19.0% (up to 33.6%) while causing negligible RD performance loss (0.4% BD-Rate increase on average) compared with the HM 7.0 anchor.
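A minimal sketch of picking out local salient modes from a list of RMD costs is given below. The abstract does not define "significant drop" numerically, so the relative threshold `drop_ratio` and the strict local-minimum test are assumptions made for illustration; positions in the list stand in for adjacent mode numbers.

```python
from typing import List, Sequence


def local_salient_modes(costs: Sequence[float], drop_ratio: float = 0.1) -> List[int]:
    """Indices whose cost is a local minimum with a significant drop.

    A position counts as salient when its cost is below both neighbors
    and at least `drop_ratio` (hypothetical threshold) lower than the
    cheaper of the two neighbors.
    """
    salient = []
    for i in range(1, len(costs) - 1):
        left, mid, right = costs[i - 1], costs[i], costs[i + 1]
        if mid < left and mid < right:
            cheaper_neighbor = min(left, right)
            if cheaper_neighbor - mid >= drop_ratio * cheaper_neighbor:
                salient.append(i)
    return salient


if __name__ == "__main__":
    # Made-up costs, indexed by position among adjacent modes.
    rmd_costs = [100, 90, 60, 95, 97, 96, 99, 50, 98]
    print(local_salient_modes(rmd_costs))
```

With these made-up costs, positions 2 and 7 are reported, while the shallow dip at position 5 is ignored because its drop is below the threshold. Only such candidates would then be forwarded to the more expensive RDO stage.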

Apart from the video coding research above, this thesis also presents my recent work on natural image matting, a classical topic that has attracted increasing attention recently. Natural image matting refers to the problem of extracting regions of interest, such as a foreground object, from an image based on user inputs such as scribbles or a trimap. More specifically, we need to estimate the foreground color, the background color, and the corresponding opacity, which is an inherently ill-posed problem. Inspired by closed-form matting and KNN matting, we extend the local color line model, which assumes linear color clustering within a small local window, to a nonlocal feature-space neighborhood, and introduce a novel nonlocal color ball model. With the proposed model, we capitalize on the nonlocal principle, which gathers pixels of similar appearance from the whole image. A new affinity matrix is defined to achieve better clustering, which in turn ensures better prediction of the alpha matte. Finally, a closed-form solution to the matting problem is obtained by solving a sparse linear system. Experimental evaluations on benchmark datasets show that our results have higher accuracy and better visual quality than some state-of-the-art matting algorithms.
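The final step, recovering alpha from a sparse linear system, can be illustrated with the generic graph-Laplacian formulation shared by closed-form matting and KNN matting: solve (L + lambda*D) alpha = lambda*D*s, where L is the Laplacian of the affinity matrix, D marks user-constrained pixels, and s holds their known alpha values. The sketch below is not the nonlocal color ball model itself: the affinity weights are placeholders, and a simple Gauss-Seidel iteration stands in for a sparse solver.

```python
from typing import Dict, List, Tuple


def solve_alpha(
    n: int,
    affinity: Dict[Tuple[int, int], float],
    known: Dict[int, float],
    lam: float = 100.0,
    iters: int = 500,
) -> List[float]:
    """Gauss-Seidel solve of (L + lam*D) alpha = lam*D*s.

    `affinity` holds symmetric weights, one entry per pixel pair;
    `known` maps constrained pixels to their alpha. Every pixel is
    assumed to have at least one neighbor or a constraint.
    """
    nbrs: List[List[Tuple[int, float]]] = [[] for _ in range(n)]
    for (i, j), w in affinity.items():
        nbrs[i].append((j, w))
        nbrs[j].append((i, w))

    alpha = [0.5] * n
    for _ in range(iters):
        for i in range(n):
            degree = sum(w for _, w in nbrs[i])
            d_i = lam if i in known else 0.0
            rhs = sum(w * alpha[j] for j, w in nbrs[i]) + d_i * known.get(i, 0.0)
            alpha[i] = rhs / (degree + d_i)
    return alpha


if __name__ == "__main__":
    # Five pixels in a chain; the ends are scribbled as background/foreground.
    chain = {(i, i + 1): 1.0 for i in range(4)}
    print(solve_alpha(5, chain, known={0: 0.0, 4: 1.0}))
```

On this toy chain the unknown pixels settle into a smooth ramp between the two scribbled ends. In a real matting system the quality of the result depends almost entirely on how the affinity weights are defined, which is where the thesis's proposed affinity matrix would enter.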

Copyrighted to the author. Reproduction is prohibited without the author’s prior written consent.

Details

Collection: HKUST Electronic Theses
Degree: M.Phil.
Department: Electronic and Computer Engineering
Authors: Shi, Yongfang
Subjects: Video compression; Digital video; Image compression
Language: English
Call number: Thesis ECED 2013 Shi
DOI: 10.14711/thesis-b1240243
