
Joint Collaborative Team on Video Coding (JCT-VC)

of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11


1st Meeting: Dresden, DE, 15-23 April, 2010

Document: JCTVC-A020 (r1)

Title:      Predictive Adaptive Transform Coefficients Scan Ordering for Inter-Frame Coding

Status:     Input Document to JCT-VC

Purpose:    Information

Author(s) or Contact(s):
            Xiang Li (Santa Clara University)
            Lingzhi Liu (Huawei Technologies Co. Ltd.)
            Nam Ling (Santa Clara University)
            Jianhua Zheng (Hisilicon Technologies Co. Ltd.)
            Philipp Zhang (Hisilicon Technologies Co. Ltd.)

Source:     Santa Clara University & Huawei Technologies Co. Ltd.

Tel:        +1-408-5545567
Email:      xiang.li@gmail.com

_____________________________

Abstract
In conventional video coding, quantized transform coefficients are typically scanned in a zig-zag pattern. This scanning order, however, is not always optimal for entropy coding. To achieve a better entropy coding gain in inter-frame coding, we propose a predictive adaptive scan ordering scheme for quantized transform coefficients and show that additional inter-frame coding redundancy can be removed by the proposed method. The scan orders are dynamically updated based on the probabilistic distribution of the quantized coefficients in previous frames.

1 Introduction
Transform coding has been widely used in many practical hybrid image and video compression systems. In H.264/MPEG-4 AVC [1], residual images are transformed using an integer transform with properties similar to those of the DCT. Studies on the distributions of DCT coefficients of images and videos [2] have shown that the AC coefficients usually follow Laplacian distributions, and that the variances of these distributions decrease as the coefficient frequency increases. To exploit this energy concentration, the 2-D array of transform coefficients is scanned in a zig-zag manner and converted to a 1-D array that is passed to the entropy coder.
However, recent research has shown that a fixed zig-zag order is not always the optimal way to scan quantized coefficients. Algorithms [3]-[5] have been proposed to improve the scanning schemes for H.264 intra prediction. Yoo et al. [3] describe an adaptive scanning scheme in which the scanning patterns are updated based on the probabilistic distribution of the quantized coefficients of previous macroblocks for intra modes. Choi et al. [4] present an adaptive coefficient scanning method that selects among six alternative scanning orders according to the intra prediction mode. Kim et al. [5] propose an adaptive scanning method that uses the similarity of neighboring pixels to achieve better intra coding performance. Despite these improvements to intra prediction scanning, little work has been done on finding a more efficient scan order for inter-frame prediction, with which a large portion of the data is compressed.
In this contribution, we propose an efficient scanning scheme for H.264 inter-frame coding. With this method, scan orders are adaptively updated according to the distribution of previously quantized transform coefficients, and additional inter coding redundancy can be removed.


2 Analysis of the Distribution of Transform Coefficients


We observed that the distributions of quantized transform coefficients for the same inter prediction mode are very similar. This similarity exists not only among blocks of the same mode within one frame, but also between two consecutive frames.
To observe the distribution of transform coefficients, we sum the absolute values of the corresponding quantized coefficients over all blocks of the same mode. We choose this measure because both CAVLC and CABAC code magnitudes and signs separately, and the magnitude is the more significant part when coding a coefficient. Furthermore, the resulting sum matrix typically captures the overall distribution of the coefficients in a frame.
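As a rough illustration (not part of the contribution itself), the sketch below shows one way such per-mode absolute-sum matrices could be accumulated for a frame; the block list, mode indexing, and array layout are assumptions made only for this example.

import numpy as np

def absolute_sum_matrices(blocks, modes, n=4, num_modes=2):
    """Accumulate per-mode sums of absolute quantized coefficients over one frame.

    blocks: iterable of n x n arrays of quantized transform coefficients
    modes:  iterable of inter prediction mode indices (0 .. num_modes-1), one per block
    (Both inputs are hypothetical, for illustration only.)
    """
    sums = [np.zeros((n, n), dtype=np.int64) for _ in range(num_modes)]
    for coeffs, mode in zip(blocks, modes):
        # Only magnitudes are accumulated: CAVLC and CABAC code magnitude and sign separately.
        sums[mode] += np.abs(np.asarray(coeffs, dtype=np.int64))
    return sums

Comparing the matrices obtained for the same mode in two consecutive frames gives the kind of similarity visualized in Figure 1.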
The 3-D surfaces in Figure 1 show the distributions for the video sequence Flower, where different colors denote different value ranges. From Figure 1, one can see that the distributions of the transform coefficients of the same mode in two consecutive frames, (a) and (b), and likewise (c) and (d), are highly correlated, whereas the distributions of the transform coefficients of different modes in the same frame, (a) and (c), and likewise (b) and (d), are poorly correlated. Similar correlations were observed in other sequences. We therefore believe that in most cases the distribution of transform coefficients can be predicted from the statistics of the coefficients of the same mode in the previous frame.

Figure 1. Sequence Flower, matrices of absolute quantized coefficient sums. The X and Y axes are the indices of each matrix; the Z axis is the sum of the absolute coefficients. (a) 1st P frame, mode 16x16; (b) 2nd P frame, mode 16x16; (c) 1st P frame, mode 16x8; (d) 2nd P frame, mode 16x8.

3 Adaptive Scanning Algorithm


Based on our analysis, we group the blocks of a frame by their inter prediction modes. A new scan order is generated for each prediction mode, and these new scan orders are then applied to the next inter-predicted frame.
The core of the technique is the procedure for generating the adaptive scan orders. Our method is prediction-based: the statistics of the coefficients in the current frame are used to generate the scan orders for the next frame. Because the encoder and the decoder apply exactly the same procedure, no side information needs to be sent from the encoder to the decoder.
Figure 2 (a) and (b) show the block diagrams of the adaptive scanning algorithm integrated into the H.264 encoder and decoder. The algorithm has three steps:
1) For each inter-predicted frame, initialize a set of 2-D arrays {S1[n][n], S2[n][n], ..., Sm[n][n]}, where m is the number of inter prediction modes and n is the size of the transform in each dimension (n = 4 for the 4x4 transform, n = 8 for the 8x8 transform). Si holds the summation matrix of the coefficients for mode i.
2) For each block of size n×n:
a) If this is the first P frame, use the standard zig-zag scan order; otherwise, use the adaptive order generated for its inter prediction mode.
b) Add the absolute values of the quantized coefficients of this block to Si[n][n], where i is the inter prediction mode selected for this block.


Figure 2. Flow chart of the coding procedure: (a) encoder side; (b) decoder side.

3) For each inter prediction mode i (i = 1 to m):
a) Sort the values in Si[n][n] in descending order; while sorting, keep track of the original index of each value.
b) Construct the new scan order from the sorted original indices. This becomes the scan order for this mode in the next inter-predicted frame (a sketch of this procedure is given below).
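The following sketch (an illustration under assumed data structures, not the reference implementation) shows how a new scan order could be derived from a summation matrix Si by a descending sort that retains the original positions, and how a block would then be serialized with that order.

import numpy as np

def scan_order_from_sums(S):
    """Derive a scan order from a summation matrix S.

    Positions are sorted by descending summed magnitude; a stable sort keeps
    ties in raster order, so the encoder and decoder derive identical orders.
    """
    n = S.shape[0]
    flat_order = np.argsort(-S.flatten(), kind="stable")
    return [(int(i) // n, int(i) % n) for i in flat_order]

def scan_block(coeffs, order):
    """Serialize a 2-D block of quantized coefficients along the given scan order."""
    return [coeffs[r][c] for r, c in order]

For a 4x4 summation matrix this yields 16 positions. Since the encoder and the decoder both run the same routine on sums gathered from the already coded frame, the updated order for the next frame requires no additional signalling.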
The computational complexity introduced by this algorithm is negligible. For each block in a frame, one matrix addition is required, and for each inter prediction mode, one sort is required per frame. Both the addition and the sorting procedures can be implemented using only binary operations.
Unlike some other adaptive coefficient scanning algorithms [4][5], which select the scan order from a set of predetermined scan orders, our method generates a new set of scan orders for each frame. It has low computational complexity because the algorithm runs at the frame level. Another advantage of our algorithm is that it considers not only the locations of the zero coefficients but also the global ordering of all coefficients, so the scanned values remain within their context.

4 Experimental Results
Average Bit Reduction Between the Reference (JM 15.1) and the Proposed Scheme.

Sequence     QP   PSNR (Y)    P frame ave.     PSNR (Y)     P frame ave.     Ave. bit
                  (JM 15.1)   bits/frame       (proposed)   bits/frame       increase
                              (JM 15.1)                     (proposed)
city         23   40.97       109999           40.98        104115           -5.63%
             28   36.25       62476            36.28        59009
             33   31.70       30932            31.78        29589
             38   27.19       12145            27.27        11808
coastguard   23   39.59       98366            39.60        94515            -4.70%
             28   35.43       52347            35.46        50299
             33   31.56       22197            31.65        21508
             38   28.19       6848             28.26        6746
flower       23   39.37       44469            39.44        43990            -2.00%
             28   35.46       18472            35.53        18292
             33   31.91       7776             31.96        7747
             38   28.64       3098             28.63        3072
mobile       23   39.74       153130           39.74        149276           -2.15%
             28   34.99       80933            35.02        79112
             33   30.49       32378            30.54        32044
             38   26.51       10660            26.52        10725
Ave.                                                                         -3.62%

The implementation on kta2.6r1 is finished; more results will be provided.

5 Conclusion
We propose a prediction-based adaptive transform coefficient scanning method for inter-frame coding in H.264. The proposed method groups the blocks by their inter prediction modes and uses a different scanning order for each group. The adaptive scan orders are generated from the statistics of the same modes in the previous inter-predicted frame. Experimental results show that the proposed method improves the coding efficiency by 3.62% on average and by up to 5.63%. Grouping the blocks by inter prediction mode is intuitive and straightforward, and is shown to improve the efficiency of context-based entropy coding.
Compared to other adaptive scan order methods, the proposed method has several advantages:
1) The scan order is updated once per frame, so the characteristics of each frame are taken into account in a timely manner.
2) The scan order is predicted from the previously coded frame. This information is available at both the encoder and the decoder, so no side information is required to transmit the updated orders.
3) Only binary operations and a small amount of memory are needed to calculate and store the intermediate results; the complexity is negligible.

6 References
[1] T. Wiegand, G. Sullivan, G. Bjøntegaard, and A. Luthra, "Overview of the H.264/AVC Video Coding Standard," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 560-576, July 2003.
[2] E. Lam and J. W. Goodman, "A Mathematical Analysis of the DCT Coefficient Distributions for Images," IEEE Trans. Image Processing, vol. 9, no. 10, pp. 1661-1666, Oct. 2000.
[3] Y.-J. Yoo and S. Jeong, "Adaptive Scan Pattern for Quantized Coefficients in Intra Coding of H.264," IEICE Trans. Inf. & Syst., vol. E92-D, no. 4, pp. 750-752, Apr. 2009.
[4] B.-D. Choi, J.-H. Kim, and S.-J. Ko, "Adaptive Coefficient Scanning Based on the Intra Prediction Mode," ETRI Journal, vol. 29, no. 5, pp. 694-696, Oct. 2007.
[5] D.-Y. Kim, D.-K. Kim, and Y.-L. Lee, "Adaptive Scanning Using Pixel Similarity for H.264/AVC," IEICE Trans. Fundamentals, vol. E90-A, no. 5, pp. 1709-1711, May 2007.
[6] Joint Video Team Reference Software JM15.1, http://iphome.hhi.de/suehring/tml/
[7] G. Bjøntegaard, "Calculation of Average PSNR Differences Between RD-Curves," ITU-T Q.6/SG16 VCEG, VCEG-M33, April 2001.

7 Patent rights declaration(s)


Huawei Technologies Co., Ltd. may have IPR relating to the technology described in this
contribution and, conditioned on reciprocity, is prepared to grant licenses under reasonable and
non-discriminatory terms as necessary for implementation of the resulting ITU-T Recommendation
| ISO/IEC International Standard (per box 2 of the ITU-T/ITU-R/ISO/IEC patent statement and
licensing declaration form).

