IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 18, NO. 1, JANUARY 2008
I. INTRODUCTION
RATE control (RC) aims to achieve good perceptual quality under a given transmission bandwidth constraint. Usually, RC regulates the coded bit stream by adjusting the quantization parameter (QP) whilst optimizing the video presentation quality. To achieve this, a rate-quantization (R-Q) model is often employed to represent the coded bits by means of QP and other parameters such as the variance of a residual macroblock (MB) [1], the mean absolute difference (MAD) of a residual MB [2]-[6], and the percentage of zero quantized coefficients [7]. Unfortunately, using other parameters such as MAD for R-Q modelling causes the chicken-and-egg dilemma [8] in the state-of-the-art video coding standard H.264 [9]: the Lagrangian method [10] employed in H.264 needs QP to be available before mode decision, but until the end of mode decision RC cannot access statistics such as MAD for determining QP.
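As an illustration of how an R-Q model ties the coded bits to the quantizer and to MAD, the following minimal sketch uses the quadratic (Q2) model of [3], commonly written as R = x1·MAD/Q + x2·MAD/Q². The function names, the parameter values and the choice of the quantization step as the variable Q are assumptions made for this example only.

```python
import math


def q2_bits(qstep, mad, x1, x2):
    """Texture bits predicted by the quadratic R-Q model for one unit."""
    return x1 * mad / qstep + x2 * mad / qstep ** 2


def q2_solve_qstep(target_bits, mad, x1, x2):
    """Invert the quadratic model: find the step that yields target_bits.

    With u = 1/Q the model becomes (x2*mad)*u^2 + (x1*mad)*u - target = 0,
    whose positive root is taken.
    """
    if target_bits <= 0 or mad <= 0:
        raise ValueError("target_bits and mad must be positive")
    a = x2 * mad
    b = x1 * mad
    if abs(a) < 1e-12:
        # Model degenerates to the first-order form R = x1*MAD/Q.
        return b / target_bits
    disc = b * b + 4.0 * a * target_bits
    if disc < 0:
        raise ValueError("no real solution for these parameters")
    u = (-b + math.sqrt(disc)) / (2.0 * a)
    if u <= 0:
        raise ValueError("no positive solution for these parameters")
    return 1.0 / u


# Illustrative numbers only (not taken from the paper).
q = q2_solve_qstep(target_bits=1000.0, mad=5.0, x1=100.0, x2=200.0)
```

The sketch also makes the dilemma concrete: the inversion needs MAD as an input, yet in H.264 the true MAD is known only after mode decision, which itself requires the quantizer.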
It is worthwhile to point out that several RC algorithms have been developed to overcome the H.264 RC chicken-and-egg dilemma. Ma et al. [11] propose a two-pass RC scheme with each pass applying a TM5 [2] based method. If the first pass fails to obtain an appropriate QP, the second pass is implemented as a refinement. The encoding complexity is consequently increased. Later, Ma et al. [12] further develop an improved partial two-pass RC algorithm for RC performance improvement, in which a linear R-Q model is proposed. In [8], a one-pass RC algorithm, JVT-G012, is proposed, in which a linear MAD prediction model is used to circumvent the chicken-and-egg dilemma and the conventional MPEG-4 Q2 R-Q model [3] is employed to calculate QP given the allocated bits and predicted coding complexity. Due to its efficiency, JVT-G012
Manuscript received December 4, 2006; revised March 28, 2007. This work
was supported in part by City University of Hong Kong Strategic under Grant
7002023. This paper was recommended by Associate Editor Z. He.
The authors are with the Department of Computer Science, City University
of Hong Kong, Hong Kong (e-mail: wanghl@cs.cityu.edu.hk; cssamk@cityu.edu.hk).
Digital Object Identifier 10.1109/TCSVT.2007.913757
where the mean squared error MSE represents the quality distortion, and the two remaining symbols are the model parameter and the quantization step size, respectively.
Fig. 1. Fitting accuracy of the linear D-Q model. (a) Garden. (b) Paris.
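The fitting accuracy of a linear D-Q model can be checked along the following lines. This is a minimal sketch that assumes the single-parameter form MSE ≈ k·Qstep; the symbol k, the function names and the sample data are illustrative and are not the paper's notation or measurements.

```python
def fit_linear_dq(qsteps, mses):
    """Least-squares slope k of MSE = k * Qstep (line through the origin)."""
    den = sum(q * q for q in qsteps)
    if den == 0:
        raise ValueError("need at least one non-zero quantization step")
    return sum(q * d for q, d in zip(qsteps, mses)) / den


def r_squared(qsteps, mses, k):
    """Coefficient of determination of the fitted line."""
    mean = sum(mses) / len(mses)
    ss_res = sum((d - k * q) ** 2 for q, d in zip(qsteps, mses))
    ss_tot = sum((d - mean) ** 2 for d in mses)
    if ss_tot == 0:
        raise ValueError("distortion samples must not all be equal")
    return 1.0 - ss_res / ss_tot


# Made-up (Qstep, MSE) samples for demonstration.
qsteps = [4.0, 8.0, 16.0]
mses = [2.0, 4.0, 8.0]
k = fit_linear_dq(qsteps, mses)
fit_quality = r_squared(qsteps, mses, k)
```

A coefficient of determination close to 1 over measured (Qstep, MSE) pairs would indicate that the linear form is an adequate description of the D-Q relation for that sequence.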
(2)
where the left-hand side is the resultant optimal quantization step size for the current MB. The other quantities in (2) are the total number of MBs in a frame; the improved MPEG-4 Q2 model parameters for the current MB; the bit budget for the remaining MBs of the frame; the predicted header bit budget of the MB, which is equal to that of the co-located MB in the previous frame of the same type; and the predicted MAD values of the two MBs indexed in (2), which are computed by the linear MAD prediction model [8] to circumvent the H.264 RC chicken-and-egg dilemma. Owing to its simplicity, the sliding-window based data selection mechanism [4] is used to update the model parameters of the improved Q2 model and the linear MAD model by linear regression. Note that the D-Q model parameter is eliminated from (2). Once the optimal quantization step size is obtained from (2), the optimal quantization parameter QP used to encode the MB can be deduced from it.
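Two of the supporting steps mentioned above can be sketched as follows: a linear MAD predictor of the form MAD_pred = a1·MAD_prev + a2 refitted by least squares over a sliding window, and the mapping from a quantization step size to the nearest H.264 QP. The window length, the function names and the fallback values are assumptions for illustration; the step-size table is the standard H.264 one, in which the step doubles for every increase of 6 in QP.

```python
# Quantization step sizes for QP = 0..5; the step doubles every 6 QP values.
_BASE_QSTEP = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def qp_to_qstep(qp):
    """Quantization step size for an H.264 QP in the range 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must lie in 0..51")
    return _BASE_QSTEP[qp % 6] * (2 ** (qp // 6))


def qstep_to_qp(qstep):
    """Nearest valid QP for a (possibly non-standard) step size."""
    return min(range(52), key=lambda qp: abs(qp_to_qstep(qp) - qstep))


def fit_mad_predictor(prev_mads, actual_mads, window=20):
    """Least-squares (a1, a2) over the most recent `window` samples."""
    xs = prev_mads[-window:]
    ys = actual_mads[-window:]
    n = len(xs)
    if n < 2:
        return 1.0, 0.0  # assumed fallback: predict the previous MAD
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 1.0, 0.0
    a1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a1, my - a1 * mx


a1, a2 = fit_mad_predictor([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
predicted_mad = a1 * 4.0 + a2
qp = qstep_to_qp(16.0)
```

Rounding to the nearest tabulated step is one simple choice; an encoder may instead clip the result relative to neighbouring MBs to limit quality fluctuation.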
B. Initial QP Determination
(3)
where the left-hand side is the starting QP value used to encode the first I-frame; BPP (bits per pixel) is computed from the bitrate, the frame rate and the total number of pixels in a frame; the remaining symbols are the predetermined thresholds and
Fig. 2. Relation curves between the best initial quantization parameter and BPP
for News, Foreman, and Mobile Calendar.
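A threshold-based initial QP choice driven by BPP can be sketched as below. BPP is taken as the bitrate divided by the product of frame rate and pixels per frame; the threshold and QP values passed in the example are placeholders chosen for illustration, not values taken from this paper.

```python
def bits_per_pixel(bitrate, frame_rate, width, height):
    """BPP = bitrate / (frame rate * number of pixels in a frame)."""
    pixels = width * height
    if frame_rate <= 0 or pixels <= 0:
        raise ValueError("frame_rate and frame size must be positive")
    return bitrate / (frame_rate * pixels)


def initial_qp_from_bpp(bpp, thresholds, qps):
    """Pick a start QP by comparing BPP against ascending thresholds.

    `qps` must have one more entry than `thresholds`: qps[i] is used when
    bpp <= thresholds[i], and qps[-1] when bpp exceeds every threshold.
    """
    if len(qps) != len(thresholds) + 1:
        raise ValueError("need len(qps) == len(thresholds) + 1")
    for limit, qp in zip(thresholds, qps):
        if bpp <= limit:
            return qp
    return qps[-1]


# Hypothetical CIF example: 384 kbps at 30 frames/s.
bpp = bits_per_pixel(384000, 30, 352, 288)
start_qp = initial_qp_from_bpp(bpp, thresholds=(0.2, 0.6, 1.2), qps=(35, 25, 20, 10))
```

Because such a rule looks at BPP alone, two sequences with the same bitrate and resolution receive the same start QP regardless of their content, which is the limitation that Fig. 2 illustrates for News, Foreman, and Mobile Calendar.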
(6)
where
(7)
(8)
and the remaining symbols form the parameter set. The term defined in (7) indicates that the initial QP value depends on BPP; we use the data of the News sequence to derive the parameters in (7) by linear regression. The term defined in (8) implies that the initial QP adapts itself to the picture content, expressed by EI and IM, of the first I-frame. We use the EI and IM values, together with the differences 0, 5, and 12 as the expected values of this term for News, Foreman, and Mobile Calendar, respectively, to design the model in (8). After trial, we find that (8) is a better choice to model this term
. In this work, we set
and the
(9)
where the two PSNR terms are the PSNR performance of the test algorithm and the PSNR performance resulting from JVT-G012 [8], respectively, and the two bitrate terms are the bitrate performance of the test algorithm and the target bitrate, respectively. Owing to the space limit, we average the PSNR and bitrate measures defined in (9) over the various target bitrates for each test video sequence and list the average results in Table I.
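The averaging step can be sketched as follows. Since the exact form of (9) is not reproduced here, the sketch assumes two common measures: the PSNR gain of the test algorithm over the JVT-G012 reference, and the absolute bitrate deviation from the target in percent. Both definitions, the function name and the sample numbers are assumptions.

```python
def average_rc_metrics(psnr_test, psnr_ref, rate_test, rate_target):
    """Average PSNR gain (dB) and bitrate deviation (%) over target bitrates.

    Each argument is a list with one entry per target bitrate.
    """
    n = len(psnr_test)
    if n == 0 or not (n == len(psnr_ref) == len(rate_test) == len(rate_target)):
        raise ValueError("all inputs must be non-empty and of equal length")
    # Assumed measure 1: PSNR of the test algorithm minus the reference PSNR.
    gains = [t - r for t, r in zip(psnr_test, psnr_ref)]
    # Assumed measure 2: relative mismatch between achieved and target bitrate.
    errors = [100.0 * abs(r - t) / t for r, t in zip(rate_test, rate_target)]
    return sum(gains) / n, sum(errors) / n


# Made-up results for one sequence at two target bitrates.
avg_gain, avg_error = average_rc_metrics(
    psnr_test=[36.0, 38.0],
    psnr_ref=[35.5, 37.5],
    rate_test=[101.0, 198.0],
    rate_target=[100.0, 200.0],
)
```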
From the average PSNR result, we can see that the proposed PRC-1 achieves better PSNR results than the two algorithms [12] and [8]. When the proposed QP initialization scheme is enabled, the PSNR performance is further improved by PRC-2. As for the average bitrate result, all four algorithms obtain similar bitrate performance. In addition, we illustrate the buffer occupancy for two test cases in Fig. 5. From these plots, we can see that the proposed algorithm PRC-1 maintains suitable buffer occupancy levels, similar to those of the algorithm [12], and both of these algorithms perform better than the algorithm [8] in preventing buffer overflow (buffer occupancy higher than 80%) and underflow (buffer occupancy lower than 0%). When the proposed QP initialization scheme is enabled, the stability of the buffer occupancy is further improved by PRC-2 as compared with PRC-1. Moreover, the frame-to-frame PSNR curves are shown in Fig. 6, where we can see that the proposed PRC-1 algorithm obtains similar or better results than [12] and [8]. When the adaptive initialization scheme is enabled, the performance is further improved by PRC-2 as compared with PRC-1.
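The overflow and underflow criteria quoted above (occupancy above 80% or below 0%) can be checked on a per-frame bit trace with a sketch such as the following. The leaky-bucket update, in which each frame adds its coded bits and the channel drains bitrate/frame-rate bits, is an assumed buffer model, and all names and numbers are illustrative.

```python
def simulate_buffer(frame_bits, bitrate, frame_rate, buffer_size, start_fullness=0.0):
    """Track buffer occupancy (%) frame by frame and count violations."""
    if frame_rate <= 0 or buffer_size <= 0:
        raise ValueError("frame_rate and buffer_size must be positive")
    drain = bitrate / frame_rate  # bits removed by the channel per frame
    fullness = start_fullness
    occupancy = []
    for bits in frame_bits:
        fullness += bits - drain
        occupancy.append(100.0 * fullness / buffer_size)
    overflow = sum(1 for o in occupancy if o > 80.0)   # criterion from the text
    underflow = sum(1 for o in occupancy if o < 0.0)   # criterion from the text
    return occupancy, overflow, underflow


# Made-up trace: two large frames followed by nine skipped (zero-bit) frames.
trace = [600, 450] + [0] * 9
occupancy, n_over, n_under = simulate_buffer(trace, bitrate=3000, frame_rate=30, buffer_size=1000)
```

The fullness is deliberately left unclamped so that negative occupancy remains visible as an underflow event.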
Fig. 4. R-D curves. (a) News. (b) Garden. (c) Paris. (d) Mobile Calendar.
TABLE I
COMPARISON OF TOTAL NUMBER OF FRAMES SKIPPED, AVERAGE
ACKNOWLEDGMENT
The authors would like to thank Dr. S. Ma and Dr. Y. Lu for the
useful discussion and the source code of the RC algorithm [12].
Fig. 5. Buffer occupancy. (a) News, 1800 kbps. (b) Mobile Calendar, 80 kbps.
Fig. 6. PSNR curves. (a) Garden, 800 kbps. (b) Paris, 80 kbps.
V. CONCLUSION
In this paper, an R-D jointly optimized RC algorithm with adaptive QP initialization is presented for H.264. A linear model is proposed to model the D-Q relation and a closed-form solution is applied to calculate optimal QP values for MB encoding. The proposed RC algorithm also employs an adaptive initial QP determination scheme which not only depends on BPP but also considers the specific content of video sequences. The experimental results have demonstrated that the proposed RC algorithm outperforms the other two RC algorithms [12] and [8] in terms of R-D performance, the number of frames skipped, and the prevention of buffer overflow and underflow.
REFERENCES
[1] Video Codec Test Model, TMN8, ITU-T/SG16, Portland, Jun. 1997.
[2] Test Model 5 [Online]. Available: http://www.mpeg.org/MPEG/MSSG/tm5
[3] T. Chiang and Y. Q. Zhang, "A new rate control scheme using quadratic rate distortion model," IEEE Trans. Circuits Syst. Video Technol., vol. 7, no. 1, pp. 246-250, Feb. 1997.
[4] H. J. Lee, T. Chiang, and Y. Q. Zhang, "Scalable rate control for MPEG-4 video," IEEE Trans. Circuits Syst. Video Technol., vol. 10, no. 6, pp. 878-894, Sep. 2000.
[5] B. Xie and W. Zeng, "A sequence-based rate control framework for consistent quality real-time video," IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 1, pp. 56-71, Jan. 2006.
[6] I. Ahmad and J. Luo, "On using game theory to optimize the rate control in video coding," IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 2, pp. 209-219, Feb. 2006.
[7] Z. He, Y. K. Kim, and S. K. Mitra, "Low-delay rate control for DCT video coding via ρ-domain source modeling," IEEE Trans. Circuits Syst. Video Technol., vol. 11, no. 8, pp. 928-940, Aug. 2001.
[8] Z. Li, F. Pan, K. P. Lim et al., "Adaptive Basic Unit Layer Rate Control for JVT," Doc. JVT-G012-r1, Thailand, Mar. 2003.
[9] Advanced Video Coding for Generic Audiovisual Services, ITU-T Rec. H.264, Ver. 3, Mar. 2005.
[10] G. J. Sullivan and T. Wiegand, "Rate-distortion optimization for video compression," IEEE Signal Process. Mag., vol. 15, no. 6, pp. 74-90, Nov. 1998.
[11] S. Ma, W. Gao, Y. Lu, and D. Zhao, "Improved Rate Control Algorithm," Doc. JVT-E069, Geneva, Oct. 2003.
[12] S. Ma, W. Gao, and Y. Lu, "Rate-distortion analysis for H.264/AVC video coding and its application to rate control," IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 12, pp. 1533-1544, Dec. 2005.
[13] Y. Liu, Z. G. Li, and Y. C. Soh, "A novel rate control scheme for low delay video communication of H.264/AVC standard," IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 1, pp. 68-78, Jan. 2007.
[14] H. Wang and S. Kwong, "A rate-distortion optimization algorithm for rate control in H.264," in Proc. IEEE ICASSP, Apr. 2007, vol. I, pp. 1149-1152.
[15] I. E. G. Richardson, H.264 and MPEG-4 Video Compression: Video Coding for Next-Generation Multimedia. New York: Wiley, 2003, pp. 177-180.
[16] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Norwell, MA: Kluwer, 1992.
[17] H.264/AVC Reference Software JM9.5 [Online]. Available: http://www.iphome.hhi.de/suehring/tml/