
Proceedings of the 2007 International Conference on Wavelet Analysis and Pattern Recognition, Beijing, China, 2-4 Nov. 2007

A New Unit Layer Linear Rate Control Algorithm for H.264 Based on PID Controller

PING XU, YING-LE FAN, YI LI, QUAN PANG

Institute of Biomedical Engineering & Instrument, Hangzhou Dianzi University, Hangzhou, 310018
E-MAIL: xuping@hdu.edu.cn, fan@hdu.edu.cn, liyi@hdu.edu.cn, pq5142@163.com

Abstract

This paper presents a new unit layer rate control algorithm for H.264 that combines a PID controller with a linear rate model. For various video coding standards, such as MPEG-2, H.263, MPEG-4 and H.264, the linear rate control algorithm has been reported to achieve more accurate and robust rate control. Human viewers, however, are sensitive not only to spatial quality but also to temporal quality. To obtain a better tradeoff between spatial and temporal quality and a more consistent quality, the PID controller is introduced into the unit layer linear rate control algorithm for H.264. Scene changes are also handled effectively. Experimental results show that the proposed rate control algorithm not only tracks the target bit rate accurately and achieves a significantly smaller bit rate estimation error, like the linear rate control algorithm, but also improves the temporal quality while keeping high spatial quality, and that the quality of frames with scene changes is clearly improved.

Keywords: H.264, PID Controller, Linear Rate Model, Rate control, Basic unit, Bit allocation, Scene Changes

1. Introduction

Rate control plays an important role in H.264[1]. Many existing rate control algorithms are based on a quadratic rate-distortion (R-D) model. Lee et al.[2] proposed a scalable rate control algorithm which can be applied simultaneously at the frame layer, object layer and macroblock layer. The mean absolute difference (MAD) parameter and the overhead have been introduced into the quadratic R-D model to estimate the target bit rate accurately. Li et al.[3][4][5] proposed the rate control algorithm adopted by the Joint Video Team (JVT) for H.264/AVC, which employs a linear MAD prediction model to resolve the dilemma between rate control and rate-distortion optimization (RDO). The concept of the basic unit, introduced by Li, is meant to obtain a good tradeoff between bit fluctuation and coding efficiency. The basic unit can be a macroblock (MB), a slice, or a frame. In [7][8], He et al. presented a rate-distortion (R-D) model based on the fraction of zeros among the quantized DCT coefficients (denoted as ρ) and a rate control method based on this model. This method is reported to provide more accurate bit rate estimation than the existing rate control methods. The PID-based rate control method proposed in [9][10] can achieve an accurate bit rate and obtain more consistent quality while keeping high spatial quality.

Considering that more accurate bit rate estimation can be achieved by He's method and better picture quality by the PID controller, in this paper we propose a new rate control algorithm for H.264 that is based on the linear rate model and the PID controller. It achieves a small bit rate estimation error and high spatial picture quality while improving temporal picture quality. We also apply a scene-change handling method to the PID controller to lower the impact of a scene change. Because of the high computational complexity of the quadratic R-D model in [9][10], in this paper we only use the INTER_16×16 mode to perform the pre-analysis process, which yields the distribution of transform coefficients and allows more accurate bit allocation.

This paper is structured as follows. Section 2 introduces the linear rate model for H.264. The PID controller is described in Section 3. Section 4 gives the details of the rate control algorithm. Section 5 presents the experimental results. Section 6 draws the conclusions.

2. Linear rate model for H.264

In [7][8], He et al. conclude that there is a linear relationship between ρ and the rate of a frame in coding standards such as JPEG, MPEG-2, H.263 and MPEG-4, which can be written as

$$R(\rho) = \theta (1 - \rho) \tag{1}$$

where R is the bit rate and θ is the model parameter. Although the newest video coding standard, H.264, is quite different from the previous standards, a linear relationship between R and ρ still exists[12]. Frame layer rate control can achieve a high PSNR but a large bit fluctuation, while MB layer rate control has a small bit fluctuation with a slight loss in PSNR. A basic unit layer rate

1-4244-1066-5/07/$25.00 ©2007 IEEE



control can be used to obtain a good trade-off.

3. PID controller

Because of its simplicity and good performance, the PID controller is by far the most popular feedback controller in the field of automatic control. We use the PID technique to keep the buffer occupancy around the target buffer fullness and to minimize the deviation between the target buffer fullness and the actual buffer fullness (see Fig.1). The PID controller can be defined as

$$PID_t = K_p \left( E_t + \frac{1}{T_i} \int_0^t E_\tau \, d\tau + T_d \frac{dE_t}{dt} \right) \tag{2}$$

where $K_p$, $T_i$ and $T_d$ are the proportional gain, integral factor, and derivative factor, respectively. The error signal $E_t$, which measures the difference between the target buffer fullness $B_{target,t}$ and the current buffer fullness $B_{f,t}$ at time t, is defined as

$$E_t = \frac{B_{target,t} - B_{f,t}}{B_{target,t}} \tag{3}$$

There are three terms in (2). The first is the proportional action, which is the main component and reduces the error between the current fullness and the target buffer fullness, but cannot eliminate the error completely. The second is the integral action, which eliminates the steady-state error. The last is the derivative action, which improves the transient response and increases the stability of the system.

Fig.1 PID control system

In the discrete case, the PID controller becomes

$$PID_t = K_p E_t + \frac{K_p \Delta t}{T_i} \sum_{j=0}^{i(t)} E_j + K_p T_d \frac{\Delta E_t}{\Delta t} = K_p \left( E_t + K_i \sum_{j=0}^{i(t)} E_j + K_d \frac{\Delta E_t}{\Delta t} \right) \tag{4}$$

where $K_i = \Delta t / T_i$ and $K_d = T_d$ are the integral and derivative parameters, respectively.

4. Rate control algorithm

A. Frame Layer Bit Allocation

For simplicity, we consider only IPPP type sequences. At the beginning of the ith GOP (Group of Pictures), the total number of bits allocated to the ith GOP is computed as follows:

$$S_i(1) = \frac{R_i(1)}{f} N_i + S_{i-1}(N_{i-1}) \tag{5}$$

$S_i$ is updated frame by frame as follows:

$$S_i(j) = S_i(j-1) - b_i(j-1) + \frac{R_i(j) - R_i(j-1)}{f} (N_i - j + 1) \tag{6}$$

where $b_i(j)$ and $R_i(j)$ are the coding bit count and the available channel bandwidth of the jth frame of the ith GOP, $N_i$ is the number of frames in the ith GOP and $f$ is the predefined frame rate.

The initial target buffer occupancy is

$$B_{target,i}(1) = V_i(1) \tag{7}$$

where $V_i(1)$ denotes the actual bit count after coding the I frame, and $B_{target,i}(j)$ denotes the target buffer occupancy of the jth frame of the ith GOP. The target buffer occupancy of the (j+1)th frame of the ith GOP is

$$B_{target,i}(j+1) = B_{target,i}(j) - \frac{B_{target,i}(1) - V_s / 8}{N_p} \tag{8}$$

where $V_s$ denotes the total buffer size and $N_p$ the number of uncoded P frames of the ith GOP.

The target bit count of the jth frame of the ith GOP is

$$\tilde{T}_i(j) = (1 + PID_i(j)) \frac{R_i(j)}{f} \tag{9}$$

To deal with scene changes, the input of the PID controller is adjusted by

$$E_t = \Delta E_t = 0, \quad \text{if } |E_t| > 1.5 \tag{10}$$

Meanwhile, the number of remaining bits should also be considered when the target bit count is computed:

$$\hat{T}_i(j+1) = \frac{S_i(j)}{N_p} \tag{11}$$

The final target bit count is then

$$T_i(j+1) = \beta \hat{T}_i(j+1) + (1 - \beta) \tilde{T}_i(j+1) \tag{12}$$

where β is a constant whose typical value is 0.5.
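As an illustration of how Eqs. (3), (4) and (9)-(12) fit together, the following is a minimal Python sketch of the frame-level target bit computation. It is not the authors' implementation. The names `DiscretePID` and `frame_target_bits` are hypothetical, the step size of one frame (`dt = 1`) and the choice to store a zero error after a detected scene change are assumptions, and only the gains (0.2, 0.1, 0.05) and β = 0.5 come from the paper.

```python
from dataclasses import dataclass


@dataclass
class DiscretePID:
    """Discrete PID controller in the form of Eq. (4)."""
    kp: float = 0.2    # gains as reported in the experimental section
    ki: float = 0.1
    kd: float = 0.05
    dt: float = 1.0    # assumption: one control step per coded frame
    error_sum: float = 0.0
    prev_error: float = 0.0

    def update(self, target_fullness: float, fullness: float) -> float:
        # Eq. (3): buffer error normalised by the target fullness.
        error = (target_fullness - fullness) / target_fullness
        delta = error - self.prev_error
        # Eq. (10): a very large error is treated as a scene change, so
        # the error and its difference are both zeroed for this step.
        if abs(error) > 1.5:
            error = 0.0
            delta = 0.0
        self.error_sum += error
        self.prev_error = error  # assumption: the zeroed error is stored
        # Eq. (4): proportional, integral and derivative terms.
        return self.kp * (error
                          + self.ki * self.error_sum
                          + self.kd * delta / self.dt)


def frame_target_bits(pid_out: float, bandwidth: float, frame_rate: float,
                      remaining_bits: float, remaining_p_frames: int,
                      beta: float = 0.5) -> float:
    """Blend the buffer-driven and budget-driven targets, Eqs. (9)-(12)."""
    buffer_target = (1.0 + pid_out) * bandwidth / frame_rate  # Eq. (9)
    budget_target = remaining_bits / remaining_p_frames       # Eq. (11)
    return beta * budget_target + (1.0 - beta) * buffer_target  # Eq. (12)
```

A caller would invoke `update` once per frame with the target and actual buffer fullness, then pass the result to `frame_target_bits` together with the channel bandwidth, the frame rate, the bits left in the GOP and the number of uncoded P frames. Keeping the controller state in one object makes the scene-change rule of Eq. (10) a local decision.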


B. The Computation of ρ

In H.264, the coefficients computed by the 4×4 integer-to-integer transform are quantized by a nonlinear scalar quantizer. The quantization parameter QP ranges from 0 to 51. The chroma quantization parameter $QP_{chroma}$ can be derived from the luma quantization parameter $QP_{luma}$. Each time QP increases by 6, the quantization step Qstep doubles. A transform coefficient $x_{ij}$, $0 \le i, j < 4$, is quantized as

$$\hat{x}_{ij}(QP) = \mathrm{Round}\left( \frac{x_{ij} M(i,j) + \lambda 2^{qbits}}{2^{qbits}} \right) \tag{13}$$

$$qbits = 15 + \mathrm{floor}(QP / 6) \tag{14}$$

where $M(i,j)$ depends on QP and is given by Table I [13]. When QP increases by 6, $M(i,j)$ stays the same. λ is the threshold, which is 1/3 for the intra mode and 1/6 for the inter mode.

TABLE I
VALUE OF M(i, j)

QP   Positions (0,0), (0,2), (2,0), (2,2)   Positions (1,1), (1,3), (3,1), (3,3)   Other positions
0    13107                                  5243                                   8066
1    11916                                  4660                                   7490
2    10082                                  4194                                   6554
3    9362                                   3647                                   5825
4    8192                                   3355                                   5243

ρ can be computed by

$$\rho(QP) = \frac{1}{L} \sum_{|x_{ij}| \le (1 - \lambda) 2^{qbits} / M(i,j)} D(x) \tag{15}$$

where $D(x)$ denotes the distribution of transform coefficients in a macroblock row and $L$ the number of transform coefficients in the macroblock row.

From (1),

$$\rho_1 = \frac{R_0 - R_1 (1 - \rho_0)}{R_0} \tag{16}$$

where $R_0$ and $R_1$ are the initial and target bit counts, and $\rho_0$ and $\rho_1$ are the initial and target values of ρ, respectively.

C. The Pre-analysis Process

Here we consider only the P frame case. The pre-analysis process aims to obtain, at small computational cost, the distribution of transform coefficients of the frame being coded and to avoid the dilemma between rate control and RDO. Before RDO, all macroblocks of the current frame are coded in the INTER_16×16 mode with the mean quantization parameter of all basic units of the previous frame, which yields the transform coefficient distribution $D(x)$ of the current frame. The initial value $\hat{\rho}_{l,i}(j+1)$ from the lth basic unit to the end of the current frame is then obtained according to (15), and the numbers of header bits, motion bits and texture bits of each basic unit are saved temporarily.

D. Basic Unit Level Bit Allocation

The quantization parameter of the current basic unit is determined from the number of texture bits available for the uncoded part of the current frame, which is calculated as follows:

$$b_{l,i}(j+1) = \begin{cases} T_{r,i}(j+1) \dfrac{\hat{b}_{l,i}(j+1) - \hat{m}_{l,i}(j+1)}{\hat{b}_{l,i}(j+1)}, & \hat{b}_{l,i}(j+1) > 0 \\ T_{r,i}(j+1), & \hat{b}_{l,i}(j+1) = 0 \end{cases} \tag{17}$$

where $b_{l,i}(j+1)$ denotes the target number of texture bits from basic unit l to the end of the frame, and $T_{r,i}(j+1)$ denotes the number of bits available for the remaining uncoded basic units, with initial value $T_i(j+1)$. The total number of basic units is N. $\hat{b}_{l,i}(j+1)$ and $\hat{m}_{l,i}(j+1)$ denote the total bits and the header bits, respectively, from basic unit l to the end of the frame in the pre-analysis process.

E. The Computation of the Quantization Parameter

The quantization parameter of a macroblock row is computed according to the following cases.

Case 1: If the current macroblock row is the first one in the current P frame, the quantization parameter is given by

$$QP_{0,i}(j+1) = QP_i(j) \tag{18}$$

Case 2: If $T_{r,i}(j+1) < 0$, the quantization parameter is given by

$$QP_{l,i}(j+1) = QP_{l-1,i}(j+1) + 1 \tag{19}$$

To maintain the visual quality, the quantization parameter is further bounded by

$$QP_{l,i}(j+1) = \max\{QP_i(j) - 2, \min\{QP_i(j) + 2, QP_{l,i}(j+1)\}\} \tag{20}$$

Case 3: In all other cases, the quantization parameter is given by the following steps:


Step 1: Compute the available texture bits $b_{l,i}(j+1)$ of the uncoded part according to Section D.

Step 2: Compute the target ρ of the uncoded part by (16):

$$\rho_{l,i}(j+1) = \frac{\hat{b}_{l,i}(j+1) - \hat{m}_{l,i}(j+1) - b_{l,i}(j+1) (1 - \hat{\rho}_{l,i}(j+1))}{\hat{b}_{l,i}(j+1) - \hat{m}_{l,i}(j+1)} \tag{21}$$

The target $QP_{l,i}(j+1)$ of the current basic unit is then computed from the transform coefficient distribution and (15).

Step 3: To reduce blocking artifacts, the quantization parameter is bounded by

$$QP_{l,i}(j+1) = \max\{QP_{l-1,i}(j+1) - 1, \min\{QP_{l,i}(j+1), QP_{l-1,i}(j+1) + 1\}\} \tag{22}$$

Step 4: $QP_{l,i}(j+1)$ is used to perform RDO and code all the macroblocks of the current basic unit.

Step 5: Compute the coding bit count of the current basic unit, update $T_{r,i}(j+1)$, and return to Step 1 to code the next basic unit.

5. Experimental results

The implementations of the unit layer linear rate control algorithm (LRC) and of the proposed method are both based on JM8.4. We use two groups of QCIF test sequences: 1) the typical sequences container, silent, news, carphone and miss_am, and 2) the non-typical sequence suize_trevor, which is the concatenation of the suize and trevor sequences. The basic unit is a macroblock row of 11 macroblocks. For the proposed method, we adopt fixed PID coefficients ($K_p = 0.2$, $K_i = 0.1$, $K_d = 0.05$) to cope with various coding environments. The mean bit rate estimation error (MBEE) is used to measure the accuracy of the bit rate estimation:

$$MBEE = \frac{1}{N_F} \sum_{i=0}^{N_F - 1} \frac{|R_{i,target} - R_{i,encoded}|}{R_{i,target}} \tag{23}$$

where $N_F$ is the total number of frames, i is the index of the coded frame, and $R_{i,target}$ and $R_{i,encoded}$ denote the target bit count and the actual coding bit count of the ith frame, respectively.

For the typical sequences, the test conditions are as follows: the GOP length is 100 (75 for miss_am), the number of reference frames is 5, UVLC entropy coding is used, the frame rate is 15 f/s and the target bit rate is 128 kbps. Table Ⅱ shows that both LRC and the proposed method stay close to the target bit rate and achieve a smaller MBEE than JM8.4 for every sequence, markedly so for silent, news, carphone and miss_am. Table Ⅲ shows that the proposed method obtains the smallest PSNR variation over frames for every sequence while its average PSNR is comparable to that of LRC and JM8.4.

For the non-typical sequence, the test conditions are as follows: the GOP length is 150, the number of reference frames is 5, UVLC entropy coding is used and the frame rate is 15 f/s. Table Ⅳ shows that both LRC and the proposed method track the target bit rate closely and obtain a significantly smaller MBEE than JM8.4 for the suize_trevor sequence at every target bit rate. Table Ⅴ shows that, compared with JM8.4 and LRC, the proposed method not only improves the average PSNR but also reduces the PSNR variation for the non-typical sequence suize_trevor. Fig.2 shows that, compared with JM8.4 and LRC, the proposed method significantly improves the quality of the reconstructed pictures at the scene change.

TABLE Ⅱ
COMPARISON OF BIT RATE AND MBEE ACHIEVED BY JM8.4, LRC AND PROPOSED METHOD

Test        Encoded bits (kbps)              MBEE
sequence    JM8.4     LRC       Proposed     JM8.4     LRC       Proposed
container   127.943   127.947   127.910      4.622     4.397     4.359
silent      128.188   127.884   127.906      10.670    5.497     5.321
news        127.242   127.973   127.976      8.662     4.773     4.901
carphone    128.148   127.921   127.967      6.809     4.455     3.950
miss_am     127.978   128.022   127.960      3.048     1.051     1.006

TABLE Ⅲ
COMPARISON OF AVERAGE PSNR FOR JM8.4, LRC AND PROPOSED METHOD

Test        Average PSNR (dB)                Var in PSNR
sequence    JM8.4     LRC       Proposed     JM8.4     LRC       Proposed
container   41.976    42.041    42.033       0.140     0.144     0.139
silent      41.901    41.791    41.788       0.698     0.536     0.498
news        42.376    42.068    42.356       0.793     0.785     0.623
carphone    40.627    40.717    40.725       5.681     5.447     5.412
miss_am     45.925    45.973    45.968       0.134     0.119     0.114

TABLE Ⅳ
COMPARISON OF BIT RATE AND MBEE BY JM8.4, LRC AND PROPOSED METHOD FOR SUIZE-TREVOR SEQUENCE

Test           Target     Encoded bits (kbps)              MBEE
sequence       bitrate    JM8.4     LRC       Proposed     JM8.4     LRC       Proposed
Suize-trevor   192kbps    191.502   191.920   191.959      16.143    13.399    9.818
               128kbps    128.160   127.962   127.988      23.223    15.088    10.761
               64kbps     64.146    63.965    64.062       36.967    21.895    15.560
               32kbps     32.141    31.986    32.198       68.010    33.100    24.695

TABLE Ⅴ
COMPARISON OF AVERAGE PSNR FOR JM8.4, LRC AND LRC+PID FOR SUIZE-TREVOR


SEQUENCE

Test           Target     Average PSNR (dB)                Var in PSNR
sequence       bitrate    JM8.4     LRC       Proposed     JM8.4     LRC       Proposed
Suize-trevor   192kbps    42.186    42.194    42.207       1.876     1.837     1.783
               128kbps    39.894    39.857    39.920       2.638     2.525     2.490
               64kbps     64.146    63.965    64.062       3.365     3.332     3.322
               32kbps     32.141    31.986    32.198       3.728     3.512     3.307

Fig.2 Frame by frame PSNR comparison for sequence "suize_trevor" at the target bit rate of 128kbps (PSNR in dB versus frame number; curves: proposed method, LRC, JM8.4).

6. Conclusions

We have proposed a new unit layer rate control method based on the linear rate model and the PID controller. A scene-change handling method is used in the PID controller to deal with scene changes. The frame layer bit allocation is realized by the PID controller. The unit layer bit allocation is obtained from the distribution of transform coefficients produced by the pre-analysis process, which is based on the INTER_16×16 mode. The target quantization parameter is then calculated with the linear rate model and used to code all macroblocks in the current unit.

Experimental results show that the proposed method, like LRC, attains the target bit rate accurately with a smaller bit rate estimation error than JM8.4. It also obtains a smaller PSNR variation than JM8.4, so temporal picture quality improves while spatial picture quality stays high, and the quality of the reconstructed pictures at scene changes improves.

References

[1] T. Wiegand, G. J. Sullivan, G. Bjøntegaard and A. Luthra, Overview of the H.264/AVC video coding standard, IEEE Trans. Circuits System. Video Technology, vol.13, pp.560-576, July 2003.
[2] Z. G. Li, F. Pan, K. P. Lim, G. N. Feng, X. Lin and S. Rahardja, Adaptive basic unit layer rate control for JVT, JVT-G012, 7th meeting, Pattaya II, Thailand, 7-14, March, 2003.
[3] Z. G. Li, W. Gao, F. Pan, S. W. Ma, K. P. Lim, G. N. Feng, X. Lin, S. Rahardja, Y. Lu and H. Q. Lu, Adaptive rate control with HRD consideration, JVT-H014, 8th meeting, Geneva, 20-26, May, 2003.
[4] Proposed draft of adaptive rate control, JVT-H017, 8th meeting, Geneva, 20-26, May, 2003.
[5] Z. He, Y. K. Kim and S. K. Mitra, Low-delay rate control for DCT video coding via ρ-domain source modeling, IEEE Trans. Circuits System. Video Technology, vol.11, pp.928-940, August 2001.
[6] Z. He and S. K. Mitra, A unified rate-distortion analysis framework for transform coding, IEEE Trans. Circuits System. Video Technology, vol.11, pp.1221-1236, December 2001.
[7] Y. Sun and I. Ahmad, A robust and adaptive rate control algorithm for object-based video coding, IEEE Transaction on Circuits and Systems for Video Technology, 2004, 14(10): 1167-1182.
[8] C. W. Wong, O. C. Au and H. K. Lam, PID-based real-time rate control, IEEE International Conference on Multimedia and Expo, 2004, 1: 221-224.
[9] I. H. Shin, Y. L. Lee and H. W. Park, Rate control using linear rate-ρ model for H.264, Signal Process.: Image Communication 19 (2004) 341-352.
[10] Z. He and T. Chen, Linear rate control for JVT video coding, Information Technology: Research and Education, 2003. Proceedings. pp.65-68, 11-13, Aug 2003.
[11] H. S. Malvar, A. Hallapuro, M. Karczewicz and L. Kerofsky, Low-complexity transform and quantization in H.264/AVC, IEEE Trans. Circuits System. Video Technology, vol.13, pp.598-603, July, 2003.
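
The ρ computation of Eq. (15) and the quantization parameter selection of Section 4.E (Case 3) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical. Coefficients are counted directly rather than through a stored histogram D(x). The search for the smallest QP whose ρ reaches the target is an assumed way of inverting Eq. (15). The last row of `M_TABLE` (QP mod 6 = 5) does not appear in Table I and is taken from the H.264 reference quantizer.

```python
# Columns: positions (0,0),(0,2),(2,0),(2,2) | (1,1),(1,3),(3,1),(3,3) | others.
# Rows are indexed by QP % 6, since M(i, j) repeats when QP increases by 6.
M_TABLE = [
    (13107, 5243, 8066),
    (11916, 4660, 7490),
    (10082, 4194, 6554),
    (9362, 3647, 5825),
    (8192, 3355, 5243),
    (7282, 2893, 4559),  # not listed in Table I; H.264 reference value
]


def m_value(qp: int, i: int, j: int) -> int:
    """Look up M(i, j) for a coefficient position in a 4x4 block."""
    if i % 2 == 0 and j % 2 == 0:
        col = 0
    elif i % 2 == 1 and j % 2 == 1:
        col = 1
    else:
        col = 2
    return M_TABLE[qp % 6][col]


def zero_threshold(qp: int, i: int, j: int, lam: float = 1 / 6) -> float:
    """Largest |x| that quantizes to zero, the bound used in Eq. (15)."""
    qbits = 15 + qp // 6  # Eq. (14)
    return (1 - lam) * 2 ** qbits / m_value(qp, i, j)


def rho(blocks, qp: int, lam: float = 1 / 6) -> float:
    """Fraction of coefficients quantized to zero at this QP, Eq. (15).

    `blocks` is a list of 4x4 blocks (nested lists) of transform
    coefficients collected by the pre-analysis pass.
    """
    total = zeros = 0
    for block in blocks:
        for i in range(4):
            for j in range(4):
                total += 1
                if abs(block[i][j]) <= zero_threshold(qp, i, j, lam):
                    zeros += 1
    return zeros / total


def target_rho(r0: float, r1: float, rho0: float) -> float:
    """Eq. (16): target rho for bit budget r1, given r0 bits at rho0."""
    return (r0 - r1 * (1 - rho0)) / r0


def select_qp(blocks, rho_target: float, prev_qp: int,
              lam: float = 1 / 6) -> int:
    """Pick the smallest QP reaching rho_target, then clip as in Eq. (22)."""
    # rho is non-decreasing in QP, so the first match is the smallest QP.
    qp = next((q for q in range(52) if rho(blocks, q, lam) >= rho_target), 51)
    return max(prev_qp - 1, min(qp, prev_qp + 1))
```

In terms of Eq. (21), `target_rho` would be called with the pre-analysis texture bits as `r0`, the target texture bits as `r1` and the pre-analysis ρ as `rho0`. The default `lam = 1/6` corresponds to the inter mode.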
