Adaptive frame level rate control


Songchao Tan¹, Junjun Si², Siwei Ma², Shanshe Wang³, Wen Gao²

¹ School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
² Institute of Digital Media, Peking University, Beijing 100871, China
³ School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150080, China

control algorithm for the High Efficiency Video Coding (HEVC) based 3D video (3DV) compression standard. In the proposed scheme, a new initial quantization parameter (QP) decision scheme is provided, and the bit allocation for each view is investigated to smooth bitrate fluctuation and achieve accurate rate control. In addition, a simplified complexity estimation method for the extended views is introduced to reduce computational complexity while improving coding performance. Experimental results on 3DV test sequences demonstrate that the proposed algorithm achieves better R-D performance and more accurate rate control than the benchmark algorithms in HTM10.0. The maximum performance improvement is 12.4%, and the average BD-rate gains for the three views are 5.2%, 6.5% and 6.6%, respectively.

Index Terms: rate control, initial QP, bit allocation, 3D video coding

I. INTRODUCTION

With today's rapid development of 3D acquisition and display technologies such as auto-stereoscopic 3D TV [1] and free-viewpoint video [2], 3D video has attracted increasing interest. To better support various stereoscopic and 3D video applications, the Moving Picture Experts Group (MPEG) started the 3D video (3DV) standardization effort, which is currently continued in the Joint Collaborative Team on 3D Video Coding (JCT-3V). Both H.264/AVC and HEVC [3] are used by JCT-3V to develop 3DV standards. One of them, 3D-HEVC, is based on HEVC and targets coding tools for both texture views and depth views.

The typical 3DV format is stereo-view video, which presents a separate video to each eye at the same time. The disparity between the two videos creates the illusion of depth for the viewer. Because a more comprehensive view of the scene is needed, 3D-HEVC covers the more general case of n-view multi-view video. With the depth-image-based rendering (DIBR) technique, a scene with a large view angle can be obtained from a smaller amount of data [4].

Rate control (RC) is employed in video coding to regulate the bitrate while guaranteeing good video quality. For 3D-HEVC it becomes more complicated, because multiple views are coded and each view contains two kinds of video sequences (i.e., texture and depth map). Two RC algorithms, Unified Rate Quantization (URQ) [5] and R-lambda [6], have been proposed for 3D-HEVC so far. In [7], an inter-view mean absolute difference (MAD) prediction based on the depth map is introduced to improve the accuracy

each view is not suitable for 3D-HEVC: the PSNR performance, the dramatic fluctuation of the generated bits and the bit error are all unsatisfactory. To solve these problems of the existing algorithms, the proposed RC algorithm uses a more suitable bit allocation scheme that takes full account of the particularities of the new video coding standard.

As an important part of the RC algorithm, a new initial quantization parameter (QP) decision scheme is proposed for 3D-HEVC. By taking the characteristics of the video into account, the initial QP decision scheme provides smoother quality and less bitrate fluctuation. At the same time, to achieve higher quality and higher coding efficiency, we improve the R-Q model by introducing inter-view complexity estimation.

The rest of this paper is organized as follows. Section II describes the proposed algorithm: Section II-A presents the initial QP decision, Section II-B investigates the bit allocation for the base and extended views, and Section II-C designs the inter-view complexity estimation for the proposed R-Q model. Section III gives experimental results that demonstrate the efficiency of the proposed RC algorithm. Finally, Section IV concludes the paper.

II. PROPOSED RATE CONTROL ALGORITHM FOR HEVC-3DV

A. The decision of Initial QP

The initial QP decision is important for video coding. A bits-per-pixel (bpp) based initial QP decision is commonly adopted, as given in (1):

$$\mathrm{bpp} = \frac{\mathrm{bitrate}}{w \times h \times f} \tag{1}$$

where bpp is the number of bits per pixel, w and h are the width and height of the sequence, and f is the frame rate of the video.
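As a quick check of (1), the following minimal sketch recomputes the bpp value that Table I lists for the 1920*1088 sequence Poznan_Hall2 at 810.69 kbps. The function name is hypothetical, and the frame rate of 25 fps is an assumption that is not stated in this section.

```python
def bits_per_pixel(bitrate_bps, width, height, frame_rate):
    """Equation (1): bits available per pixel at the given bitrate."""
    if width <= 0 or height <= 0 or frame_rate <= 0:
        raise ValueError("width, height and frame_rate must be positive")
    return bitrate_bps / (width * height * frame_rate)


# 810.69 kbps and 1920x1088 come from Table I; 25 fps is an assumed frame rate.
bpp = bits_per_pixel(810.69e3, 1920, 1088, 25.0)
print(f"bpp = {bpp:.4f}")  # prints "bpp = 0.0155"
```

Under that frame-rate assumption the result agrees with the 0.0155 reported in the table.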

However, the bpp-based initial QP strategy widely used in RC algorithms does not work well in 3D-HEVC, because bpp reflects only the bitrate and ignores the characteristics of the video. For instance, as shown in Table I, the bpp of Poznan_Hall2 when the anchor's initial QP is 25 is close to the bpp of Undo_Dancer when the anchor's initial QP is 35. Consequently, a bpp-based scheme selects an initial QP that is too large for Poznan_Hall2.

In this paper, based on extensive experimental analysis, we propose an improved initial QP decision scheme based on bpp per variance of gradient (BPVG), as follows.

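$$\mathrm{BPVG} = \frac{\mathrm{bpp} \times 10^{5}}{\mathrm{VG}}, \qquad \mathrm{VG} = D(\mathrm{gradient}) \tag{2}$$

where D(·) denotes the variance and the gradient is taken from the first frame.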
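The BPVG measure in (2) can be illustrated with a small sketch. The gradient operator is not specified in this section, so the forward-difference magnitude used below is an assumption, as are the function names and the tiny 3x3 test frames. The mapping from BPVG to an initial QP is not given here and is therefore not sketched.

```python
from statistics import pvariance


def gradient_magnitudes(frame):
    """Per-pixel |dx| + |dy| from forward differences (an assumed gradient operator)."""
    rows, cols = len(frame), len(frame[0])
    mags = []
    for y in range(rows - 1):
        for x in range(cols - 1):
            dx = frame[y][x + 1] - frame[y][x]
            dy = frame[y + 1][x] - frame[y][x]
            mags.append(abs(dx) + abs(dy))
    return mags


def bpvg(bpp, first_frame):
    """Equation (2): bpp scaled by 1e5 and divided by the variance of the gradient."""
    vg = pvariance(gradient_magnitudes(first_frame))
    if vg == 0:
        raise ValueError("variance of gradient is zero (flat frame)")
    return bpp * 1e5 / vg


# Two hypothetical luma frames: one nearly flat, one with a strong detail.
smooth = [[10, 10, 10], [10, 12, 10], [10, 10, 10]]
textured = [[10, 10, 10], [10, 50, 10], [10, 10, 10]]

print(bpvg(0.0155, smooth), bpvg(0.0155, textured))
```

For the same bpp, the textured frame has a larger gradient variance and therefore a smaller BPVG, which is the extra information that a bpp-only decision ignores.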

As shown in Table I, because the characteristics of the video are properly taken into account, the proposed initial QP decision solves the above-mentioned problem well.

TABLE I INITIAL QP FOR DIFFERENT SEQUENCES

| Class | Sequence | Anchor QP | Kbps | bpp | bpp-based initial QP | Proposed initial QP |
|---|---|---|---|---|---|---|
| 1920*1088 | Poznan_Hall2 | 25 | 810.69 | 0.0155 | 35 | 25 |
| 1920*1088 | Poznan_Hall2 | 30 | 344.89 | 0.0066 | 40 | 30 |
| 1920*1088 | Poznan_Hall2 | 35 | 177.24 | 0.0034 | 47 | 35 |
| 1920*1088 | Poznan_Hall2 | 40 | 98.51 | 0.0019 | 49 | 40 |
| 1920*1088 | Undo_Dancer | 25 | 4626.44 | 0.0923 | 25 | 25 |
| 1920*1088 | Undo_Dancer | 30 | 2005.90 | 0.0384 | 30 | 30 |
| 1920*1088 | Undo_Dancer | 35 | 916.22 | 0.0175 | 35 | 35 |
| 1920*1088 | Undo_Dancer | 40 | 426.33 | 0.0082 | 40 | 40 |

B. Bit Allocation

In this section, we first briefly review bit allocation in video coding and then detail the proposed method. Bit allocation plays a key role in video coding. In transform coding, bit allocation distributes the bit budget among different groups of transform coefficients to minimize the overall quantization distortion. Huang and Schultheiss [8] first presented the optimal bit allocation problem. In video coding, optimal bit allocation is combined with the rate control scheme to improve the coding quality.

In [9], bits are allocated to each frame by weighted averaging, and the target bits of an I-slice are multiplied by five because the quality of the I-slice has a great impact on the next few frames. The scheme works well for the Low Delay (LD) configuration, but in the Random Access (RA) configuration its performance and bitrate error are unsatisfactory. Our analysis of the bit allocation shows that the reason is that the scheme is based on the rate GOP [10] (in RA, a rate GOP is composed of eight frames), as

$$T_{gop} = \frac{R}{F} \times N \tag{3}$$

where R is the target bitrate, F is the frame rate of the video and N is the number of frames in the rate GOP. When a frame is an I-slice, its target bits are multiplied by five, but the extra bits consumed by the I-slice are merely absorbed by the overflow/underflow handling strategy instead of being considered explicitly. That explains why the scheme works well only in LD, which has only one I-slice.

Our research finds that an I-slice has a particular property: the PSNR of a frame has no relation to the frames before the I-slice, and there is only one I-slice in a GOP. Based on that, we propose a bit allocation scheme based on the GOP.

As shown in Fig. 1, the target bits for a frame are calculated as in (4),

$$\mathrm{target}_{n,i} = \begin{cases} \mathrm{bpf} \times \mathrm{GOP} \times w_i, & n \in [2, m-1] \\ \mathrm{bpf} \times \mathrm{frame\_last} \times w_i, & n = m \end{cases} \tag{4}$$

$$\mathrm{bpf} = \frac{\mathrm{bitrate}}{\mathrm{framerate}} \tag{5}$$

where $\mathrm{target}_{n,i}$ is the target bits for the i-th frame of the n-th GOP, $w_i$ is the weight of each frame, and frame_last is the number of frames in the last GOP.

When overflow or underflow occurs, the difference between the target bits and the actual bits of a GOP is distributed equally over the GOPs that remain to be coded. At the same time, a video buffer verifier (VBV) operation model is established to smooth quality fluctuation.

C. Proposed R-Q Model

The trade-off between the output bitrate (R) and the quality (D) of compressed video is determined by the quantization step size (Qs), which is indexed by the quantization parameter (QP). The R-Q model has been studied extensively for previous video coding standards such as H.264/AVC and H.265/HEVC. Based on experimental analysis, we propose the R-Q model as

$$R = X / QP \tag{6}$$

where X is the complexity estimation for the current picture and QP is the quantization parameter. X is computed as

$$X = \frac{\sum_{i=0}^{n-1} w_i\,\mathrm{SAD}_i}{\sum_{i=0}^{n} w_i\,\mathrm{SAD}_i} \times R_{n-1} \times QP_{n-1} \tag{7}$$

where n is the current frame number, $QP_{n-1}$ is the quantization parameter of the (n-1)-th frame, $R_{n-1}$ is the actual bits of the (n-1)-th frame, and $w_i$ is the weight of the SAD values of previously encoded frames. The model also uses a constant whose recommended value is 0.6.
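The GOP-level overflow/underflow handling of Section II-B and the inversion of the R-Q model in (6) can be sketched together as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the weighted-SAD ratio of (7) is passed in as a single number (`sad_ratio`), the per-frame weights of (4) and the VBV model are omitted, the clipping range 0 to 51 is the usual 8-bit HEVC QP range rather than something stated in this section, and the numeric inputs are made up.

```python
def bits_per_frame(bitrate_bps, frame_rate):
    """Equation (5): bpf = bitrate / framerate."""
    return bitrate_bps / frame_rate


def rebalance_gop_targets(gop_targets, coded_gop, actual_bits):
    """Spread the overflow/underflow of one coded GOP equally over the GOPs not yet coded."""
    remaining = len(gop_targets) - coded_gop - 1
    updated = list(gop_targets)
    if remaining == 0:
        return updated  # last GOP: no GOP is left to absorb the difference
    share = (gop_targets[coded_gop] - actual_bits) / remaining
    for k in range(coded_gop + 1, len(updated)):
        updated[k] += share
    return updated


def qp_from_rq_model(prev_bits, prev_qp, sad_ratio, target_bits, qp_min=0, qp_max=51):
    """Invert equation (6), R = X / QP, with X scaled from the previous frame as in (7)."""
    complexity = sad_ratio * prev_bits * prev_qp  # X
    qp = round(complexity / target_bits)
    return max(qp_min, min(qp_max, qp))


bpf = bits_per_frame(1_000_000, 25)                   # 40000 bits per frame
targets = [bpf * 8] * 4                               # four GOPs of eight frames each
targets = rebalance_gop_targets(targets, 0, 350_000)  # first GOP overshot by 30000 bits
qp = qp_from_rq_model(44_000, 30, 1.0, bpf)
print(targets, qp)
```

In this example the 30000-bit overshoot of the first GOP lowers each of the three remaining GOP targets by 10000 bits, so the total budget is preserved, and the previous frame's 44000 bits at QP 30 lead to QP 33 for a 40000-bit target.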

Fig. 2 shows the relationship between the extended views and the base view, and there is a strong correlation between their complexities. According to experimental analysis, the complexity of a frame in an extended view can be accurately replaced, after appropriate scaling, by that of the base-view frame that has just been coded in the same GOP. Note that I-slices must be considered separately, because the extended views have no I-slice. Based on statistical analysis, we propose a linear model to estimate the complexity of the dependent views, as in (8),

$$\mathrm{SAD}_{\mathrm{view}\,n} = \mathrm{SAD}_{\mathrm{view}\,0} \times \frac{\mathrm{target}_n}{\mathrm{target}_0}, \quad n = 1, 2 \tag{8}$$

where $\mathrm{target}_n$ is the target bits of view n.

Fig. 2 The complexity (SAD) between view 0, view 1 and view 2 (horizontal axis: frame number)

III. EXPERIMENTAL RESULTS

The experiments are conducted under a 5-view scenario, in which 3 views are coded and 2 virtual views are synthesized. The test sequences are Kendo, Balloons, Newspaper, GT_Fly, Poznan_Hall2, Poznan_Street, Undo_Dancer and Shark. The texture and depth maps of the three views are encoded separately with the HTM10.0 encoder, coded as I-view, P-view and P-view, respectively. Inter-view prediction is used for P-frames. 300 frames are encoded for each view.

To evaluate the performance of the proposed RC algorithm, the R-lambda algorithm proposed in [6] is used for comparison. First, the R-D curves of the proposed algorithm and R-lambda are shown in Fig. 3. Then the performance compared with the RC algorithms in the latest reference software HTM10.0 is shown in Table II.

To evaluate the accuracy of the achieved bitrate, the following measure is adopted:

$$\mathrm{Error} = \frac{|R_{actual} - R_{target}|}{R_{target}} \times 100\% \tag{9}$$

where Error is the bit error, $R_{actual}$ is the total bits used to encode the depth maps and texture maps of the three views, and $R_{target}$ is the corresponding target bits.

The errors of the proposed, URQ and R-lambda models are presented in Table II. The proposed RC scheme is accurate: its mismatch is less than 1.0% on average.

Fig. 3 shows the R-D curves of the two algorithms for the sequences Poznan_Hall2 and GT_Fly. The proposed scheme achieves a larger PSNR value than the R-lambda scheme at each target bitrate in each of the four plotted cases.

Fig. 3 The rate distortion curves of the proposed rate control scheme as well as the rate control in reference software HTM10.0 (R-lambda). Panels: Poznan_Hall2 (view 0), Poznan_Hall2 (view 1), GT_Fly (view 0), GT_Fly (view 1); vertical axis: PSNR (dB).
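The linear inter-view complexity model in (8) and the bitrate error measure in (9) are both simple enough to sketch directly. The function names and all numeric inputs below are hypothetical; only the two formulas are taken from the text.

```python
def extended_view_sad(base_view_sad, target_base, target_extended):
    """Equation (8): scale the base-view SAD by the ratio of the views' target bits."""
    if target_base <= 0:
        raise ValueError("target_base must be positive")
    return base_view_sad * target_extended / target_base


def bitrate_error_percent(actual_bits, target_bits):
    """Equation (9): absolute bitrate mismatch as a percentage of the target."""
    if target_bits <= 0:
        raise ValueError("target_bits must be positive")
    return abs(actual_bits - target_bits) / target_bits * 100.0


# Hypothetical numbers: the extended view receives a quarter of the base view's bits.
sad_view1 = extended_view_sad(20_000, target_base=400_000, target_extended=100_000)
error = bitrate_error_percent(actual_bits=1_009_200, target_bits=1_000_000)
print(sad_view1, round(error, 2))
```

Because (8) reuses the SAD of the base-view frame that has just been coded, the extended views need no complexity measurement of their own, which is where the reduction in computation comes from.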

TABLE II

| Sequence | Proposed vs R-lambda: Video 0 | Video 1 | Video 2 | Video PSNR | Proposed vs URQ: Video 0 | Video 1 | Video 2 | Video PSNR | URQ bitrate error | R-lambda bitrate error | Proposed bitrate error |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Balloons | -3.2% | -4.5% | -5.2% | -3.9% | -25.3% | -45.0% | -43.4% | -33.0% | 0.59% | 1.47% | 0.26% |
| Kendo | -4.3% | -4.8% | -5.3% | -4.3% | -33.0% | -50.2% | -50.0% | -39.9% | 0.82% | 0.58% | 0.33% |
| Newspaper_CC | -1.8% | -2.0% | -2.8% | -2.5% | -27.6% | -55.2% | -42.2% | -37.2% | 1.83% | 2.61% | 0.49% |
| GT_Fly | -6.8% | -12.5% | -12.3% | -8.4% | -31.1% | -52.4% | -51.8% | -36.4% | 0.64% | 1.46% | 1.58% |
| Poznan_Hall2 | -12.4% | -12.2% | -12.1% | -12.2% | -14.5% | -30.9% | -32.4% | -21.9% | 0.69% | 2.88% | 0.44% |
| Poznan_Street | -3.9% | -5.2% | -4.3% | -4.1% | -33.4% | -48.7% | -69.4% | -44.7% | 0.32% | 6.01% | 3.53% |
| Undo_Dancer | -6.6% | -6.1% | -6.5% | -6.3% | -24.8% | -42.2% | -40.9% | -30.4% | 0.33% | 0.98% | 0.31% |
| Shark | -2.8% | -4.4% | -3.8% | -3.6% | -12.1% | -16.8% | -14.3% | -14.4% | 0.51% | 1.43% | 0.40% |
| 1024*768 | -3.1% | -3.7% | -4.4% | -3.6% | -28.7% | -50.1% | -45.2% | -36.7% | 1.08% | 1.55% | 0.36% |
| 1920*1088 | -6.5% | -8.1% | -7.8% | -6.9% | -23.2% | -38.2% | -41.7% | -29.6% | 0.50% | 2.55% | 1.25% |
| average | -5.2% | -6.5% | -6.6% | -5.7% | -25.2% | -42.7% | -43.0% | -32.3% | 0.72% | 2.18% | 0.92% |
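The summary rows of Table II can be reproduced from the per-sequence values. The sketch below does so for the first column (view 0, proposed versus R-lambda); the grouping of sequences by resolution is inferred from the two class rows and should be read as an assumption.

```python
from statistics import mean

# View 0 results, proposed vs R-lambda, in percent (first column of Table II).
view0_vs_rlambda = {
    "Balloons": -3.2, "Kendo": -4.3, "Newspaper_CC": -1.8,               # assumed 1024*768 class
    "GT_Fly": -6.8, "Poznan_Hall2": -12.4, "Poznan_Street": -3.9,
    "Undo_Dancer": -6.6, "Shark": -2.8,                                  # assumed 1920*1088 class
}
small_class = ["Balloons", "Kendo", "Newspaper_CC"]

overall = mean(view0_vs_rlambda.values())
small = mean(view0_vs_rlambda[name] for name in small_class)
large = mean(v for name, v in view0_vs_rlambda.items() if name not in small_class)
print(round(overall, 1), round(small, 1), round(large, 1))
```

The three means round to the values in the "average", "1024*768" and "1920*1088" rows, and the largest single gain in the column is the 12.4% quoted for Poznan_Hall2.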

In this section, we verify the effectiveness of the bit allocation scheme proposed in Section II-B. Because the quality of URQ is unacceptable compared with the other two algorithms, it is not considered here. For the sequence Kendo with initial QP 25, we compare the generated-bits curves of the proposed RC algorithm and the R-lambda algorithm in HTM10.0. As shown in Fig. 4, the R-lambda model exhibits more severe bit fluctuation, which can cause transmission delay. Thus, for strict real-time applications, the proposed scheme is more suitable than the R-lambda model.

Fig. 4 Generated bits (×10³) per frame for R-lambda and the proposed scheme (horizontal axis: frame number, 150 to 250)

IV. CONCLUSIONS

In this paper, we proposed a frame-level RC algorithm to achieve better quality for 3D-HEVC. Based on our R-Q model and the bit allocation scheme, an R-D optimized RC algorithm is derived. Experiments on different video sequences demonstrate the effectiveness of the proposed algorithm.

ACKNOWLEDGEMENT

This work was supported in part by the National High-tech R&D Program of China (863 Program, 2012AA010805) and the National Science Foundation of China (61322106, 61390515), which are gratefully acknowledged.

REFERENCES

[1] Lipschitz-Hankel type involving products of Bessel functions, Phil. Trans. Roy. Soc. London, vol. A247, pp. 529-551, Apr. 1955.
[2] "3D-HEVC Test Model Description draft 1," JCT on 3D Video Coding Extension Development of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, JCT2-A1005, Sep. 2012.
[3] B. Bross, W.J. Han, J. Ohm, G. Sullivan and T. Wiegand, "High efficiency video coding (HEVC) text specification draft 8," JCTVC-J1003, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, JCTVC-J1003_d2.doc, 10th Meeting: Stockholm, Sweden, 11-20 July 2012.
[4] Siwei Ma, Shiqi Wang, and Wen Gao, "Zero-Synthesis View Difference Aware View Synthesis Optimization For HEVC Based 3D Video Compression," Visual Communication and Image Processing, San Diego, USA, Nov. 2012.
[5] H. Choi, J. Nam, J. Yoo, D. Sim, and I. V. Bajić, "Rate control based on unified RQ model for HEVC," JCT-VC of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, JCT-VC H0213 (m23088), San José, CA, USA, Feb. 2012.
[6] B. Li, H. Li, L. Li, and J. Zhang, "Rate control by R-lambda model for HEVC," ITU-T SG16 Contribution, JCTVC-K0103, Shanghai, pp. 1-5, Oct. 2012.
[7] W. Lim, D. Sim and I. V. Bajić, "JCT3V Improvement of the rate control for 3D multi-view video coding," ISO/IEC JTC1/SC29/WG11, JCT3V-C0090, Geneva, Switzerland, Jan. 2013.
[8] J. J. Huang and P. M. Schultheiss, "Block quantization of correlated Gaussian random variables," IEEE Trans. Commun. Syst., vol. CS-11, pp. 289-296, Sept. 1963.
[9] J. Si, S. Ma, W. Gao, "JCT-VC-Adaptive rate control for HEVC," ISO/IEC JTC1/SC29/WG11, JCTVC-I0433, Geneva, CH, 27 April-7 May 2012.
[10] S. Wang, S. Ma, S. Wang, D. Zhao, and W. Gao, "Rate-GOP Based Rate Control for High Efficiency Video Coding," IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 6, pp. 1101-1111, Dec. 2013.
