Abstract—Virtual Reality (VR) technology, especially 360 VR video, is currently a hot topic thanks to its immersive experience compared to traditional multimedia applications, for example by allowing users to experience a near-real-life panoramic view. Nevertheless, 360 VR video transmission consumes a huge amount of bandwidth, and reducing the delay of 360 VR video streaming is a further problem. To address these two challenges, in this paper we introduce an efficient adaptive 360 VR video streaming method over HTTP/2 that uses the stream priority and stream termination features. To support adaptivity, a 360 VR video is divided into multiple faces, and each face is chunked into temporal segments. The video is also stored at the server at different quality levels. In our method, we focus on the low-delay scenario; thus all faces are downloaded simultaneously using the priority feature of HTTP/2, and the client decodes them on a Group of Pictures (GOP) basis. If the bandwidth suddenly drops, the client breaks the simultaneity and informs the server to push the faces consecutively according to their priorities. In addition, the stream termination feature can be used to terminate the media parts that will miss their deadline. The experimental results show that, when the initial buffer is set to 0.5 s, only 0.08% of the GOPs in the front face miss the deadline in our proposed method, compared to 0.75% and 3.34% for the two reference methods.

Index Terms—VR video streaming, HTTP/2, virtual reality, HTTP adaptive streaming, stream priority, stream termination.

I. INTRODUCTION

Recent years have witnessed a remarkable rise in the commercial progress of VR technology. The market reached USD 1.37 billion in 2015 and is expected to grow to USD 33.90 billion in 2022 [1]. People can now use head-mounted displays (HMDs) such as Samsung Gear VR, HTC Vive, and Facebook Oculus and freely adjust their head orientation to watch panoramic views in 360 VR videos. Based on the user's orientation and the field of view (FoV), those HMDs display the visible area on the screen. The FoV is currently limited to around 90° both horizontally and vertically [2].

A major challenge in VR video streaming is its high bandwidth requirement. 360 VR videos need to be encoded at 4K resolution or higher to provide acceptable viewing quality [3]. The reason is that while the whole video is in 4K, the FoV covers only one-fifth or one-sixth of that resolution, so users effectively watch roughly 720p video content. As a consequence, such streaming requires four to five times the bandwidth of traditional streaming [4]. According to Netflix [5], streaming video in 4K resolution requires a download speed of at least 25 Mbps. However, the global mobile network connection speed was only 8.7 Mbps in 2017 and is forecast to be 20.4 Mbps in 2021 [6].

Recently, a new version of the HTTP protocol, called HTTP/2, was published in May 2015 as a higher-performance alternative to HTTP/1.1 [7]. It introduces new features including server push, stream priority, and stream termination. Server push enables the client to receive multiple objects by sending only one request. The stream priority feature allows the client to ask the server to spend more resources on pushing data in one stream than in others. In addition, a stream can be terminated when the client sends an RST_STREAM frame.

Although 360 VR video is receiving much attention in the research community, it is currently streamed by simply sending the tiles of a scene sequentially after a bitrate has been chosen for each of them [14] [15] [17]. Most current approaches are based on HTTP/1 and have not been adequately evaluated in the literature. Among existing studies, our previous study [9] is the first to exploit the new features of HTTP/2. In that study, the tiles in the same region are downloaded sequentially, so the client needs to wait for all of those tiles before rendering the whole region, which results in a high initial delay.

In this paper, a new adaptive streaming method over HTTP/2 is proposed to tackle the aforementioned issues. We focus on improving the quality of experience (QoE). The QoE can be estimated from the initial delay, the mean rebuffering duration, and the rebuffering frequency [24]. To decrease the initial delay, the tiles are encoded with closed GOPs. Using the stream multiplexing and stream priority features, we ensure that the same-index GOPs of all tiles are received at the client at the same time. We also use the priority and termination features to cope with bandwidth fluctuation in order to minimize the rebuffering frequency.

In our method, the general process for requesting a media part is as follows. First, the client chooses the bitrate for each tile based on its relative position and the FoV. Second, the client sends multiple requests at once to receive the tiles from the server simultaneously. The stream priority is used to control the time at which each tile arrives at the client. This feature enables the client not only to set the expected resource ratio for tiles downloaded concurrently but also to receive one tile only after another has been downloaded completely. In the case of a sudden bandwidth drop, the client can use an RST_STREAM frame to request the server to stop sending the late tiles. Experimental results
in variable bandwidth conditions show that our method is effective, with fewer missed-deadline GOPs than the reference methods when the bandwidth decreases dramatically.

The rest of the paper is organized as follows. In Section II, we give an overview of video streaming over HTTP/2 and the related work. Section III presents our proposed method. The experimental results and discussions are given in Section IV. Finally, Section V concludes this paper.

II. RELATED WORK

A. 360 VR streaming

Recently, tile-based streaming of 360 VR video has been considered a promising solution to the problem of high bandwidth requirement. In this approach, the videos are divided into several tiles using the HEVC/H.265 standard [12] [13]. To support adaptive streaming, every tile is encoded at many quality levels, each of which is split into multiple temporal segments. It should be noted that the tiles covering a panoramic view can be encoded and decoded independently. Fig. 1 illustrates the architecture of a 360 VR tile-based streaming system. Based on the predicted viewport and the bitrate adaptation method, the client sends requests to the server, receives tiles, and stores them in the buffer for decoding.

[Fig. 1: Architecture of a 360 VR tile-based streaming system. Recoverable block labels: Encoding & Packaging; Decoding & Display.]

... strategy, the tiles outside of the viewport are assigned the lowest quality level and the rest the highest possible quality. However, each request downloads only a single tile, which easily causes network overhead because a panoramic view contains a large number of tiles. A live streaming system for omnidirectional video is introduced by Ochi et al. [17], in which two tiles covering the whole view are requested. A so-called interesting tile covering the FoV is requested by the client at a higher quality, whereas an image tile covering the whole panoramic view is automatically pushed to the client at a lower quality. This approach reduces the network overhead by using only one request, but downloading a single interesting tile for the viewport may still consume much of the bandwidth budget. Ben Yahia et al. [25] propose an approach that uses the priority and termination features of HTTP/2 to request video data at the frame level. In their study, the priority feature is used to order the video frames according to the GOP structure. They also increase the priority weights of the most important video frames, which raises the probability that these frames are decoded and rendered on time. In addition, by using the termination feature, the client can terminate frames that are less important or have no chance of arriving on time. However, they do not give a detailed algorithm for termination and priority weight assignment. Moreover, using one HTTP/2 stream to deliver only one video frame might lead to network overhead because many streams need to be opened per second.

In this paper, we focus on the delivery of 360 VR video using the multiplexing, stream priority, and stream termination features of HTTP/2, rather than the viewport prediction and ...
... client or the server is able to terminate the referenced stream. However, the peer that sends the RST_STREAM frame must still be prepared to receive DATA frames that the remote peer sent on the terminated stream before the RST_STREAM frame arrived.

III. PROPOSED METHOD

A. Priority parameters allocation

TABLE I: Notations used in this paper

Symbol                 Description
$SD$                   Segment duration
$GD$                   Total duration of all frames in a GOP
$\beta_{init}$         Initial buffer
$R_i$                  The $i$-th region of the video view
$N_i^R$                Number of tiles in the $i$-th region
$K$                    Number of regions
$\tau_k^j$             The $j$-th tile of the $k$-th region
$w_i$                  Priority weight of the $i$-th region
$r_i$                  Rate of the $i$-th region
$T_i^{est}$            Estimated throughput of the $i$-th segment
$T_i^s$                Smoothed throughput of the $i$-th segment
$T_{inst}$             Instant throughput
$t_{inst}$             Instant download time of the current segment
$D_{inst}$             Instant downloaded data size of the current segment
$d_{GOP}(\tau_k^j)$    GOP data length of tile $\tau_k^j$ of the current segment

In our proposed method, a video view is divided into $K$ regions: $\{R_i \mid i = 0, 1, \ldots, K-1\}$. The tiles in a region have the same bitrate, and the smaller the value of $i$, the higher the importance of the region. The rate allocation algorithm used in this paper is the same as in our previous work [9]. We consider two cases when deciding the estimated throughput: the increase case and the decrease case. In the increase case, when the bandwidth tends to grow, the estimated throughput is calculated by the smoothed throughput method [10] [11] as in (1) to avoid the short-term fluctuations of the network. In the decrease case, to cope with the decrease of the bandwidth, the throughput of the last segment is taken as the estimated throughput.

$$T_i^s = \begin{cases} (1-\gamma)\, T_{i-1}^s + \gamma\, T_i^{avg} & \text{if } i > 0, \\ T_i^{avg} & \text{otherwise.} \end{cases} \tag{1}$$

Algorithm 1 Priority parameters allocation
  for $k = 0 \to K-1$ do
    if $k == 0$ then
      $w_k \leftarrow 256$;
    else
      if $T_{i-1}^{inst} > T_i^{inst}$ and $k == K-1$ then
        $w_k \leftarrow 1$;
      else
        $w_k \leftarrow \lceil 256 \cdot r_k / r_0 \rceil$;
    for $j = 0 \to N_k^R - 1$ do
      set_priority(stream_dependency, $w_k$);
      send a request for tile $\tau_k^j$;

Table I provides some notations used in the following discussion. To guarantee that all the tiles arrive at the client at the same time, we set the priority weight of each HTTP/2 stream of tiles so that the weights of the tiles are proportional to their bitrates. However, in the decrease case, all the tiles of the $(K-1)$-th region are assigned the priority weight of 1. The priority parameters allocation algorithm is presented in Algorithm 1.

The set_priority() function is used to set the priority parameters for each request. Two parameters need to be assigned, the stream dependency and the priority weight, whose meanings are given in Section II. In Algorithm 1, we set the stream dependency of all streams to the stream id of an idle stream that is not used to transport any user data.

B. Reprioritization and Stream termination

When the client is downloading a segment, a sudden bandwidth drop is unpredictable. This may result in rebuffering because many tiles are transported concurrently, and we cannot guarantee that all GOPs will be decoded before their deadlines under poor network conditions. To cope with this problem, the more important tiles are downloaded before the remaining tiles and can use the entire network resources. Thus, even in the worst case, some GOPs of the important tiles can still be received at the client for display. We divide the tiles into two sets, denoted $\Omega_1$ and $\Omega_2$. $\Omega_2$ is the set of tiles that are temporarily frozen until all of the tiles in $\Omega_1$ have been downloaded completely. In the case where $w_{K-1} = 1$, we set the initial value of $\Omega_2$ to the whole $(K-1)$-th region. While a segment is being transported, a tile in $\Omega_1$ can be moved to $\Omega_2$ if the condition $|\Omega_1| \ge N_0^R$ still holds after the move. For each moved tile, the client sends a PRIORITY frame with the stream dependency set to the stream id of any tile in $R_0$.

Algorithm 2 Change stream dependency
  $T_{inst} \leftarrow D_{inst} / t_{inst}$;
  while true do
    if $|\Omega_1| \le N_0^R$ then
      return;
    $N_{miss} \leftarrow 0$;
    for each $\tau_k^j \in \Omega_1$ do
      for $g' = g + 1 \to \lceil SD / GD \rceil$ do
        calculate $t^e_{GOP}(i, g', \tau_k^j)$;
        if $t^e_{GOP}(i, g', \tau_k^j) > t^d_{GOP}(i, g', \tau_k^j)$ then
          $N_{miss} \leftarrow N_{miss} + 1$;
    if $N_{miss} > \alpha \times |\Omega_1| \times \lceil SD / GD \rceil$ then
      select $\tau_k^j$;
      $\Omega_1 \leftarrow \Omega_1 \setminus \{\tau_k^j\}$;
      $\Omega_2 \leftarrow \Omega_2 \cup \{\tau_k^j\}$;
      send a PRIORITY frame for tile $\tau_k^j$;
    else
      return;
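As a rough illustration of (1) and Algorithm 1, the following Python sketch computes the smoothed throughput and the per-region priority weights. The function names, the `bandwidth_decreasing` flag (standing for the decrease case), and the clamping of weights to the HTTP/2 range of 1 to 256 are assumptions made for this example, not part of the method as specified above.

```python
import math

MAX_WEIGHT = 256  # largest HTTP/2 priority weight; the valid range is 1..256


def smoothed_throughput(prev_smoothed, avg_throughput, gamma):
    """Eq. (1): exponential smoothing of the per-segment throughput.

    prev_smoothed is the smoothed throughput of the previous segment,
    or None for the first segment (i == 0).
    """
    if prev_smoothed is None:
        return avg_throughput
    return (1 - gamma) * prev_smoothed + gamma * avg_throughput


def allocate_weights(region_rates, bandwidth_decreasing):
    """Algorithm 1: one priority weight per region.

    region_rates[k] is r_k, with region 0 the most important region.
    """
    if not region_rates or region_rates[0] <= 0:
        raise ValueError("region 0 must exist and have a positive rate")
    last = len(region_rates) - 1
    weights = []
    for k, rate in enumerate(region_rates):
        if k == 0:
            weight = MAX_WEIGHT
        elif bandwidth_decreasing and k == last:
            weight = 1  # least important region gets the minimum share
        else:
            weight = math.ceil(MAX_WEIGHT * rate / region_rates[0])
        # Assumption: clamp so the value is always a legal HTTP/2 weight.
        weights.append(max(1, min(MAX_WEIGHT, weight)))
    return weights
```

For hypothetical region rates of 4000, 2000, and 1000, `allocate_weights` yields the weights 256, 128, and 64 in the increase case and 256, 128, and 1 in the decrease case.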
We denote $t^a_{GOP}(i, g, \tau_k^j)$ and $t^d_{GOP}(i, g, \tau_k^j)$ as the arrival time and the display deadline of the $g$-th GOP of tile $\tau_k^j$ of the $i$-th segment, respectively. A GOP is decoded and displayed only if

$$t^a_{GOP}(i, g, \tau_k^j) \le t^d_{GOP}(i, g, \tau_k^j), \tag{2}$$

with

$$t^d_{GOP}(i, g, \tau_k^j) = \beta_{init} + SD \cdot i + GD \cdot g. \tag{3}$$
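Conditions (2) and (3) translate directly into a small check. The sketch below is illustrative only: the function and parameter names are assumptions, all times are taken to be in seconds, and segment and GOP indices are assumed to start at 0.

```python
def display_deadline(beta_init, segment_duration, gop_duration, i, g):
    """Eq. (3): display deadline of the g-th GOP of the i-th segment."""
    return beta_init + segment_duration * i + gop_duration * g


def is_on_time(arrival_time, beta_init, segment_duration, gop_duration, i, g):
    """Eq. (2): a GOP is decoded and displayed only if it arrives by its deadline."""
    deadline = display_deadline(beta_init, segment_duration, gop_duration, i, g)
    return arrival_time <= deadline
```

With an initial buffer of 0.5 s, a segment duration of 1 s, and a GOP duration of 0.25 s (hypothetical values), GOP 3 of segment 2 has a deadline of 3.25 s.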
After receiving the $g$-th GOP of the $i$-th segment completely, the ...

[Fig. 4: The number of GOPs in each face for the three methods. Recoverable panels: (b) priority method; (c) priority & termination method. Horizontal axis: Time (s).]

Fig. 5 compares the number of GOPs in face 0 received by the client in the three cases. When the bandwidth falls, the number of GOPs decreases significantly (from 7 GOPs at 71 s to 0 GOPs at 75 s) if the client does not use the priority feature. In contrast, the other two methods give better and similar results: the number stays nearly unchanged at around 8 GOPs and decreases to 4 GOPs at 75 s. In addition, in the priority method, there is no GOP of face 0 at three instants: 76.069 s, 76.264 s, and 76.431 s. And there ...

[Fig. 6: The average ratio of the number of GOPs in the other faces to that in the front face. Horizontal axis: Time (s).]

For accuracy, each method is run 10 times and the result is the average value. Fig. 6 illustrates the average ratio between the number of GOPs of faces 1 to 5 and the number of GOPs of face 0 at the client, counted at each display time. It should be noted that we only consider the cases where the number of GOPs in face 0 is greater than 0. In general, this ratio in the none priority method is higher than in the others and almost always exceeds 200%; in particular, the ratio for face 5 is around 315%, which wastes bandwidth. In contrast, the two methods using the priority feature obtain better results, with ratios of around 100% for faces 2 to 4 and 70% for face 5. Therefore, the client can decode and render the GOPs before the deadline and does not waste bandwidth on downloading unimportant GOPs.

In this paper, we also consider the effect of the initial buffer on the results of our method. Experiments with the three methods are also performed for three different initial buffer values: 0.5 s, 1 s, and 2 s. Fig. 7 shows the ratio between the number of missed GOPs of face 0 and the total number of GOPs of the whole video. It is clear that the termination and priority features bring impressive results. The missed-GOP ratio of the none priority method is much higher than that of the other methods. In the priority method, these ratios are 0.06%, 0.42%, and 0.75% for initial buffers of 0.5 s, 1 s, and 2 s, respectively. In the priority & termination method, this ratio is approximately 0
for all initial buffer values. This advantage is crucial in the low-delay context.

Fig. 7: The percentage of missed-deadline GOPs in the front face for the three methods with different initial buffer values

V. CONCLUSION

In this paper, we have proposed a new tile-based streaming method for 360 VR video in which tiles are downloaded concurrently, which enables the client to decode and display video as long as at least a certain number of GOPs of all tiles have been received. In addition, a streaming system with a low buffer helps to minimize the initial delay when the priority and stream termination features of HTTP/2 are deployed in the client. Assigning the priority weight of each tile based on its bitrate enables the GOPs of all tiles to be received and decoded concurrently. To cope with bandwidth fluctuation while streaming, the stream dependency can be changed by sending PRIORITY frames, and GOPs that are highly likely to miss their deadlines are terminated when the client sends RST_STREAM frames. The experimental results indicate that our proposed method significantly reduces the number of missed-deadline GOPs of the tiles covering the viewport and balances the proportion of GOPs of the tiles held in the buffer.

REFERENCES

[1] marketsandmarkets.com (2017), "Virtual Reality Market by Component (Hardware and Software), Technology (Non-Immersive, Semi- & Fully Immersive), Device Type (Head-Mounted Display, Gesture Control Device), Application and Geography - Global Forecast to 2022," Available at https://www.marketsandmarkets.com/Market-Reports/reality-applications-market-458.html.
[2] H. T. T. Tran et al., "A Subjective Study on QoE of 360 Video for VR Communication," in Proc. IEEE MMSP 2017, Luton, U.K., Oct. 2017.
[3] Visbit Inc. (2016), "Virtual Reality (VR) and 360 Videos 101 - A Beginner's Guide," Available at https://medium.com/visbit/virtual-reality-vr-and-360-videos-101-a-beginners-guide-70bbade8e39.
[4] S. Hollister, "YouTube's ready to blow your mind with 360-degree videos," Available at https://gizmodo.com/youtubes-ready-to-blow-your-mind-with-360-degree-videos-1690989402.
[5] Help Center (2017), "Internet Connection Speed Recommendations," Available at https://help.netflix.com/en/node/306.
[6] Statista (2018), "Average global mobile network connection speeds from 2016 to 2021 (in Mbps)," Available at https://www.statista.com/statistics/371894/average-speed-global-mobile-connection/.
[7] M. Belshe, R. Peon, and M. Thomson, "Hypertext Transfer Protocol Version 2 (HTTP/2)," RFC 7540, May 2015.
[8] R. Huysegems, T. Bostoen, P. Rondao-Alface, J. van der Hooft, S. Petrangeli, T. Wauters, and F. D. Turck, "HTTP/2-based methods to improve the live experience of adaptive streaming," in ACM Multimedia Conf. MM, 2015.
[9] Minh Nguyen, Dang H Nguyen, Cuong T Pham, Nam Pham Ngoc, Duc V. Nguyen, and Truong Cong Thang, "An Adaptive Streaming Method of 360 Videos over HTTP/2 Protocol," in NAFOSTED Conference on Information and Computer Science, 2017.
[10] T. C. Thang, Q.-D. Ho, J. W. Kang, and A. Pham, "Adaptive streaming of audiovisual content using MPEG DASH," IEEE Transactions on Consumer Electronics, vol. 58, no. 1, pp. 78-85, February 2012.
[11] S. Akhshabi, S. Narayanaswamy, A. C. Begen, and C. Dovrolis, "An experimental evaluation of rate-adaptive video players over HTTP," Signal Processing: Image Communication, vol. 27, no. 4, pp. 271-287, April 2012.
[12] M. Hosseini, V. Swaminathan, "Adaptive 360 VR video streaming: Divide and conquer!," in Proc. ISM 2016, San Jose, CA, USA, 2016.
[13] P. Rondao Alface, J.-F. Macq, and N. Verzijp, "Interactive omnidirectional video delivery: A bandwidth-effective approach," Bell Labs Technical Journal, vol. 16, no. 4, pp. 135-147, 2012.
[14] S. Petrangeli, F. De Turck, V. Swaminathan, and M. Hosseini, "Improving Virtual Reality Streaming using HTTP/2," in Proc. of the 8th ACM on Multimedia Systems Conference, ACM, 2017.
[15] J. Le Feuvre, C. Concolato, "Tiled-based adaptive streaming using MPEG-DASH," in Proc. of the 7th International Conference on Multimedia Systems, ser. MMSys 16, pp. 41:1-41:3, 2016.
[16] M. Graf, C. Timmerer, and C. Mueller, "Towards Bandwidth Efficient Adaptive Streaming of Omnidirectional Video over HTTP: Design, Implementation, and Evaluation," in Proc. of the 8th ACM on Multimedia Systems Conference, ACM, 2017.
[17] D. Ochi, Y. Kunita, A. Kameda, A. Kojima, S. Iwaki, "Live Streaming System for Omnidirectional Video," in Proc. of IEEE Virtual Reality (VR), 2015.
[18] H. T. Le, T. Vu, N. P. Ngoc, A. T. Pham, and T. C. Thang, "Seamless mobile video streaming over HTTP/2 with gradual quality transitions," IEICE Transactions on Communications, vol. 100, no. 5, pp. 901-909, 2017.
[19] D. V. Nguyen, H. T. Le, P. N. Nam, A. T. Pham, T. C. Thang, "Adaptation method for video streaming over HTTP/2," IEICE Communications Express, vol. 5, no. 3, pp. 69-73, 2016.
[20] D. V. Nguyen, H. T. Le, P. N. Nam, A. T. Pham, and T. C. Thang, "Request adaptation for adaptive streaming over HTTP/2," in Proc. of the IEEE International Conference on Consumer Electronics (ICCE 2016), pp. 189-191, Jan. 2016.
[21] D. V. Nguyen, H. T. Le, P. N. Nam, A. T. Pham, T. C. Thang, "Adaptation method for video streaming over HTTP/2," IEICE Communications Express, vol. 5, no. 3, pp. 69-73, 2016.
[22] V. Paxson, M. Allman, H. J. Chu, and M. Sargent, "Computing TCP's retransmission timer," RFC 6298, 2011.
[23] C. Muller, S. Lederer, and C. Timmerer, "An evaluation of dynamic adaptive streaming over HTTP in vehicular environments," in Proc. ACM MMSys12, North Carolina, Feb. 2012.
[24] Ricky K. P. Mok, Edmond W. W. Chan, and Rocky K. C. Chang, "Measuring the Quality of Experience of HTTP Video Streaming," in IFIP/IEEE International Symposium on Integrated Network Management and Workshops, Dublin, Ireland, May 2011.
[25] Mariem Ben Yahia, Yannick Le Louedec, Loutfi Nuaymi, and Gwendal Simon, "When HTTP/2 Rescues DASH: Video Frame Multiplexing," in IEEE Conference on Computer Communications Workshops, Atlanta, GA, USA, May 2017.
[26] nghttp2, https://github.com/nghttp2/nghttp2.