Professional Documents
Culture Documents
LEVERAGE INTERACTIONS BETWEEN THE CPU AND NETWORK TO DELIVER HIGH MOBILE
Energy (J)
30 1.8
Load time (s)
Power (W)
20
20 1.2
10
10 0.6
0 0 0.0
2009
2010
2011
2012
2013
2014
2009
2010
2011
2012
2013
2014
Year Year
(a) (b)
Figure 1. Average performance, power, and energy consumption trends from 2009 to 2014
for eight of the top websites, ranked according to Alexa. We pick one of the best-performing
smartphones from each year to load all of the eight webpages’ archived images from Internet
Archive. (a) Performance and energy trend. (b) Power trend.
affect the energy efficiency of mobile Web interaction-based approach to adjust the
browsing: the CPU and the network. Con- CPU’s voltage and frequency setting accord-
ventional wisdom suggests that network ing to the operating network latency. Our
capability is the primary bottleneck in mobile findings suggest that designing future high-
Web browsing. A great portion of the performance mobile Web browsers will
research has focused on improving the net- require us to look beyond optimizing indi-
work,1-3 but little work has been done on vidual resource components and take an inte-
understanding how the CPU impacts mobile grated full-system perspective.
Web browsing.
We find that the CPU’s impact on
mobile Web browsing energy efficiency Considering the past and present
strongly depends on the network latency. Our study takes a historical perspective to
Under low network latency, such as in an understand Web performance trends, bottle-
LTE cellular network, the best system energy necks, and energy implications. Here, we
efficiency is achieved under high CPU per- introduce our investigative methodology for
formance, which leads to significantly faster conducting such a study with representative
webpage load time while simultaneously smartphone and Web technology snapshots.
conserving energy. In contrast, under an
adverse connectivity with high network
latency (for example, 3G), the webpage load Smartphones
time is insensitive to the CPU performance. To study evolving CPU performance, we
Low CPU performance makes the system selected six top smartphones, each represent-
energy efficient by significantly saving energy ing cutting-edge smartphone technologies
without degrading Web browsing perfor- for each year from 2009 to 2014. These are
mance. Motorola’s Droid from 2009, and Samsung’s
From these observations, we quantita- Galaxy S, Nexus, and S3, S4, and S5 from
tively demonstrate that existing dynamic 2010 through 2014, respectively. Each
voltage and frequency scaling (DVFS) gover- smartphone houses a CPU that differs in
nors in the OS are inadequate in deciding the microarchitectural design, peak clock fre-
CPU performance, because they do not con- quency, and number of cores. Chronologi-
sider the network latency, and therefore often cally, the six phones reflect how CPU
lead to suboptimal energy-efficiency operat- processing capabilities have progressed over
ing points. We describe a CPU-network time. We used the Monsoon power monitor
.............................................................
JANUARY/FEBRUARY 2015 27
..............................................................................................................................................................................................
MOBILE SYSTEMS
to measure the six smartphones’ battery-level implications of complex webpages and not
energy consumption. just simple mobile webpages.
28 IEEE MICRO
and hold the CPU performance at its highest
frequency while studying the network, to iso- 38
late each component’s effect.
We first focus on the network. Figure 2 32
shows the webpage load time with respect to
JANUARY/FEBRUARY 2015 29
..............................................................................................................................................................................................
MOBILE SYSTEMS
20
2010 - Galaxy S Instead, our data leads us to conclude that
2009 - Motorola Droid
16 under low network-latency conditions, the
CPU plays a vital role in mobile Web brows-
12 ing performance. When the network deviates
from an ideal low-latency network condition,
8
high-latency network access is prevalent, and
4 under such circumstances mobile Web per-
0.0 0.4 0.8 1.2 1.6 2.0 2.4 formance is indeed constrained by network
CPU frequency (GHz) performance.
400 ms
the latency variation issue even further. Here,
1000 ms
3 we show that the network latency can strongly
2000 ms
affect the mobile CPU’s impact on the Web
2 browsing performance. Such an interaction
between CPU and network casts strong
1 energy-efficiency implications.
We use Figure 4 to show how network
0 latency affects the CPU’s impact on webpage
0.4 0.8 1.2 1.6 2.0 2.4
load time. We continue to use the Galaxy S5
CPU frequency (GHz) with Wi-Fi for our experiments and man-
ually inject delay into the Web server, as we
Figure 4. Average webpage load times with respect to CPU frequencies, explained earlier. Each line represents a par-
operating at different network RTTs. Each line corresponds to a unique RTT ticular network latency, ranging from 10 to
value, and the markers correspond to different CPU frequencies. The 2,000 ms. The y-axis shows the webpage load
webpage load time at different frequencies under a particular RTT is time normalized to the best performance
normalized to the lowest load time under that RTT. (that is, using the highest frequency) of each
particular network latency.
When we compare the different RTT
Figure 3. Comparing the peak performance trend lines, at higher network latency, the
of the Motorola Droid from 2009 and the S5 corresponding line has a sharper curve, indi-
from 2014, six years of CPU innovation can cating that the CPU performance has a larger
decrease the webpage load time by 6, from impact on webpage load time. For example,
more than 24 seconds to 4 seconds. at an adverse 3G network where the latency
The takeaway from our observations is is about 1,000 ms, loading the webpage at
that the continuous improvement to network the lowest frequency (0.4 GHz) leads to only
............................................................
30 IEEE MICRO
100 20
Performance
governor
80 15
Energy (J)
Energy (J)
60 10
Ondemand
40 5 governor
Performance
Ondemand governor
governor
20 0
30 33 36 39 42 45 0 5 10 15 20
Load time (s) Load time (s)
(a) (b)
Figure 5. Load time versus energy consumption at different frequencies (empty markers),
operating under two network RTTs: (a) 2,000 ms and (b) 100 ms. At each RTT, the CPU
frequency decreases from 2.5 to 0.4 GHz from left to right. We also show the load time and
energy consumption of two OS governors: performance and ondemand.
a less than 2 slowdown as compared to run- traditional OS DVFS governors that cur-
ning the highest performance (2.5 GHz). rently do not leverage the interplay between
In contrast, under a 100-ms latency, typi- the CPU and network.
cal for LTE, the lowest frequency leads to
almost 4 slowdown in performance. On
the other hand, as network latency increases, Energy-efficiency optimization opportunities
the corresponding curve becomes flatter, Because the CPU’s impact on Web brows-
indicating that webpage load time becomes ing performance varies with different net-
less sensitive to CPU performance. For work latencies, studying the CPU’s energy
instance, Figure 4 shows that CPU perform- consumption impact also requires us to con-
ance has little impact on webpage load time sider various network latencies. For simplicity
at a 2-second network latency. without loss of generality, we consider two
network scenarios: high network latency with
RTT of 2,000 ms, such as with adverse 3G
The Mobile CPU’s energy-efficiency connectivity; and low network latency with
implications RTTof 100 ms, such as with good LTE cellu-
We study how the CPU impacts the lar connectivity. Figures 5a and 5b show the
energy consumption of mobile Web brows- total device energy consumption against vari-
ing, and how such an impact varies with dif- ous CPU frequencies under the two network
ferent network latencies. Such interactions latencies, respectively. Within each network
between the CPU and network shed light on latency, each empty marker represents a
energy-efficiency optimization opportunities. CPU frequency that decreases from 2.5 to
On the basis of these observations, we 0.4 GHz from left to right.
describe an approach to coordinate mobile Under adverse network connectivity (such
CPU performance with network latency to as 2,000 ms latency), webpage load time is
optimize Web browsing energy efficiency extremely insensitive to CPU performance
(that is, saving energy without compromising (see Figure 4). Therefore, there is an opportu-
performance). Our analysis, based on oracle nity to exploit slack, meaning that during
knowledge, suggests that such an approach slow network performance there is no need
can provide better energy efficiency over for high CPU performance. We can lower
.............................................................
JANUARY/FEBRUARY 2015 31
..............................................................................................................................................................................................
MOBILE SYSTEMS
the CPU’s operating frequency, and conse- performance governor by default. The
quently reduce the voltage, and thus reduce ondemand governor attempts to save energy
device power consumption without causing by changing the CPU frequency according to
any noticeable performance impact. Figure CPU use. This governor is commonly adopted
5a shows that using lower frequencies causes for conserving energy.
little performance degradation compared to Under high network latency, Figure 5a
the highest frequency, while achieving at shows that the performance governor
most 70 percent energy savings. biases toward high CPU performance, con-
Under a well-behaved network (such as a suming excessive energy without providing
100-ms network latency), however, the web- much performance gain. Under low network
page load time is sensitive to CPU perform- latency, Figure 5b shows that the ondemand
ance, as Figure 4 shows. As a result, Figure 5b governor biases toward lower frequencies,
shows that using the highest frequency loads and therefore loads webpages 2 slower than
the webpages about 4 faster and consumes the highest frequency while consuming
more than 20 percent less energy than using slightly higher energy.
the lowest frequency. We see the feasibility of new heuristics for
Our results indicate that the mobile CPU future governor designs. For example, it is
casts two implications on Web browsing possible to design a hierarchical governor
energy efficiency. First, under high network that first detects the network latency (RTT)
latency, lowering CPU performance is more when the first network request comes back,
energy efficient for the overall system, which typically takes less than 5 percent of
because lower CPU performance can achieve the webpage load time, and selects the
significant energy savings with little perform- performance governor if the RTT is
ance degradation. Second, under low net- below a certain threshold, or the ondemand
work latency, provisioning high CPU governor if the RTT is above the threshold. A
performance is more energy efficient for Web more complex approach is to train offline
browsing because it avoids excessively long models that predict the webpage load time
webpage load time and therefore consumes and energy consumption under various net-
less energy. work latencies,11 and at runtime select the
ideal frequency that minimizes the energy
Current OS governors consumption without exceeding a certain
Current mobile operating systems do not load time threshold.
consider network connectivity, and therefore
do not always choose the ideal CPU perform-
ance for energy management. We measure
the webpage load time and energy consump-
tion of two OS DVFS governors commonly
A lthough this article focuses on coordi-
nating CPU processing capability with
network performance, the general takeaway
adopted in Android, performance and is that similar coordinated optimization
ondemand, taken from the Android CPU- opportunities can exist that extend beyond
Freq Governor (http://goo.gl/8pkyri), and the CPU and include other major mobile
show the opportunity to improve them. components as well. Mobile systems on chips
Comprehensively evaluating a new OS (SoCs) offer several other computation
DVFS governor is beyond the scope of this resources that can be adapted to network
article, especially because our focus is to char- connectivity. For instance, many heterogene-
acterize the CPU’s role in enabling energy- ous SoCs (such as Samsung’s Exynos Octa)
efficient mobile Web browsing. contain little cores that have lower perform-
The solid markers in Figures 5a and 5b ance than main cores but consume less
show the measured results of the two governors energy. Loading webpages on the little core11
under the network latency of 2,000 ms and in observance of high network latency would
100 ms, respectively. The performance likely achieve an even lower energy consump-
governor statically sets the frequency to the tion compared to simply throttling the main
highest, and therefore guarantees applica- core’s operating frequency, which is what we
tion interactivity. The Galaxy S5 uses the demonstrate in this article.
............................................................
32 IEEE MICRO
Moreover, contrary to adapting the CPU’s 10. J. Bixby, “How I Learned that 3G Mobile
performance to network latency, the network Performance is Up to 10 Slower than
latency could also be adapted to the CPU’s Throttled Broadband Service,” Web Per-
performance. Web performance is less sensi- formance Today, 26 Oct. 2011; http://goo
tive to network latency on a slow CPU, and .gl/XuO8Of.
therefore under circumstances where the 11. Y. Zhu and V.J. Reddi, “High-Performance
CPU is running at low performance, or sim- and Energy-Efficient Mobile Web Browsing
ply on a low-end feature phone, it is possible on Big/Little Systems,” Proc. IEEE 19th Int’l
to throttle the network latency, or even switch Symp. High Performance Computer Archi-
to a lower-generation cellular network con- tecture (HPCA 13), 2013, pp. 13–24.
nection to save energy.2
MICRO Yuhao Zhu is a PhD student in the Depart-
ment of Electrical and Computer Engineer-
.................................................................... ing at the University of Texas at Austin. His
References research focuses on improving the energy
1. I. Grigorik, High Performance Browser Net- efficiency of mobile Web computing holisti-
working, O’Reilly, 2013. cally, spanning processor architecture, run-
2. J. Huang et al., “A Close Examination of time systems, and language extensions. Zhu
Performance and Power Characteristics of has a BS in computer science and engineer-
4G LTE Networks,” Proc. 10th Int’l Conf. ing from Beihang University. Contact him
Mobile Systems, Applications, and Services at yzhu@utexas.edu.
(MobiSys 12), 2012, pp. 225–238.
3. J. Huang et al., “An In-Depth Study of LTE: Matthew Halpern is a PhD student in the
Effect of Network Protocol and Application Department of Electrical and Computer
Behavior on Performance,” Proc. ACM SIG- Engineering at the University of Texas at
COMM, 2013, doi:10.1145/2486001.2486006. Austin. His research interests include run-
time systems and computer architecture for
4. Z. Wang et al., “Why Are Web Browsers
mobile computing. Halpern has a BS in
Slow on Smartphones?” Proc. 12th Work-
electrical and computer engineering from
shop Mobile Computing Systems and Appli-
the University of Texas at Austin. Contact
cations, 2011, pp. 91–96.
him at matthalp@utexas.edu.
5. Y. Zhu and V.J. Reddi, “WebCore: Architec-
tural Support for Mobile Web Browsing,” Vijay Janapa Reddi is an assistant professor
Proc. 41st Ann. Int’l Symp. Computer Archi- in the Department of Electrical and Com-
tecture (ISCA 14), 2014, pp. 541–552. puter Engineering at the University of Texas
6. X.S. Wang et al., “Demystifying Page Load at Austin. His research interests include pro-
Performance with WProf,” Proc. 10th USE- cessor architecture and system software
NIX Conf. Networked Systems Design and design and implementation for addressing
Implementation (NSDI 13), 2013, pp. performance, power, thermal, energy, and
473–486. reliability issues for mobile and high-
7. “3G/4G Wireless Network Latency: How performance computing. Janapa Reddi has a
Do Verizon, AT&T, Sprint and T-Mobile PhD in computer science from Harvard
Compare?” Fierce Wireless, 5 Nov. 2013; University. Contact him at vj@ece.utexas
http://goo.gl/L9ieoy. .edu.
8. V. Agarwal et al., “Clock Rate Versus IPC:
The End of the Road for Conventional Micro-
architectures,” Proc. 27th Ann. Int’l Symp.
Computer Architecture (ISCA 00), 2000, pp.
248–259.
9. W3C, Improving Web Performance on
Mobile Browsers, Nov. 2012; www.w3.org
/2012/11/webperf-slides-greenstein.pdf.
.............................................................
JANUARY/FEBRUARY 2015 33