
CPE 556 Term paper

Chenxi Yang

cyang13@stevens.edu

10387658

Danger without notice: Passive content leaks and pollution in Android applications
Chenxi Yang
Department of Electrical and Computer Engineering
Stevens Institute of Technology

1. Introduction
Android smartphones are becoming increasingly popular, with rapid growth in both users and
available apps. According to a report from Gartner, Android already holds first place in the
smartphone market, with more than 50% of market share.[1] Because of the open-source
development environment and the many app markets besides the official Google Play, more
than 1 million apps were available for users to download and install by July 2013. [5]
Because people use apps to handle their private information and system security settings,
researchers and developers have focused on protecting private information and the Android
system ever since Android apps first appeared. Most of the malicious apps or in-app functions
studied before leak or pollute information actively, which means that these apps try to steal or
leak users' private information and change system settings on purpose. However, Yajin Zhou and
Xuxian Jiang, two researchers from the CS department of North Carolina State University, have
studied a different type of danger, showing that passive content leaks and pollution are also a
significant problem that we should deal with [1].
In their paper, they provide a systematic study of two types of vulnerabilities. The first is
called passive content leak and the second is called content pollution. The biggest difference
between them is that data can be changed or modified in the second, while the first only exposes
private information without modifying it. On the other hand, the two problems share the same
cause. Both arise in the same Android component, the content provider. A content provider is the
standard interface in Android programming that connects data in one process with code running in
another process [2]. It manages access to a structured set of data, encapsulates the data, and
provides mechanisms for defining data security. Though by definition this component is
responsible for data security, under the default security settings it is accessible to all kinds
of apps. This is the main reason why the vulnerability is so dangerous and affects such a large
number of apps. Zhou and Jiang's research indicates that more than 2% of the 62519 apps they
collected in 2012 suffer from passive content leak and 1.4% of these apps are affected by content
pollution. Popular mobile web browsers, instant messengers and social network apps are high on
the list. Even some security apps have these problems.

2. Research method
In order to analyze vulnerable apps efficiently and systematically, Zhou and Jiang designed a
tool called ContentScope. The tool checks whether a given app exposes public content provider
interfaces (denoted as start functions). If so, ContentScope then locates the Android functions
or routines that actually operate on the internal data (denoted as terminal functions). After
this, the tool performs a path-sensitive data-flow analysis to automatically derive constraints
and inputs for evaluating the app. The inputs are then fed into the app running on a real
smartphone to confirm the vulnerability.
This system design is based on a threat model. The model describes a situation in which a
malicious app is installed on a user's smartphone alongside the vulnerable app. In particular, it
assumes that the malicious app does not request any dangerous permission when launching the
attack. The malicious app does, however, need the internet permission to send stolen data or
leaked information to a remote location. Under this model, a tool with 3 basic steps is designed.
According to Zhou's paper, ContentScope has the architecture shown in Figure 1.

Figure 1 System Architecture[4]

The first step of ContentScope is candidate app selection. The examined apps should be those
that could possibly be vulnerable, which in this case means that the selected app should have an
exploitable content provider interface. In Android programming, developers are required to
declare a content provider in the manifest file. In addition, this component does have some
protection mechanisms. For example, the developer can either set the property named exported to
false in the manifest file or define custom permissions at a protection level under which app
data can only be accessed by apps that have been granted the permission (at the normal level, the
app is not protected). To locate candidate apps effectively, ContentScope first parses each app's
manifest file to determine whether any content provider is defined. If so, two corresponding
attributes are extracted. The first is the exported property and the second is the existence of
any custom permission that controls reads and writes to the content provider. In short, the
system selects apps that define a content provider whose exported property is true, either
explicitly or by default. The system also selects apps whose permissions are at the normal
protection level.
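The selection rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not the ContentScope implementation: it assumes the manifest has already been decoded to plain XML, the sample manifest and its package and provider names are made up, and the default_exported argument is an assumption standing in for the platform default.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"


def android_attr(elem, name, default=None):
    """Read an android:-prefixed attribute from a manifest element."""
    return elem.get("{%s}%s" % (ANDROID_NS, name), default)


def candidate_providers(manifest_xml, default_exported=True):
    """Return {provider name: (read_open, write_open)} for exposed providers.

    default_exported is an assumption standing in for the platform default
    that applies when android:exported is omitted.
    """
    root = ET.fromstring(manifest_xml)
    # Protection level of each custom permission declared by this app;
    # Android treats a missing protectionLevel as "normal".
    levels = {
        android_attr(p, "name"): android_attr(p, "protectionLevel", "normal")
        for p in root.iter("permission")
    }

    def is_open(provider, specific):
        # A specific read/write permission takes precedence over the general one.
        perm = android_attr(provider, specific) or android_attr(provider, "permission")
        # Simplification: a permission that is not declared in this manifest
        # is treated as protecting the provider.
        return perm is None or levels.get(perm) == "normal"

    candidates = {}
    for provider in root.iter("provider"):
        exported = android_attr(provider, "exported")
        exposed = default_exported if exported is None else exported == "true"
        read_open = is_open(provider, "readPermission")
        write_open = is_open(provider, "writePermission")
        if exposed and (read_open or write_open):
            candidates[android_attr(provider, "name")] = (read_open, write_open)
    return candidates


# Hypothetical manifest used only for this illustration.
SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.demo">
  <permission android:name="com.example.demo.READ" />
  <permission android:name="com.example.demo.WRITE"
              android:protectionLevel="signature" />
  <application>
    <provider android:name=".OpenProvider"
              android:authorities="com.example.demo.open" />
    <provider android:name=".HiddenProvider"
              android:authorities="com.example.demo.hidden"
              android:exported="false" />
    <provider android:name=".HalfProvider"
              android:authorities="com.example.demo.half"
              android:readPermission="com.example.demo.READ"
              android:writePermission="com.example.demo.WRITE" />
  </application>
</manifest>
"""

if __name__ == "__main__":
    for name, (read_open, write_open) in candidate_providers(SAMPLE_MANIFEST).items():
        print(name, "read open:", read_open, "write open:", write_open)
```

In this sample, the provider with exported set to false is skipped, the provider without any permission is open for both reading and writing, and the provider whose write permission is at the signature level remains a candidate only for reading.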
After selecting candidate apps, the next step is to check whether these apps really have the
vulnerabilities. This step is called vulnerable app determination and consists of 3 smaller
steps. First, ContentScope needs to identify possible paths (also known as execution paths) from
start functions to the different types of terminal functions. After this step, ContentScope knows
how the public start functions of the given app can be triggered so that they eventually call the
internal functions which manage private information. Then ContentScope tries to generate
appropriate inputs to expose the vulnerability of a particular execution path. To do this, a
control flow graph (CFG) is generated for each function on the path. A solver collects the
constraints along the path and generates inputs that satisfy them, so that execution follows the
path. Since the generated inputs may fail to demonstrate the vulnerability of a specific path,
Zhou and Jiang confirm the vulnerability by feeding the generated inputs to the actual content
provider interfaces on a real smartphone.
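The path identification step can be pictured as a search over a function call graph. The sketch below is a simplified illustration and not the analysis used by ContentScope: the call graph is a hand-written dictionary, the helper names (checkCaller, buildQuery, resolvePath) are hypothetical, and the two terminal functions are merely plausible examples of a database call and a file call.

```python
def find_paths(call_graph, start, terminals, path=None):
    """Depth-first enumeration of call paths from start to any terminal function."""
    path = (path or []) + [start]
    if start in terminals:
        return [path]
    paths = []
    for callee in call_graph.get(start, []):
        if callee not in path:  # do not follow recursion cycles
            paths.extend(find_paths(call_graph, callee, terminals, path))
    return paths


# Hypothetical call graph of one app: caller -> list of callees.
CALL_GRAPH = {
    "ContentProvider.query": ["checkCaller", "buildQuery"],
    "checkCaller": [],
    "buildQuery": ["SQLiteDatabase.query", "ContentProvider.query"],
    "ContentProvider.openFile": ["resolvePath"],
    "resolvePath": ["ParcelFileDescriptor.open"],
}

# Illustrative terminal functions (assumed for this example).
TERMINALS = {"SQLiteDatabase.query", "ParcelFileDescriptor.open"}

if __name__ == "__main__":
    for start in ("ContentProvider.query", "ContentProvider.openFile"):
        for found in find_paths(CALL_GRAPH, start, TERMINALS):
            print(" -> ".join(found))
```

Each path found this way is only a candidate. As described above, the constraints along the path still have to be solved and the resulting input tested on a real device.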
The last step of the ContentScope system is to classify the possible content leak or pollution.
In this step, the system reuses the function call graph and control flow graphs built earlier. It
identifies the type of leak/pollution by detecting the type of content saved in the content
provider before the leak/pollution happens. The system also evaluates possible side effects by
examining the execution path to see whether the content provider is connected to dangerous
functions. For instance, if a content provider stores contact information and has an execution
path from the query() function to an Android API that blocks SMS messages, the system may
conclude that the app will leak user contacts, with the side effect that messages from some
contacts might be blocked. The results of this step are also checked manually to avoid false
positives.

3. Implementation and results


According to Zhou and Jiang's work, a ContentScope prototype was built with Python and Java
code. In more detail, the first two steps, candidate app selection and vulnerable app
determination, were developed in Python, and the vulnerable app classification was written in
Java. For passive content leaks, detection of start functions focuses on two content provider
functions: ContentProvider.query() and ContentProvider.openFile(), which provide access to SQLite
databases and local file data respectively. As a consequence, the possible terminal functions are
limited to 3 SQLite functions and one local directory function. Similarly, to detect content
pollution, insert() and update() of ContentProvider are treated as possible start functions and
their corresponding SQLite database operation functions are the possible terminal functions.
There is, however, a big difference from the passive leak case. The system needs to come up with
correctly formatted data to inject into the private database. To obtain the correct table schema
of the database, ContentScope examines the SQL table-creation statements. In addition, ten
different injection trials are executed to minimize the chance that a value conflict causes the
injection to fail.
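The idea of deriving format-correct data from a table-creation statement and retrying on conflicts can be illustrated with Python's built-in sqlite3 module. This is a sketch under stated assumptions, not the ContentScope code: the blacklist table and its columns are invented, the value generator is deliberately naive, and a local in-memory database stands in for the content provider that a real test would reach through insert().

```python
import sqlite3


def columns_from_create(create_sql):
    """Recover the table name and (column, declared type) pairs from CREATE TABLE."""
    scratch = sqlite3.connect(":memory:")
    scratch.execute(create_sql)
    table = scratch.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchone()[0]
    info = scratch.execute('PRAGMA table_info("%s")' % table).fetchall()
    scratch.close()
    # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
    return table, [(row[1], row[2].upper()) for row in info]


def make_row(columns, trial):
    """Build one type-correct row; the trial number varies the values."""
    row = []
    for name, declared in columns:
        if "INT" in declared:
            row.append(1000 + trial)
        elif any(key in declared for key in ("REAL", "FLOA", "DOUB")):
            row.append(float(trial))
        else:
            row.append("%s_%d" % (name, trial))
    return row


def try_inject(insert, columns, trials=10):
    """Call insert(row) with up to `trials` different rows; return the row that worked."""
    for trial in range(trials):
        row = make_row(columns, trial)
        try:
            insert(row)
            return row
        except sqlite3.IntegrityError:
            continue  # value conflict (for example a duplicate key): vary and retry
    return None


# Hypothetical schema, used only for this illustration.
CREATE_SQL = (
    "CREATE TABLE blacklist "
    "(_id INTEGER PRIMARY KEY, number TEXT UNIQUE, hits INTEGER)"
)


def demo():
    """Inject into a local stand-in database that already holds a conflicting row."""
    target = sqlite3.connect(":memory:")
    target.execute(CREATE_SQL)
    target.execute("INSERT INTO blacklist VALUES (1000, 'existing', 0)")
    table, columns = columns_from_create(CREATE_SQL)
    placeholders = ", ".join("?" for _ in columns)

    def insert(row):
        target.execute("INSERT INTO %s VALUES (%s)" % (table, placeholders), row)

    injected = try_inject(insert, columns)
    count = target.execute("SELECT COUNT(*) FROM blacklist").fetchone()[0]
    target.close()
    return injected, count


if __name__ == "__main__":
    print(demo())
```

In the demo, the first generated row collides with the existing primary key and is rejected, and the second trial succeeds. This is the kind of value conflict that repeated trials are meant to work around.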
As a result, 1279 of the 62519 apps are vulnerable to passive content leak. A summary of the
different categories of leak is shown in Table 1. The table reveals that many popular apps (more
than 500000 installs) may leak private information passively, which means that a huge number of
users are currently affected. The wide range of leaked information types also indicates that this
vulnerability is a significant problem that needs to be fixed quickly.
Table 1 Summary of Passive content leak types

Category                             Number of Apps  Representatives       Number of installs
SMS messages                         268             Pansi SMS             500000-1000000
Contacts                             128             mOffice-Outlook Sync  100000-500000
Private info in IM apps              121             Messenger with you    10000000-50000000
User credentials                     80              GO FB Widget          1000000-5000000
Browser History                      70              Dolphin Browser HD    10000000-50000000
Call logs                            61              Droid Call Filter     100000-500000
Private info in social network apps  27              Sina Weibo            100000-500000

For content pollution, the prototype system detected 871 vulnerable apps. To the researchers'
surprise, several popular security apps are vulnerable to content pollution. An attacker can
easily block SMS messages and phone calls by injecting entries into the blacklists of these
security apps. There is also a serious vulnerability that might allow apps to be downloaded in
the background without any notice.
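As a quick consistency check, the two app counts reported in this section can be converted into shares of the 62519 collected apps and compared with the percentages quoted in the introduction (more than 2% and 1.4%). The short Python calculation below uses only these reported counts; the function name is arbitrary.

```python
TOTAL_APPS = 62519       # apps collected by Zhou and Jiang
LEAK_APPS = 1279         # apps vulnerable to passive content leak
POLLUTION_APPS = 871     # apps vulnerable to content pollution


def share(count, total=TOTAL_APPS):
    """Percentage of the collected apps, rounded to two decimal places."""
    return round(100.0 * count / total, 2)


if __name__ == "__main__":
    print("passive content leak: %.2f%%" % share(LEAK_APPS))
    print("content pollution:    %.2f%%" % share(POLLUTION_APPS))
```

The results, about 2.05% and 1.39%, agree with the rounded figures given earlier.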

4. Discussion and comment


In October 2012, Google released Android 4.2 with API level 17. At this level, the default value
of the exported property of content providers is false instead of true. This change reduces the
risk of passive leak attacks, but apps that target lower API levels still have the vulnerability.
Another question that remains to be answered is whether apps with a custom permission at the
normal protection level still have the vulnerability after Android 4.2. If so, the platform
provider, rather than developers, could take action to solve this problem. The normal level of
protection could be limited to a less dangerous range of permissions. Or the categories of
protection levels could be redesigned to meet the needs of developers. Perhaps developers think
that the dangerous level of protection is too strict and choose the normal level out of
consideration for user experience. Though Zhou and Jiang think that both the platform provider
and developers have a responsibility to solve this problem, I still believe that developers are
strictly limited to the choices the platform gives them. The platform provider should send
security application notes to developers to raise their awareness. The official market and
third-party markets should also introduce security testing procedures to improve safety. Unlike
Android, iOS only supports the official App Store, which has a vetting system that regulates apps
with strict rules. Though it has some minor leaks, this system performs better than any Android
market.[3]
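The effect of the API level 17 change discussed above can be written down as a small rule. The Python sketch below reflects my reading of the platform behavior and should be treated as an assumption: the new default only applies when the device runs API level 17 or higher and the app also targets API level 17 or higher. The example apps are hypothetical.

```python
API_17 = 17  # Android 4.2


def exported_by_default(target_sdk, device_api):
    """True if a provider that omits android:exported is still exported.

    Assumption: the default stays true on older devices and for apps
    that target an API level below 17.
    """
    return device_api < API_17 or target_sdk < API_17


# Hypothetical apps: name -> targeted API level.
APPS = {"legacy_browser": 10, "updated_messenger": 17}

if __name__ == "__main__":
    for name, target_sdk in APPS.items():
        state = "exported" if exported_by_default(target_sdk, device_api=17) else "not exported"
        print("%s (target %d): provider %s by default" % (name, target_sdk, state))
```

Under this rule, an app that was never updated to target API level 17 keeps the old, exposed default even on a new device, which is why the platform change alone does not remove the problem.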
Nowadays, many applications share the same account across different terminals, and many of them
can be operated at the same time. For example, the social network app Sina Weibo can be operated
from a desktop client, a desktop web browser, a cell phone web browser, cell phone apps and
tablet apps. Therefore, the safety of a multi-terminal system now relies on its weakest terminal.
Even if the desktop client is designed perfectly, one minor bug in the cell phone app might
compromise the whole system. This requires developers and engineers to be more careful when
expanding their designs to more platforms.
Another point I want to make is that the default settings of a system should always be the
settings with the best safety level, or in other words the lowest energy level. Just as water
taps are off by default instead of on, systems designed to be robust should take safe default
settings into consideration. Likewise, the default output in the absence of a specific input
should be carefully designed to achieve better performance and stability.

References
[1] Zhou, Yajin, and Xuxian Jiang. "Detecting Passive Content Leaks and Pollution in Android
Applications." Proceedings of the 20th Annual Symposium on Network and Distributed System
Security. 2013.
[2] Android Developers. "Content Providers."
http://developer.android.com/guide/topics/providers/content-providers.html
[3] Egele, Manuel, et al. "PiOS: Detecting Privacy Leaks in iOS Applications." NDSS. 2011.
[4] Zhou, Yajin. "Detecting Passive Content Leaks and Pollution in Android Applications."
Conference slides. 2013. http://yajin.org/papers/ndss13_contentscope_slides.pdf
[5] Wikipedia. "Google Play." http://en.wikipedia.org/wiki/Google_play
