
PRIVACY PROTECTION IN DATA MINING

ABSTRACT

In today's digital age, online services such as search engines and social
networking websites collect large amounts of data about their users and their
users' online activities. Large-scale mining and sharing of this data has been a
key driver of innovation and improvement in the quality of these services, but
has also raised major user privacy concerns.

This thesis aims to help companies find ways to mine and share user data for
the purpose of innovation while protecting their users' privacy. To achieve
this, we explore examples of privacy violations, propose privacy-preserving
algorithms, and analyze the trade-offs between utility and privacy for
several concrete algorithmic problems in search engine and social networking
domains.

We propose and execute two novel privacy attacks on the advertising
system of a social networking site that lead to breaches of user privacy. The
attacks take advantage of the advertising system's use of users' private profile
data in order to infer private information about users. The proposed attacks
build a case for a need to reason about data sharing and mining practices,
elucidate the privacy and utility trade-offs that may arise in advertising systems
that allow fine-grained targeting based on user profile and activity
characteristics, and have contributed to changes in the social network's
advertising system aimed at increasing the barriers to practical execution of
such attacks in the future.

We propose a practical algorithm for sharing a subset of user search data
consisting of queries and clicks in a provably privacy-preserving manner. The
algorithm protects privacy by limiting the amount of each user's data used and
non-deterministically throwing away infrequent elements in the data, with the
specific parameters of the algorithm being determined by the privacy guarantees
desired. The proposed algorithm and the insights gained from its analysis offer a
systematic and practical approach towards sharing counts of user actions while
satisfying a rigorous privacy definition, and can be applied to improve privacy
in applications that rely on mining and sharing user search data. This thesis aims
to provide tools for achieving a viable balance between user-data-driven
innovation and privacy.
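
The two ingredients of the query-sharing algorithm described above (capping
each user's contribution, then randomly discarding infrequent items) can be
illustrated with a minimal Python sketch. The abstract does not specify the
noise distribution, the parameters, or the exact release rule, so everything
concrete here is an assumption made for illustration: the function names
`laplace_noise` and `release_query_counts`, the use of Laplace noise, and the
parameters `max_per_user`, `noise_scale`, and `threshold`. In the thesis these
values would be derived from the desired privacy guarantee; the sketch simply
takes them as inputs and makes no privacy claim of its own.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_query_counts(user_logs, max_per_user, noise_scale, threshold, seed=None):
    """Return noisy counts for queries whose noisy count exceeds the threshold.

    user_logs maps a user id to that user's list of queries.
    All parameter names and the Laplace noise choice are illustrative assumptions.
    """
    rng = random.Random(seed)

    # Step 1: limit each user's contribution to at most max_per_user queries.
    counts = Counter()
    for queries in user_logs.values():
        counts.update(queries[:max_per_user])

    # Step 2: randomly discard infrequent queries by comparing a noisy
    # count against the threshold; publish a separately noised count
    # for the queries that survive.
    released = {}
    for query, count in counts.items():
        if count + laplace_noise(noise_scale, rng) > threshold:
            released[query] = count + laplace_noise(noise_scale, rng)
    return released


if __name__ == "__main__":
    logs = {f"user{i}": ["weather", "news"] for i in range(1000)}
    logs["user0"] = ["my rare personal query"] * 100
    print(release_query_counts(logs, max_per_user=2, noise_scale=1.0,
                               threshold=50.0, seed=0))
```

In this toy run, the query issued by only one user is capped to a count of 2
and falls far below the threshold, so it is almost certainly dropped, while the
queries issued by many users are released with perturbed counts. Because the
discard decision depends on random noise, a query near the threshold may be
released in one run and dropped in another, which is the non-deterministic
behavior the abstract refers to.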
