
The Cost of Privacy: Destruction of Data-Mining in Anonymized Data Publishing


Affiliations
1 GKM College of Engineering and Technology, Chennai, India
     



Search engines play a crucial role in navigating the vastness of the web. Today's search engines do not just collect and index web pages; they also collect and mine information about their users. They store the queries, clicks, IP addresses, and other information about their interactions with users in what is called a search log. In this paper, we analyze algorithms for publishing the frequent keywords, queries, and clicks of a search log. We first show that methods achieving variants of k-anonymity are vulnerable to active attacks. We then propose an algorithm, ZEALOUS, and show how to set its parameters to achieve probabilistic privacy. We also contrast our analysis of ZEALOUS with an analysis by Korolova et al. [17] that achieves indistinguishability. The paper concludes with a large experimental study using real applications, in which we compare ZEALOUS with previous work that achieves k-anonymity in search log publishing. Our results show that ZEALOUS yields utility comparable to k-anonymity while achieving much stronger privacy guarantees, and that it can be applied more generally to the problem of publishing frequent items or itemsets. A topic for future work is the development of algorithms that release useful information about infrequent keywords, queries, and clicks in a search log while preserving user privacy.
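The abstract does not spell out the steps of ZEALOUS, so the following is only a minimal sketch of the general idea of publishing frequent items with bounded per-user contributions, thresholds, and noise. It assumes a two-threshold scheme with Laplace noise; the function name `publish_frequent_items`, the parameter names `m`, `tau`, `noise_scale`, and `tau_prime`, and the exact inequalities used are illustrative assumptions, not details taken from the paper.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Sample zero-mean Laplace noise as the difference of two exponentials."""
    if scale == 0:
        return 0.0  # no noise; useful for deterministic checks
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def publish_frequent_items(user_items, m, tau, noise_scale, tau_prime, seed=None):
    """Hypothetical sketch: noisy histogram of frequent items in a search log.

    user_items: dict mapping a user id to that user's list of items
                (keywords, queries, or clicks).
    m:          assumed cap on distinct items contributed per user.
    tau:        assumed threshold applied to the true counts.
    noise_scale: assumed scale of the Laplace noise added to each count.
    tau_prime:  assumed threshold applied to the noisy counts.
    """
    rng = random.Random(seed)
    counts = Counter()
    for items in user_items.values():
        # Bound each user's influence: keep at most m distinct items.
        distinct = list(dict.fromkeys(items))[:m]
        counts.update(distinct)

    published = {}
    for item, count in counts.items():
        if count < tau:
            continue  # drop items that are infrequent before noise is added
        noisy = count + laplace_noise(noise_scale, rng)
        if noisy >= tau_prime:
            published[item] = noisy  # only sufficiently large noisy counts survive
    return published


if __name__ == "__main__":
    log = {"u1": ["a", "b", "a"], "u2": ["a", "c"], "u3": ["a", "b"]}
    print(publish_frequent_items(log, m=2, tau=2, noise_scale=1.0, tau_prime=2, seed=0))
```

In this sketch, capping each user at `m` distinct items limits how much any single user can shift the histogram, which is what makes the added noise meaningful. How the thresholds and noise scale would have to be chosen to reach a particular privacy guarantee is the subject of the paper and is not reproduced here.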

Keywords

Security, Integrity, and Protection, General, Database Management, Information Technology and Systems, Web Search, General, Information Storage and Retrieval, Information Technology and Systems.

Authors

S. Priya
GKM College of Engineering and Technology, Chennai, India
A. Priya
GKM College of Engineering and Technology, Chennai, India
V. Divya
GKM College of Engineering and Technology, Chennai, India
