Open Access
Subscription Access
The Cost of Privacy: Destruction of Data-Mining in Anonymized Data Publishing
Search engines play a crucial role in navigating the vastness of the web. Today's search engines do not just collect and index web pages; they also collect and mine information about their users. They store the queries, clicks, IP addresses, and other information about their interactions with users in what is called a search log. In this paper, we analyze algorithms for publishing frequent keywords, queries, and clicks of a search log. We first show that methods achieving variants of k-anonymity are vulnerable to active attacks. We then propose an algorithm, ZEALOUS, and show how to set its parameters to achieve probabilistic privacy. We also contrast our analysis of ZEALOUS with an analysis by Korolova et al. [17] that achieves indistinguishability. Our paper concludes with a large experimental study using real applications, in which we compare ZEALOUS with previous work that achieves k-anonymity in search log publishing. Our results show that ZEALOUS yields utility comparable to k-anonymity while achieving much stronger privacy guarantees. ZEALOUS can also be applied more generally to the problem of publishing frequent items or itemsets. A topic of future work is the development of algorithms that release useful information about infrequent keywords, queries, and clicks in a search log while preserving user privacy.
Keywords
Security, Integrity, and Protection, General, Database Management, Information Technology and Systems, Web Search, General, Information Storage and Retrieval, Information Technology and Systems.