
An Estimation of Privacy in Incremental Data Mining


Affiliations
1 Department of Information Technology, Sathyabama University, Chennai, India
2 Department of Computer Science and Engineering, St. Joseph’s College of Engineering, Chennai, India
3 Tata Consultancy Services, Chennai, India
     



Data are values of qualitative or quantitative variables belonging to a set of items. In recent years, advances in hardware technology have led to an increase in the capability to store and record personal data about consumers and individuals. This has led to concerns that the personal data may be misused for a variety of purposes. Data may describe a business transaction, a medical record, bank details, educational details, and so on. The use of technology for data collection and analysis has seen unprecedented growth in the last couple of decades. Such information includes private details that the owner does not want to disclose. Such data are the sources for data mining. Data mining gives us “facts” that are not obvious to human analysts of the data. When such sensitive data are given directly for mining, the security of the individual is seriously affected. The data are therefore modified before being presented for data mining. The difficulty is that the altered data should still produce a similar mining result. This has led to an area called privacy preservation in data mining, which lies at the intersection of data mining and information security. A known problem in this area is that the additional task used to implement privacy degrades the performance of the data mining algorithm, which can result in incorrect mining results. This situation motivated the present paper, which deals with the data metrics that determine the quality of the following existing privacy-preserving algorithms: Correlation-aware Anonymization of High-dimensional Data (CAHD) [1], Privacy-Preserving Outlier Detection Through Random Nonlinear Data Distortion (PRND) [2], Privacy-Preserving Data Aggregation (PPDA) [3], and Privacy-Preserving Incremental Data sets (PRID) [4], which define various methods for implementing privacy in incremental data. Major metrics such as data utility, privacy, and computational time are considered for evaluation, and their detailed performance is discussed.

 


Keywords

Data Mining, Privacy Preservation, Perturbation, Quality Metrics, Anonymization.

Authors

V. Rajalakshmi
Department of Information Technology, Sathyabama University, Chennai, India
G. S. Anandha Mala
Department of Computer Science and Engineering, St. Joseph’s College of Engineering, Chennai, India
R. Balasubramanian
Tata Consultancy Services, Chennai, India
