
An Innovative Approach for Privacy Preserving the Identified Sensitive Rules


Affiliations
1 Information Technology Department, Institute of Graduate Studies and Research (IGSR), Alexandria University, Egypt
2 Department of Information Technology, Institute of Graduate Studies & Research (IGSR), Alexandria University, Egypt
3 Center of Excellence, Arab Academy for Science, Technology, and Maritime Transport (AASTMT), Smart Village, Egypt
     



As data mining becomes more pervasive, privacy has become one of the prime concerns of the data mining research community. Releasing a database to peers and rivals increases the risk of disclosing sensitive knowledge. Privacy concerns therefore make companies reluctant or unwilling to share their real data for mutually beneficial collaboration. Researchers have proposed many methodologies to address privacy issues in association rule mining through a sanitization process, which transforms the source database into a released database, called the sanitized database, that conceals sensitive rules. As an undesired consequence, the sanitization process also conceals non-sensitive information; this side effect, called the misses cost, reduces the data utility of the sanitized database. The challenge is to minimize the side effects on the sanitized database so that non-sensitive knowledge can still be mined. This paper proposes an innovative approach to hide the sensitive frequent itemsets that may lead to the production of the sensitive rules selected and identified by the data owner, the "Statistically Significant Strongly Positive Correlation Rules" (SSSPCRH). The proposed novel algorithm, CIIEBE (Computing Impact of Inevitable and Evitable Border Elements), introduces three efficient criteria: one to identify the relevant victim item, one to select the transaction(s) to be evaluated, and one to select the relevant transaction(s) to be modified. These criteria aim for minimal impact on the positive border set rather than considering all non-sensitive itemsets during the sanitization process, so that side effects can be fully avoided or limited to a few that do not harm data utility. A set of existing metrics, together with new ones, is suggested to evaluate effectiveness and efficiency; experimental results demonstrate that the proposed algorithm achieves fewer side effects than the FHSFI, maxcover and spmaxFI algorithms on several real and artificial datasets.

Keywords

Data Mining, Privacy Preserving Data Mining, Sensitive Rules, Data Sanitization, Sensitive Itemset Hiding.
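
To make the terms in the abstract concrete, the following Python sketch shows what hiding a sensitive itemset and measuring the misses cost can look like on a toy database. It is not the CIIEBE algorithm: the victim item is chosen by hand, transactions are modified in list order, and frequent itemsets are found by brute force. The function names (`support`, `frequent_itemsets`, `hide_itemset`, `misses_cost`), the toy transactions and the absolute support threshold are all assumptions made for illustration.

```python
from itertools import combinations


def support(db, itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def frequent_itemsets(db, min_support):
    """Brute-force enumeration of frequent itemsets (toy databases only)."""
    items = sorted(set().union(*db))
    result = set()
    for k in range(1, len(items) + 1):
        level = {
            frozenset(c)
            for c in combinations(items, k)
            if support(db, frozenset(c)) >= min_support
        }
        if not level:  # no frequent k-itemset implies no larger one
            break
        result |= level
    return result


def hide_itemset(db, sensitive, victim, min_support):
    """Remove `victim` from supporting transactions, in list order,
    until the sensitive itemset drops below `min_support`."""
    sanitized = [set(t) for t in db]
    for t in sanitized:
        if support(sanitized, sensitive) < min_support:
            break
        if sensitive <= t:
            t.discard(victim)
    return sanitized


def misses_cost(original, sanitized, sensitive_sets, min_support):
    """Fraction of non-sensitive frequent itemsets lost by sanitization."""
    before = frequent_itemsets(original, min_support)
    after = frequent_itemsets(sanitized, min_support)
    non_sensitive = {
        s for s in before if not any(sens <= s for sens in sensitive_sets)
    }
    lost = non_sensitive - after
    return len(lost) / len(non_sensitive) if non_sensitive else 0.0


# Hypothetical toy database; min_support is an absolute transaction count.
db = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a", "c"}]
min_support = 3
sensitive = frozenset({"a", "b"})

sanitized_a = hide_itemset(db, sensitive, "a", min_support)
sanitized_b = hide_itemset(db, sensitive, "b", min_support)

cost_a = misses_cost(db, sanitized_a, [sensitive], min_support)
cost_b = misses_cost(db, sanitized_b, [sensitive], min_support)
print(cost_a, cost_b)  # 0.0 0.25
```

In this toy case both victim choices hide the sensitive itemset {a, b}, but removing "a" loses no non-sensitive frequent itemset, while removing "b" pushes the itemset {b} below the threshold. This is the kind of difference that victim-item and transaction selection criteria are meant to exploit.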


Authors

Mahmoud M. Ismail
Information Technology Department, Institute of Graduate Studies and Research (IGSR), Alexandria University, Egypt
Shawkat K. Guirguis
Department of Information Technology, Institute of Graduate Studies & Research (IGSR), Alexandria University, Egypt
Mohamed M. Abo Rizka
Center of Excellence, Arab Academy for Science, Technology, and Maritime Transport (AASTMT), Smart Village, Egypt
