
A Novel Association Rule Hiding Approach in OLAP Data Cubes


Affiliations
1 Computer Engineering Department, Najafabad Branch, Islamic Azad University, Esfahan, Islamic Republic of Iran
 

Data mining services need accurate input data to produce meaningful results, but privacy concerns may lead users to supply false information. With respect to association rule mining, we study whether users can be confident in providing correct information, given assurance that the mining process cannot breach their privacy with any reasonable degree of certainty. A data warehouse stores current and historical records consolidated from multiple transactional systems. Protecting data warehouses is of growing interest, particularly where data are sold in pieces to third parties for data mining studies. In such cases, conventional data warehouse security techniques, such as data access control, may be difficult to enforce and can be ineffective. As an alternative, this paper proposes a data perturbation based approach to privacy preservation in association rule mining on the data cubes of a data warehouse. To conceal sensitive association rules while preserving the utility of the transactions in the data cubes, we use a genetic algorithm to search for the best set of modifications. Various hiding styles are applied through different multi-objective fitness functions, and a Pareto-front ranking strategy is used to obtain the front of non-dominated solutions. The first objective of these functions is to hide the sensitive rules; the second is to preserve the accuracy of the transactions in the data cube. After sanitization, we assess its performance against several criteria. The main feature of the proposed strategy is that it does not affect the functionality of the On-Line Analytical Processing (OLAP) system. Our experimental results show its effectiveness and feasibility.

Keywords

OLAP, Data Cube, Data Mining, Association Rule Hiding
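
The Pareto-front ranking step named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the sample candidates, and the choice to express both objectives as costs to be minimized (sensitive rules still discoverable, cube cells modified) are assumptions made only for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical objective vector for one candidate sanitization:
# (sensitive rules still discoverable, data cube cells modified).
# Both are treated as costs to minimize; this encoding is an assumption.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b everywhere and better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: Sequence[Objectives]) -> List[int]:
    """Return the indices of the points that no other point dominates."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            dominates(q, p) for j, q in enumerate(points) if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Made-up candidates, not results from the paper.
candidates = [(0, 12), (1, 7), (2, 7), (3, 2), (0, 15), (4, 2)]
print(pareto_front(candidates))
```

In a genetic algorithm, individuals on this first front would typically receive the best rank; removing them and repeating the filter yields the subsequent ranks.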



Authors

Mohammad Naderi Dehkordi
Computer Engineering Department, Najafabad Branch, Islamic Azad University, Esfahan, Islamic Republic of Iran







DOI: https://doi.org/10.17485/ijst%2F2013%2Fv6i2%2F30587