Open Access Open Access  Restricted Access Subscription Access
Open Access Open Access Open Access  Restricted Access Restricted Access Subscription Access

An Effective Multi-Clustering Anonymization Approach Using Discrete Component Task for Non-Binary High Dimensional Data Spaces


Affiliations
1 Department of Computer Science and Engineering, Manonmaniam Sundaranar University, India
2 Mookambika Technical Campus, India
     

   Subscribe/Renew Journal


Clustering is a process of grouping elements together, designed in such a way that the elements assigned to similar data points in a cluster are more comparable to each other than the remaining data points in a cluster. During clustering certain difficulties related when dealing with high dimensional data are ubiquitous and abundant. Works concentrated using anonymization method for high dimensional data spaces failed to address the problem related to dimensionality reduction during the inclusion of non-binary databases. In this work we study methods for dimensionality reduction for non-binary database. By analyzing the behavior of dimensionality reduction for non-binary database, results in performance improvement with the help of tag based feature. An effective multi-clustering anonymization approach called Discrete Component Task Specific Multi-Clustering (DCTSM) is presented for dimensionality reduction on non-binary database. To start with we present the analysis of attribute in the non-binary database and cluster projection identifies the sparseness degree of dimensions. Additionally with the quantum distribution on multi-cluster dimension, the solution for relevancy of attribute and redundancy on non-binary data spaces is provided resulting in performance improvement on the basis of tag based feature. Multi-clustering tag based feature reduction extracts individual features and are correspondingly replaced by the equivalent feature clusters (i.e.) tag clusters. During training, the DCTSM approach uses multi-clusters instead of individual tag features and then during decoding individual features is replaced by corresponding multi-clusters. To measure the effectiveness of the method, experiments are conducted on existing anonymization method for high dimensional data spaces and compared with the DCTSM approach using Statlog German Credit Data Set. Improved tag feature extraction and minimum error rate compared to conventional anonymization methods are demonstrated with experiments.

Keywords

High-Dimensional Data Space, Data points, Non-Binary Database, Quantum Distribution, Dimensionality Reduction.
Subscription Login to verify subscription
User
Notifications
Font Size

Abstract Views: 261

PDF Views: 0




  • An Effective Multi-Clustering Anonymization Approach Using Discrete Component Task for Non-Binary High Dimensional Data Spaces

Abstract Views: 261  |  PDF Views: 0

Authors

L. V. Arun Shalin
Department of Computer Science and Engineering, Manonmaniam Sundaranar University, India
K. Prasadh
Mookambika Technical Campus, India

Abstract


Clustering is a process of grouping elements together, designed in such a way that the elements assigned to similar data points in a cluster are more comparable to each other than the remaining data points in a cluster. During clustering certain difficulties related when dealing with high dimensional data are ubiquitous and abundant. Works concentrated using anonymization method for high dimensional data spaces failed to address the problem related to dimensionality reduction during the inclusion of non-binary databases. In this work we study methods for dimensionality reduction for non-binary database. By analyzing the behavior of dimensionality reduction for non-binary database, results in performance improvement with the help of tag based feature. An effective multi-clustering anonymization approach called Discrete Component Task Specific Multi-Clustering (DCTSM) is presented for dimensionality reduction on non-binary database. To start with we present the analysis of attribute in the non-binary database and cluster projection identifies the sparseness degree of dimensions. Additionally with the quantum distribution on multi-cluster dimension, the solution for relevancy of attribute and redundancy on non-binary data spaces is provided resulting in performance improvement on the basis of tag based feature. Multi-clustering tag based feature reduction extracts individual features and are correspondingly replaced by the equivalent feature clusters (i.e.) tag clusters. During training, the DCTSM approach uses multi-clusters instead of individual tag features and then during decoding individual features is replaced by corresponding multi-clusters. To measure the effectiveness of the method, experiments are conducted on existing anonymization method for high dimensional data spaces and compared with the DCTSM approach using Statlog German Credit Data Set. Improved tag feature extraction and minimum error rate compared to conventional anonymization methods are demonstrated with experiments.

Keywords


High-Dimensional Data Space, Data points, Non-Binary Database, Quantum Distribution, Dimensionality Reduction.