Clustering Approach in Context Free Data Cleaning

Sohil D. Pandya; Paresh V. Virparia

Abstract

In this era of knowledge, organizations can gain a competitive advantage only through proficient data analysis. This paper emphasizes the application of clustering to context free data cleaning: attribute values are corrected using various sequence similarity metrics when no reference data set is available, which improves data quality and, in turn, the quality of the resulting data analysis. The authors propose an algorithm that examines the suitability of a value for correcting other values of an attribute. Various sequence similarity metrics were used to compute the distance between two attribute values, to test the data, and to generate results. Experimental results show how the approach can effectively clean data without reference data.

Keywords

Clustering, Context Free Data Cleaning, Sequence Similarity Metrics
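
The abstract does not spell out the proposed algorithm, so the following is only a minimal sketch of the general idea it describes: group the values of one attribute by sequence similarity and correct each value to a representative of its group, without any reference data. Everything specific here is an assumption made for illustration: the `clean_values` and `similarity` names, the 0.8 threshold, the choice of the most frequent value as the cluster representative, the sample city names, and the use of Python's `difflib.SequenceMatcher` as a stand-in for the sequence similarity metrics the authors evaluated.

```python
from collections import Counter
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; a stand-in metric (assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def clean_values(values, threshold=0.8):
    """Correct attribute values by clustering them around frequent values.

    Values are visited from most to least frequent. A value joins the most
    similar existing cluster center if the similarity reaches `threshold`
    (a hypothetical cut-off); otherwise it starts a new cluster.
    """
    counts = Counter(values)
    centers = []   # cluster representatives, most frequent first
    mapping = {}   # original value -> corrected value

    for value, _ in counts.most_common():
        # Find the closest existing center, if any.
        best = max(centers, key=lambda c: similarity(value, c), default=None)
        if best is not None and similarity(value, best) >= threshold:
            mapping[value] = best      # treat as a variant of `best`
        else:
            centers.append(value)      # treat as a new correct value
            mapping[value] = value

    return [mapping[v] for v in values]


if __name__ == "__main__":
    # Illustrative data only; not taken from the paper's experiments.
    cities = ["Ahmedabad", "Ahmedabad", "Ahmadabad", "Ahemdabad",
              "Vadodara", "Vadodara", "Vadodra", "Anand"]
    print(clean_values(cities))
```

The sketch relies on the assumption that correct spellings occur more often than their misspelled variants, which is why frequent values are chosen as cluster centers. A different similarity metric can be swapped in by replacing the body of `similarity`.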

National Journal of System and Information Technology
