
A Survey on Duplicate Detection Approaches in Hierarchical Data


Affiliations
1 Pune Institute of Computer Technology, Pune, Maharashtra, India
2 Vishwakarma Institute of Technology, Pune, Maharashtra, India
     



Duplicate detection is the process of finding duplicate objects in data. It is an important part of the data cleansing step of data mining. A significant amount of work has been done on duplicate detection in relational data, but only recently have researchers shifted their focus towards duplicate detection in hierarchical and semi-structured data, e.g. XML. In this paper, we provide an overview of different methods for duplicate detection in hierarchical and semi-structured data.

Keywords

Data Cleansing, Duplicate Detection, XML, Data Mining, Hierarchical Data.

Authors

Kiran Lokhande
Pune Institute of Computer Technology, Pune, Maharashtra, India
Tushar Rane
Pune Institute of Computer Technology, Pune, Maharashtra, India
S. T. Patil
Vishwakarma Institute of Technology, Pune, Maharashtra, India
