An Analysis of Various Record Matching Approaches and Similarity Computations
Linking or matching databases is becoming increasingly important in many data mining projects, as linked data can contain information that is not available otherwise or that would be too expensive to collect manually. Record matching refers to the task of finding similar entities in two or more records. Because record matching addresses the duplication detection problem, it becomes necessary to identify a suitable record matching technique. This paper presents a survey of record matching techniques, highlighting the approaches used, the number of classifiers employed, and the stages of duplication detection performed, and compares the techniques with one another. It also presents the various matching metrics available. Further, we point out potential pitfalls and the challenging issues that a record matching technique needs to address. We then present an unsupervised method for performing record matching in a Web database scenario. We believe that the results of this evaluation will help analysts come up with easier and more feasible methods for record matching, which remains a challenging task, particularly in the Web scenario.
Keywords
Duplication Detection, Record Matching, Similarity Calculation, Unsupervised.
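As a rough illustration of what a similarity computation for record matching can look like, the sketch below scores two records by averaging a token-based Jaccard similarity over their fields and flags a pair as a duplicate when the score reaches a threshold. It is not the unsupervised method from the paper: the choice of Jaccard similarity, the unweighted mean, the field names, the sample records, and the 0.5 threshold are all assumptions made for illustration.

```python
def tokens(value):
    """Lowercase a field value and split it into a set of word tokens."""
    return set(value.lower().split())


def jaccard(a, b):
    """Jaccard similarity of the token sets of two strings, in [0.0, 1.0]."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # assumption: two empty fields count as identical
    return len(ta & tb) / len(ta | tb)


def record_similarity(r1, r2, fields):
    """Unweighted mean of the per-field Jaccard similarities."""
    scores = [jaccard(r1.get(f, ""), r2.get(f, "")) for f in fields]
    return sum(scores) / len(scores)


def is_match(r1, r2, fields, threshold=0.5):
    """Flag a pair as a likely duplicate (threshold is a hypothetical value)."""
    return record_similarity(r1, r2, fields) >= threshold


# Hypothetical records, made up for this example.
FIELDS = ["name", "city"]
rec_a = {"name": "John A. Smith", "city": "New York"}
rec_b = {"name": "john smith", "city": "new york"}
rec_c = {"name": "Mary Jones", "city": "Boston"}

print(is_match(rec_a, rec_b, FIELDS))  # True: the name and city tokens overlap
print(is_match(rec_a, rec_c, FIELDS))  # False: no tokens in common
```

In practice the threshold and the per-field weights would have to be tuned or learned, and that tuning is where the supervised and unsupervised techniques covered by the survey differ.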