
Selecting Multiview Point Similarity from Different Methods of Similarity Measure to Perform Document Comparison


Affiliations
1 Department of Computer Science and Engineering, Sathyabama University, Chennai - 600119, Tamil Nadu, India
2 Faculty of computing, Sathyabama University, Chennai - 600119, Tamil Nadu, India
 

Objective: The main objective is to implement multi-viewpoint similarity for document comparison based on clustering. Methods/Analysis: Clustering, a main task of data mining, groups objects that are similar to one another. Data mining divides the whole document into meaningful clusters and analyses the data. There are many clustering methods, such as hierarchical and partitional clustering, and data may be grouped by distance (for example, Euclidean distance), by viewpoints, and so on. Of these, the current system uses single-viewpoint similarity. Its main disadvantage is that it does not use the full set of document data, so detailed comparison measures cannot be revealed. The proposed system uses multi-viewpoint similarity to overcome this disadvantage. Findings: The multi-viewpoint similarity method compares multiple documents in detail: the documents are compared line by line and their similarity is shown. We also enhanced the existing algorithm; the enhanced version is named ECSMTP (Enhanced Concept Based Similarity Measure for Text Processing). This algorithm categorizes data from the selected documents along with the weightage of each document, and on that basis it forms clusters and calculates the similarity measure. In addition, the proposed system compares different kinds of documents, such as text, Word, and PDF documents, which the existing system does not support. The user may select the kind of document, and comparisons are made on the selected documents. Clusters are formed and then compared.
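The abstract does not give a formula for multi-viewpoint similarity. As a minimal sketch, assuming one commonly used formulation (the similarity of two documents is the average dot product of their difference vectors as seen from every other document in the collection), the contrast with single-viewpoint cosine similarity might look as follows. This is not the authors' ECSMTP algorithm; the function names, the term-frequency weighting, and the three toy documents are illustrative assumptions.

```python
import math
from collections import Counter


def tf_vector(text, vocab):
    """Unit-length term-frequency vector over a fixed vocabulary (assumed weighting)."""
    counts = Counter(text.lower().split())
    vec = [float(counts[word]) for word in vocab]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a, b):
    """Single-viewpoint similarity: dot product of unit vectors, viewed from the origin."""
    return sum(x * y for x, y in zip(a, b))


def multi_viewpoint_similarity(vectors, i, j):
    """Average of (d_i - d_h) . (d_j - d_h) over every other document d_h."""
    viewpoints = [h for h in range(len(vectors)) if h not in (i, j)]
    if not viewpoints:
        raise ValueError("need at least one document other than i and j")
    total = 0.0
    for h in viewpoints:
        diff_i = [x - y for x, y in zip(vectors[i], vectors[h])]
        diff_j = [x - y for x, y in zip(vectors[j], vectors[h])]
        total += sum(x * y for x, y in zip(diff_i, diff_j))
    return total / len(viewpoints)


# Hypothetical toy collection, for illustration only.
docs = [
    "data mining groups similar documents",
    "clustering groups similar documents",
    "the weather is sunny today",
]
vocab = sorted({word for doc in docs for word in doc.lower().split()})
vectors = [tf_vector(doc, vocab) for doc in docs]

print("cosine(0, 1):", cosine(vectors[0], vectors[1]))
print("multi-viewpoint(0, 1):", multi_viewpoint_similarity(vectors, 0, 1))
print("multi-viewpoint(0, 2):", multi_viewpoint_similarity(vectors, 0, 2))
```

Under this assumed formulation, every other document in the collection contributes a viewpoint, so the score for a pair depends on the whole document set rather than on the two documents alone.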

Keywords

Clustering, ECSMTP, Multiviewpoint, Pattern Recognition, Singleview Point

Authors

S. Kalpana
Department of Computer Science and Engineering, Sathyabama University, Chennai - 600119, Tamil Nadu, India
S. Vigneshwari
Faculty of computing, Sathyabama University, Chennai - 600119, Tamil Nadu, India


DOI: https://doi.org/10.17485/ijst%2F2016%2Fv9i10%2F131430