This paper reports the results of a study of several general document clustering and classification methods. Objectives: To improve document clustering and to build a system that reduces the time needed to retrieve text documents from clusters. Method: We propose a method that supports both clustering and classification by combining k-means with feed-forward neural networks, implemented in MATLAB. K-means is used to cluster the text documents and the neural network is used to classify them. Findings: Earlier work has proposed a range of techniques, including semi-supervised models for labelled text (Partially Labeled Dirichlet Allocation and the Partially Labeled Dirichlet Process), genetic algorithms, Gaussian distributions, hybrid genetic algorithms, fast global k-means, and k-means clustering. Each of these techniques has merits and demerits, but all of them are very time consuming. The main aim of this work is therefore to develop a model based on both supervised and unsupervised techniques to determine the similarity between documents. Improvements: To reduce this time cost, we use neural networks for classification and k-means for clustering, combining supervised and unsupervised techniques in a single model for determining document similarity.
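The clustering half of the approach, and the reason clustering can shorten retrieval, can be illustrated with a minimal sketch. It is written in plain Python rather than MATLAB and is not the authors' implementation. The term-frequency vectors, the use of cosine similarity as the k-means distance, the fixed seed documents, and all function names are assumptions made for illustration; the feed-forward neural network classifier is not shown.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector (no stemming or stop-word removal)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {term: weight / len(vectors) for term, weight in total.items()}


def nearest(vec, centroids):
    """Index of the centroid most similar to vec."""
    return max(range(len(centroids)), key=lambda i: cosine(vec, centroids[i]))


def kmeans(vectors, seed_indices, max_iter=20):
    """k-means using cosine similarity; the caller picks the seed documents
    so that the run is deterministic (an assumption of this sketch)."""
    centroids = [dict(vectors[i]) for i in seed_indices]
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(v, centroids) for v in vectors]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = mean_vector(members)
    return labels, centroids


def retrieve(query, vectors, labels, centroids):
    """Rank only the documents in the cluster closest to the query."""
    q = tf_vector(query)
    c = nearest(q, centroids)
    candidates = [i for i, lab in enumerate(labels) if lab == c]
    return sorted(candidates, key=lambda i: cosine(q, vectors[i]), reverse=True)


# Hypothetical toy corpus, not data from the paper.
docs = [
    "neural network training with backpropagation",
    "feed forward neural network classification",
    "stock market prices fell sharply",
    "market investors sold stock today",
]
vectors = [tf_vector(d) for d in docs]
labels, centroids = kmeans(vectors, seed_indices=[0, 2])
ranked = retrieve("neural network", vectors, labels, centroids)
```

In this sketch a query is compared against the cluster centroids first and then only against the documents in the closest cluster, which is one plausible way a clustered collection could reduce retrieval time compared with scanning every document.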

Keywords

Artificial Neural Network, Cosine Similarity and Data Mining, K-means Algorithm, Similarity Measure Function, Text Document Clustering.