Optical Character Recognition Using Hybrid Classifiers


Affiliations
1 Avanthi's Scientific Technological and Research Academy, India
2 ASRA, Hyderabad, India
     



Optical character recognition (OCR) is the process of transforming printed documents into ASCII files for compact storage, editing, fast retrieval, and other file manipulations by computer. The principal motivation for developing OCR systems is the need to cope with the enormous flood of paper generated by an expanding technological society: documents, bank cheques, commercial forms, government records, credit card imprints, and mail for sorting.
A method has been developed for clearly printed, single-font documents. This system was designed primarily for Telugu; it used the Uniform Sampling Method as the basis for extracting low-level, structural, and stroke-type features, and a nearest neighbor classifier for classification. The accuracy rate was 96%. The objective of the current project is to improve this accuracy using different types of hybrid classifiers. The algorithm uses a segmentation process to isolate words. In this process, clipping is applied by deleting all zero rows and columns of the image matrix. The K-means clustering algorithm is used to determine the k clusters and their centroids.
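
As an illustration only, the clipping and clustering steps named above might be sketched as follows. This is not the authors' implementation: the function names (`clip`, `kmeans`, `nearest`), the list-of-lists image representation, the squared Euclidean distance, and the choice of the first k points as initial centroids are all assumptions made for the example. The `clip` function follows the abstract's wording literally and drops every zero row and column; a bounding-box variant would trim only the margins.

```python
from typing import List, Sequence

Image = List[List[int]]
Point = Sequence[float]


def clip(image: Image) -> Image:
    """Delete every all-zero row and every all-zero column of a binary image."""
    rows = [row for row in image if any(row)]
    if not rows:
        return []
    # Keep only the columns that contain at least one non-zero pixel.
    keep = [c for c in range(len(rows[0])) if any(row[c] for row in rows)]
    return [[row[c] for c in keep] for row in rows]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point: Point, centroids: List[List[float]]) -> int:
    """Index of the centroid closest to `point` (nearest neighbor rule)."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points: List[Point], k: int, iterations: int = 20) -> List[List[float]]:
    """Plain K-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: group each point with its nearest centroid.
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = [
            [sum(dim) / len(cluster) for dim in zip(*cluster)] if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:  # converged
            break
        centroids = updated
    return centroids
```

In a pipeline of the kind the abstract describes, feature vectors extracted from clipped character images would play the role of `points`, and `nearest` would assign an unseen vector to the closest learned centroid.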

Keywords

Nearest Neighbor, K-Means Algorithm, Centroids, Filters, MMSE, Skewing, Clipping, Training Patterns, Feature Extraction, Voronoi Diagram, Uniform Sampling.

Authors

V. S. Giridhar Akula
Avanthi's Scientific Technological and Research Academy, India
D. Sreenivasa Rao
ASRA, Hyderabad, India
S. Sravanthi
ASRA, Hyderabad, India
