
Spectral Analysis of Projection Histogram for Enhancing Close Matching Character Recognition in Malayalam


Affiliations
1 University of Kerala, Thiruvananthapuram, Kerala, India
 

The success rates of Optical Character Recognition (OCR) systems for printed Malayalam documents are quite impressive, with state-of-the-art accuracy levels in the range of 85-95% for various. However, for real applications, further enhancement of these accuracy levels is required. One of the bottlenecks in further enhancing accuracy has been identified as close-matching characters. In this paper, we delineate the close-matching characters in Malayalam and report the development of a specialised classifier for them. The output of a state-of-the-art OCR system is taken, and characters falling into the close-matching character set are then fed into this specialised classifier to enhance accuracy. The classifier is based on the support vector machine algorithm and uses feature vectors derived from spectral coefficients of the projection histogram signals of close-matching characters.
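The feature-extraction idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which transform, how many coefficients, or what normalisation the paper uses, so everything here is an assumption: the "spectral coefficients" are taken to be magnitudes of a discrete Fourier transform, the function names (`projection_histograms`, `dft_magnitudes`, `spectral_features`) and the parameter `n_coeffs` are hypothetical, and the two 5x5 glyphs are toy shapes, not Malayalam characters.

```python
import cmath
import math


def projection_histograms(glyph):
    """Return (row_sums, col_sums) of foreground pixels for a binary glyph."""
    rows = [sum(row) for row in glyph]
    cols = [sum(col) for col in zip(*glyph)]
    return rows, cols


def dft_magnitudes(signal, n_coeffs):
    """Magnitudes of the first n_coeffs DFT coefficients, divided by signal length."""
    n = len(signal)
    mags = []
    for k in range(n_coeffs):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(signal))
        mags.append(abs(acc) / n)
    return mags


def spectral_features(glyph, n_coeffs=3):
    """Concatenate spectral magnitudes of the horizontal and vertical projections.

    Assumption: DFT magnitudes stand in for the paper's unspecified
    'spectral coefficients'; n_coeffs is an illustrative choice.
    """
    rows, cols = projection_histograms(glyph)
    return dft_magnitudes(rows, n_coeffs) + dft_magnitudes(cols, n_coeffs)


# Two toy glyphs that differ only in the bottom stroke (closed vs. open box).
GLYPH_A = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
GLYPH_B = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
]

features_a = spectral_features(GLYPH_A)
features_b = spectral_features(GLYPH_B)
```

In a pipeline of the kind the abstract outlines, vectors such as `features_a` and `features_b` would be the inputs to a support vector machine trained only on the close-matching character set; an off-the-shelf implementation such as scikit-learn's `sklearn.svm.SVC` is one possible choice, though the abstract does not name the one actually used.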

Keywords

OCR, Malayalam, Close-Matching Characters, Feature Extraction, Pattern Classification.

Authors

Sajilal Divakaran
University of Kerala, Thiruvananthapuram, Kerala, India
