NoolOCR for Printed Tamil Text
Optical Character Recognition (OCR) is the process of converting printed materials into text or word processing files that can be easily edited and stored. The technology allows such materials to be stored in far less space than the physical originals, and it has had a major impact on how information is stored, shared and edited. Before OCR, turning a book into a word processing file meant typing each page word for word. Nowadays many OCR tools are available on the market for different languages, but there is no centralized framework that covers all of them. The intention of this paper is to create a framework capable of handling all available languages. This can be achieved through the Eclipse plug-in architecture, with a separate plug-in for each language.
Keywords
Binarization, Bounding Box, GOCR, OCR, Tesseract.