Identification of Telugu, Devanagari and English Scripts Using Discriminating Features

M. C. Padma; P. A. Vijaya


Abstract

In a multi-script, multilingual environment, a document may contain text lines in more than one script or language. The script regions of such a document must be identified so that each region can be passed to the OCR system for its own language. In this context, the paper proposes a model to identify and separate text lines of Telugu, Devanagari and English scripts in a printed trilingual document. The method uses discriminating features extracted from the top and bottom profiles of the printed text lines. The experiments used 1500 text lines for learning and 900 text lines for testing, and the reported performance is 99.67%.
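The abstract does not spell out how the top and bottom profiles are computed or which features are derived from them. As a rough illustration only, the sketch below assumes a common definition: for each pixel column of a binarized text line, the top profile is the row of the first ink pixel from the top and the bottom profile is the row of the last ink pixel. The function name `top_bottom_profiles` and the list-of-lists image format are assumptions made for this example, not the authors' implementation.

```python
def top_bottom_profiles(line_image):
    """Return (top, bottom) profiles of a binarized text line.

    line_image is a list of rows, each a list of 0/1 values (1 = ink).
    For every column, top holds the row index of the first ink pixel and
    bottom holds the row index of the last ink pixel. Columns with no
    ink get None in both profiles. (Assumed definition, for illustration.)
    """
    height = len(line_image)
    width = len(line_image[0]) if height else 0
    top, bottom = [], []
    for col in range(width):
        # Collect the rows in this column that contain ink.
        ink_rows = [row for row in range(height) if line_image[row][col]]
        if ink_rows:
            top.append(ink_rows[0])
            bottom.append(ink_rows[-1])
        else:
            top.append(None)
            bottom.append(None)
    return top, bottom


# A tiny hypothetical 4x4 "text line" used only to exercise the function.
sample = [
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]
top, bottom = top_bottom_profiles(sample)
print(top)     # [1, 0, None, 1]
print(bottom)  # [1, 2, None, 2]
```

Script-specific cues, such as a continuous headline along the top of a Devanagari line, would show up as long flat runs in the top profile. Which cues the paper actually uses is not stated in the abstract.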

Keywords

Multi-Script Multi-Lingual Document, Script Identification, Feature Extraction.


AIRCC's International Journal of Computer Science and Information Technology
