An Improved Method for Document Image Binarization
Handwriting analysis of a document image has four parts: preprocessing, segmentation, feature extraction, and classification. Image preprocessing improves the quality of the image so that later steps can process it easily and efficiently. The principal stage of preprocessing is binarization, in which pixels are classified as text or background. It is a crucial stage because it can affect all later stages, including the final character recognition stage. The Otsu algorithm has already been used to binarize handwritten documents. To tolerate badly degraded document images, this paper proposes a binarization technique based on the Otsu algorithm that can segment the foreground from the background even when the document suffers from uneven illumination, image contrast variation, bleed-through, or smear. The proposed method was tested on text images from H-DIBCO 2012 and DIBCO 2009. Experimental results show that the proposed technique achieves high precision and gives better results than the Otsu algorithm.
Keywords
Binarization, Gray Scale Image, Line Segment, Otsu, Threshold.
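For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of the classic global Otsu threshold, which picks the gray level that maximizes the between-class variance of the histogram. It is not the improved method proposed in the paper, whose details the abstract does not give. The function names `otsu_threshold` and `binarize`, the 256-level assumption, and the convention that dark pixels are text are all assumptions made for this example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    pixels: iterable of integer gray values in [0, levels).
    Values <= t form one class, values > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    if total == 0:
        raise ValueError("empty image")
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the class with values <= t
    sum0 = 0  # sum of gray values in that class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # no pixels at or below t yet
        w1 = total - w0
        if w1 == 0:
            break             # all pixels are at or below t
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image, threshold):
    """Map a 2-D list of gray values to 0 (assumed text) or 1 (background)."""
    return [[0 if p <= threshold else 1 for p in row] for row in image]


# Toy 2x3 "image": three dark pixels (text) and three bright ones (paper).
image = [[10, 12, 200],
         [11, 210, 205]]
t = otsu_threshold(p for row in image for p in row)
binary = binarize(image, t)
```

Because a single threshold is applied to the whole image, this baseline tends to struggle with the degradations listed in the abstract, such as uneven illumination, where no one gray level separates text from background everywhere.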