Document Image Registration for Imposed Layer Extraction


Affiliations
1 Department of Computer Science and Engineering, B.N.M Institute of Technology, India
     


Extracting filled-in information from a document image when a template is available is challenging because of geometric distortion. A filled-in document image consists of a null background, a foreground layer of general information, and an imposed layer of vital information. A template document image consists of only the null background and the general-information foreground layer. This paper proposes a novel document image registration technique to extract the imposed layer from an input document image. A convex polygon is constructed around the content of both the input image and the template image using the convex hull. The vertices of the input and template polygons are paired based on minimum Euclidean distance. Each vertex of the input convex polygon is transformed under every combination of rotation and scaling values considered; translation is handled by tight cropping. For every transformation of the input vertices, the Hausdorff distance to the template vertices is computed. The minimum Hausdorff distance (MHD) over all transformations identifies the rotation and scaling values by which the input image should be transformed to align it with the template. Because the transformation is only an estimate, the components in the input image do not overlay exactly on the components in the template. Connected component analysis is therefore applied to extract contour boxes at word level and identify partially overlapping components. Geometric features such as density, area and degree of overlap are extracted and compared between partially overlapping components to identify and eliminate components common to the input image and the template image. The residue constitutes the imposed layer. Experimental results indicate the efficacy of the proposed model along with its computational complexity. Experiments were conducted on a variety of filled-in forms, applications and bank cheques. Data sets were generated as test sets for comparative analysis.
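The registration step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the content of each image is already available as a list of (x, y) foreground points, uses Andrew's monotone chain for the convex hull, and searches a small hypothetical grid of rotation angles and scale factors. The function names (`convex_hull`, `tight_crop`, `transform`, `hausdorff`, `estimate_rotation_scale`) and the grid values are assumptions made for illustration. Vertex pairing by minimum Euclidean distance is not a separate step here; it is implicit in the nearest-vertex minimum inside the Hausdorff distance.

```python
import math
from itertools import product


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def tight_crop(points):
    """Shift points so their bounding box starts at the origin (removes translation)."""
    min_x = min(x for x, _ in points)
    min_y = min(y for _, y in points)
    return [(x - min_x, y - min_y) for x, y in points]


def transform(points, angle_deg, scale):
    """Rotate about the origin by angle_deg, then scale uniformly."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return [(scale * (c * x - s * y), scale * (s * x + c * y)) for x, y in points]


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    def directed(p, q):
        return max(min(math.dist(u, v) for v in q) for u in p)
    return max(directed(a, b), directed(b, a))


def estimate_rotation_scale(input_pts, template_pts, angles, scales):
    """Return (distance, angle, scale) with the smallest Hausdorff distance."""
    template_hull = tight_crop(convex_hull(template_pts))
    input_hull = convex_hull(input_pts)
    best = None
    for angle, scale in product(angles, scales):
        candidate = tight_crop(transform(input_hull, angle, scale))
        d = hausdorff(candidate, template_hull)
        if best is None or d < best[0]:
            best = (d, angle, scale)
    return best


# Synthetic example: the "input" is the template rotated by -5 degrees,
# shrunk by a factor of 1.1 and shifted, so the aligning transform is
# a rotation of +5 degrees with a scale of 1.1.
template = [(0, 0), (200, 0), (200, 100), (0, 100), (50, 40), (120, 60)]
distorted = [(x + 30, y + 12) for x, y in transform(template, -5, 1 / 1.1)]
angles = range(-10, 11)                 # hypothetical search grid, in degrees
scales = [0.8, 0.9, 1.0, 1.1, 1.2]      # hypothetical scale factors
distance, angle, scale = estimate_rotation_scale(distorted, template, angles, scales)
```

The exhaustive grid keeps the sketch short. Its cost grows with the number of angle and scale combinations times the product of the two hull sizes, which is small because only hull vertices, not all foreground pixels, are compared.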

Keywords

Document Image Registration, Template, Input Image, Convex Hull, Minimum Hausdorff Distance, Connected Component Analysis.



Authors

Surabhi Narayan
Department of Computer Science and Engineering, B.N.M Institute of Technology, India
Sahana D. Gowda
Department of Computer Science and Engineering, B.N.M Institute of Technology, India

