
Scene Text Extraction using Convolutional Neural Network with Amended MSER



Authors

Aparna Yegnaraman
Department of Computer Science and Engineering, College of Engineering, Guindy, Anna University, Chennai 600 025, TN, India
Valli S
Department of Computer Science and Engineering, College of Engineering, Guindy, Anna University, Chennai 600 025, TN, India

Abstract


Textual content conveys relevant and specific information to users precisely. This paper introduces an approach for extracting text from natural scene images that combines an amended Maximally Stable Extremal Region (a-MSER) method with a deep learning framework, the You Only Look Once (YOLOv2) network. The proposed system, a-MSER with Scene Text Extraction using Modified YOLOv2 Network (STEMYN), performs well when evaluated on three publicly available datasets. The a-MSER method identifies regions of interest based on a variation of MSER and accounts effectively for intensity changes between text and background. The drawback of the original YOLOv2, its poor detection rate for small objects, is overcome by employing a 1 × 1 layer with the feature map size increased from 13 × 13 to 26 × 26. Focal loss is applied to improve upon the existing cross-entropy classification loss of YOLOv2. The repeated convolution layer in the steep layer of the original YOLOv2 is removed to reduce network complexity, as it does not improve system performance. Experimental results demonstrate that the proposed method is effective in identifying text from natural scene images.
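
The abstract says that focal loss is applied in place of the plain cross-entropy classification loss. As a rough illustration of that idea, the sketch below compares the two for a single binary prediction. It is not the authors' implementation: the function names are hypothetical, and the values gamma = 2.0 and alpha = 0.25 are assumed defaults from the general focal loss literature, since the abstract does not state the settings used in STEMYN.

```python
import math

EPS = 1e-12  # assumed guard value to avoid log(0); not taken from the paper


def cross_entropy(p: float, y: int) -> float:
    """Binary cross entropy for one prediction.

    p is the predicted probability of the positive class, y is 0 or 1.
    """
    p_t = p if y == 1 else 1.0 - p      # probability assigned to the true class
    p_t = min(max(p_t, EPS), 1.0)
    return -math.log(p_t)


def focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Focal loss for one prediction: -alpha_t * (1 - p_t)^gamma * log(p_t).

    gamma and alpha are assumed values, not settings reported in the abstract.
    """
    p_t = p if y == 1 else 1.0 - p
    p_t = min(max(p_t, EPS), 1.0)
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    for p in (0.9, 0.5, 0.1):
        print(f"p={p}: CE={cross_entropy(p, 1):.4f}  FL={focal_loss(p, 1):.4f}")
```

With gamma = 0 and alpha = 1 the focal loss reduces to cross entropy. Larger gamma values reduce the contribution of easy examples, so hard examples such as small text regions carry relatively more weight during training.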