
Design and Implementation of Devnagari Spell Checker based on Soundex Phonetic Similarity Concepts


Affiliations
1 Department of Computer Technology, Priyadarshini College of Engineering, Nagpur (MS), India
2 P.G.T.D. Computer Science & Engineering Department of S.G.B. Amravati University, Amravati (MS), India
     



With the growth of information technology in India, where the majority of people speak Hindi, an accurate Devnagari spell checker is needed for word processing documents in Hindi. One of the challenges is how to implement a spell checker for Hindi that checks the spelling of a document in the way Microsoft Word does for languages such as English.

The proposed approach consists of a Hindi word database, built using the Unicode character encoding for the Devnagari character set, and a spell-check engine that matches each word against this database. For a non-word, the engine presents a list of the most appropriate suggestions, limited to a threshold number, based on the Soundex phonetic string matching algorithm together with Levenshtein's edit distance. The Soundex phonetic string matching algorithm works on predefined rules under which the language's character set is divided into categories, with phonetically similar characters placed in the same category. Only the consonants and additional consonants are considered when forming the categories; vowels and special symbols are ignored. The limitation of the Soundex phonetic algorithm is addressed by applying Levenshtein's edit distance, which calculates the distance between two strings; the suggestions with the smallest distance are ranked first.
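The pipeline described above (dictionary lookup, phonetic category matching, then ranking by edit distance) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the category table `CATEGORIES` groups consonants by place of articulation, the coding rule is a simplified Soundex-style scheme without a retained first letter or fixed code length, and the names `phonetic_code`, `levenshtein`, `suggest`, and the toy word list `WORDS` are hypothetical.

```python
import unicodedata

# Hypothetical phonetic categories for Devanagari consonants, grouped by
# place of articulation. The paper's actual category table is not shown here.
CATEGORIES = {
    "1": "कखगघङ",  # velar
    "2": "चछजझञ",  # palatal
    "3": "टठडढण",  # retroflex
    "4": "तथदधन",  # dental
    "5": "पफबभम",  # labial
    "6": "यरलव",   # semivowels
    "7": "शषसह",   # sibilants and ha
}
CODE_OF = {ch: code for code, chars in CATEGORIES.items() for ch in chars}


def phonetic_code(word):
    """Simplified Soundex-style code: one digit per consonant category.

    NFD normalization splits the additional (nukta) consonants into a base
    consonant plus U+093C, so they fall into the base consonant's category.
    Vowels, vowel signs, virama, nukta, and other symbols are skipped.
    A digit is not repeated when it equals the previously emitted digit.
    """
    digits = []
    for ch in unicodedata.normalize("NFD", word):
        code = CODE_OF.get(ch)
        if code is not None and (not digits or digits[-1] != code):
            digits.append(code)
    return "".join(digits)


def levenshtein(a, b):
    """Levenshtein edit distance over Unicode code points (two-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def suggest(word, dictionary, limit=5):
    """Return up to `limit` suggestions for a non-word, closest first."""
    if word in dictionary:
        return []  # a known word needs no suggestions
    code = phonetic_code(word)
    candidates = [w for w in dictionary if phonetic_code(w) == code]
    candidates.sort(key=lambda w: levenshtein(word, w))
    return candidates[:limit]


# Toy dictionary for illustration only; a real system would use a large corpus.
WORDS = ["किताब", "कातिब", "कमल", "नमक"]

if __name__ == "__main__":
    print(suggest("किताप", WORDS))
```

In this sketch the phonetic code only narrows the dictionary to words whose consonant categories match, and the edit distance then orders those candidates, which mirrors the division of labour between the two algorithms described in the abstract. The `limit` parameter stands in for the threshold number of suggestions.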


Keywords

Devnagari Script, Levenshtein's Edit Distance, Soundex Phonetic String Matching Algorithm, Unicode Conventions, Suggestions Generation, Ranking Algorithms, Corpus Design.

Authors

Shaikh Phiroj Chhaware
Department of Computer Technology, Priyadarshini College of Engineering, Nagpur (MS), India
Mohammad Atique
P.G.T.D. Computer Science & Engineering Department of S.G.B. Amravati University, Amravati (MS), India
