
Cleaning and Mapping Interface


Affiliations
1 Student, Chhatrapati Shahu Ji Maharaj University, Kalyanpur, Kanpur - 208024, Uttar Pradesh, India



As data volumes grow day by day, it is becoming harder to collect data in a clean format. Data gathered from different sources for different purposes often contains more noise than data that is actually needed for the purpose. This research focuses on cleaning such data and making it meaningful to use by building an annotator that reduces manual labor and makes annotation more efficient. I built a Cleaning and Mapping Interface that detects and corrects misspelled words and also predicts the next word based on what it has previously learned. It is user-friendly software: users with no knowledge of any library or of functional programming can use it to annotate their data. The application is written in Python, with a simple user interface implemented using the Tkinter module. The program can learn new grammar. The algorithm is applied recursively to identical sentences in the document (matched before they are corrected), which reduces processing time. The Cleaning and Mapping Interface uses recurrent neural network and Naive Bayes algorithms to make predictions. It provides the user with the correct spelling of a word and the word likely to follow it, so that annotation can be done faster.

Keywords

Long Short-Term Memory, Recurrent Neural Network, Naive Bayes.

Manuscript Received: January 5, 2019; Revised: January 20, 2019; Accepted: January 25, 2019. Date of Publication: March 6, 2019.
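
The correction and prediction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: in place of the recurrent neural network and Naive Bayes models, it assumes a frequency-based, one-edit spelling corrector and a bigram count table, and the corpus and all function names (`edits1`, `correct`, `predict_next`, `clean_sentences`) are hypothetical. The cache in `clean_sentences` is one possible reading of reusing a correction for sentences that repeat in the document.

```python
from collections import Counter, defaultdict

# Toy training text (hypothetical); a real run would use the user's own data.
CORPUS = "the data is clean . the data is noisy . clean the data before use"

TOKENS = CORPUS.split()
WORD_COUNTS = Counter(TOKENS)

# Bigram table: for each word, how often each following word was seen.
BIGRAMS = defaultdict(Counter)
for prev, nxt in zip(TOKENS, TOKENS[1:]):
    BIGRAMS[prev][nxt] += 1

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """All strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def correct(word):
    """Return word if known, else the most frequent known word one edit away."""
    if word in WORD_COUNTS:
        return word
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    return max(candidates, key=WORD_COUNTS.get) if candidates else word


def predict_next(word):
    """Return the most frequent follower of word in the corpus, or None."""
    followers = BIGRAMS.get(word)
    return followers.most_common(1)[0][0] if followers else None


def clean_sentences(sentences):
    """Correct each sentence, reusing the result for repeated sentences."""
    cache = {}
    cleaned = []
    for sentence in sentences:
        if sentence not in cache:
            cache[sentence] = " ".join(correct(w) for w in sentence.split())
        cleaned.append(cache[sentence])
    return cleaned
```

With this toy corpus, `correct("dta")` yields `"data"` and `predict_next("data")` yields `"is"`. A trained model would replace both lookup tables, but the interface presented to the annotator could stay the same.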


Authors

Aakansha Dhawan


DOI: https://doi.org/10.17010/ijcs%2F2019%2Fv4%2Fi2%2F144272