Sentence Boundary Detection Using Maximum Entropy Model
The sentence boundary detection system comprises three independent applications (rule-based, HMM, and Maximum Entropy). The Maximum Entropy model is the central part of this system, which achieved an error rate of less than 2% on part of the Wall Street Journal (WSJ) corpus with only eight binary features. The performance of the three applications is illustrated and discussed. Sentence Boundary Disambiguation (SBD) is the task of identifying the individual sentences within a paragraph or an article. Because the sentence is the basic textual unit immediately above the word and phrase, SBD is an essential problem for many applications of Natural Language Processing, including parsing, information extraction, machine translation, and document summarization. The accuracy of the SBD system directly affects the performance of these applications. However, past research in this field has already achieved very high performance, and the area is no longer very active. The problem seems too simple to attract the attention of researchers.
Keywords
Sentence Boundary Disambiguation, Maximum Entropy Model, Features, Generalized Iterative Scaling, Hidden Markov Model.
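The abstract does not list the eight binary features or the trained weights, so the following is only a minimal sketch of how a two-class Maximum Entropy decision over binary features might look. The feature names, the abbreviation list, and the hand-set weights are all hypothetical illustrations; in the system described above the weights would be estimated from the corpus (the keywords mention Generalized Iterative Scaling), not set by hand.

```python
import math

# Hypothetical abbreviation list; the abstract does not name the real features.
ABBREVIATIONS = {"mr", "mrs", "dr", "inc", "co", "corp"}

def binary_features(tokens, i):
    """Illustrative binary features for the candidate token tokens[i],
    which is assumed to end in a period."""
    word = tokens[i].rstrip(".")
    nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
    return {
        "word_is_abbreviation": word.lower() in ABBREVIATIONS,
        "word_is_single_letter": len(word) == 1 and word.isalpha(),
        "word_is_number": word.replace(",", "").isdigit(),
        "next_is_capitalized": nxt[:1].isupper(),
        "next_is_missing": nxt == "",
    }

def p_boundary(features, weights, bias=0.0):
    """Two-class maximum entropy model: with features attached only to the
    'boundary' outcome, exp(s) / (1 + exp(s)) reduces to the logistic function."""
    score = bias + sum(weights.get(name, 0.0)
                       for name, active in features.items() if active)
    return 1.0 / (1.0 + math.exp(-score))

# Hand-set weights for illustration only (assumption, not trained values).
WEIGHTS = {
    "word_is_abbreviation": -4.0,
    "word_is_single_letter": -2.0,
    "word_is_number": -1.0,
    "next_is_capitalized": 2.0,
    "next_is_missing": 3.0,
}

if __name__ == "__main__":
    tokens = "Mr. Smith left the company. He retired.".split()
    for i, tok in enumerate(tokens):
        if tok.endswith("."):
            p = p_boundary(binary_features(tokens, i), WEIGHTS)
            print(f"{tok!r}: P(boundary) = {p:.2f}")
```

With these illustrative weights, the period after "Mr." scores below 0.5 because the abbreviation feature outweighs the capitalized next word, while the periods after "company." and "retired." score above 0.5.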