Quantitative Analysis of English Corpus in Tourism and Health Domain
Statistical analysis of a language is an essential part of any natural language processing activity, whether it is translation, transliteration, summarization, lexicon formation, keyboard design, or another task. In this paper, a domain-specific (health and tourism) English corpus provided by Computational Linguistic R & D at the Special Centre for Sanskrit Studies, J.N.U, is analyzed statistically. Frequency analysis and word length analysis of the English text are performed, and unigram, bigram, trigram, and positional analyses of words are studied.
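The n-gram frequency and word length counts named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer (lowercased alphabetic runs), the function names, and the two-sentence sample text are all assumptions made for illustration, and the real corpus is not reproduced here.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def ngram_counts(tokens, n):
    """Count contiguous word n-grams, each represented as a tuple of n tokens."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


def word_length_counts(tokens):
    """Count how many tokens have each character length."""
    return Counter(len(token) for token in tokens)


# Hypothetical sample text standing in for a health-domain corpus.
sample = "Drink clean water. Clean water prevents disease."
tokens = tokenize(sample)

unigrams = ngram_counts(tokens, 1)
bigrams = ngram_counts(tokens, 2)
trigrams = ngram_counts(tokens, 3)
lengths = word_length_counts(tokens)

print(unigrams.most_common(2))
print(bigrams.most_common(1))
print(sorted(lengths.items()))
```

On a real corpus the same counters would be built over all tokens, and `most_common(k)` would give the top-k unigrams, bigrams, or trigrams by frequency.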
Keywords
Corpus, English, Statistical Analysis, Quantitative Analysis, Unigram, Bigram, Trigram.