
Rank-Frequency Analysis of Characters in Garhwali Text: Emergence of Zipf's Law


Affiliations
1 College of Forestry, VCSG Uttarakhand University of Horticulture and Forestry, Ranichauri, Tehri Garhwal 249 199, India
2 Ramanujan College, University of Delhi, New Delhi, India
 

Zipf's law, which relates the frequency of a character or word to its rank, is ubiquitous in language systems. The present study shows that the distribution of character frequencies in the Garhwali language follows the Zipf-Mandelbrot law. Garhwali is an Indo-Aryan language spoken in the Garhwal region of Uttarakhand, India, in the northwestern Himalayan belt. This communication examines the rank-frequency distribution using a generalization of the Zipf-Mandelbrot law for Garhwali, a language with limited dictionary size. The study shows that, for a continuous Garhwali corpus, the character frequencies of consonants (with matras), of vowels (including vowels attached to consonants in the form of matras) and of all characters (including vowels and consonants without matras) each follow the Zipf-Mandelbrot law.
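The kind of rank-frequency analysis described above can be illustrated with a minimal sketch. It assumes the commonly used form of the Zipf-Mandelbrot law, f(r) = C / (r + b)^a, where r is the rank and a, b and C are fitted parameters. The function names, the grid over b and the log-space least-squares fit are illustrative assumptions, not the authors' procedure. The sketch counts Unicode code points, so Devanagari matras are counted as separate symbols rather than grouped into the character classes used in the study.

```python
from collections import Counter
import math


def rank_frequency(text):
    """Return symbol counts sorted in descending order (rank 1 first).

    Assumption: every non-whitespace Unicode code point is one symbol.
    """
    counts = Counter(ch for ch in text if not ch.isspace())
    return sorted(counts.values(), reverse=True)


def fit_zipf_mandelbrot(freqs, b_grid=None):
    """Fit f(r) = C / (r + b)**a by least squares in log-log space.

    For a fixed shift b, log f = log C - a * log(r + b) is linear in
    log(r + b), so a and C follow from ordinary linear regression.
    The b on the (hypothetical) grid with the smallest squared
    residual is kept. Requires at least two positive frequencies.
    """
    if b_grid is None:
        b_grid = [i * 0.1 for i in range(0, 101)]  # b in [0, 10]
    ys = [math.log(f) for f in freqs]
    n = len(ys)
    best = None
    for b in b_grid:
        xs = [math.log(r + b) for r in range(1, n + 1)]
        mx, my = sum(xs) / n, sum(ys) / n
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        slope = sxy / sxx
        intercept = my - slope * mx
        sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
        if best is None or sse < best[0]:
            best = (sse, -slope, b, math.exp(intercept))
    _, a, b, c = best
    return a, b, c
```

In use, `rank_frequency` would be applied to the corpus text and its output passed to `fit_zipf_mandelbrot`; setting b = 0 reduces the fitted form to the plain Zipf law. Fitting in log space weights low-frequency ranks more heavily than a fit on raw counts would, so the recovered parameters can differ from those of other estimation methods.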

Keywords

Garhwali, Frequency, Rank, Zipf’s Law, Zipf–Mandelbrot Law.







Authors

Manoj Kumar Riyal
College of Forestry, VCSG Uttarakhand University of Horticulture and Forestry, Ranichauri, Tehri Garhwal 249 199, India
Nikhil Kumar Rajput
Ramanujan College, University of Delhi, New Delhi, India
Vinod Prasad Khanduri
College of Forestry, VCSG Uttarakhand University of Horticulture and Forestry, Ranichauri, Tehri Garhwal 249 199, India
Laxmi Rawat
College of Forestry, VCSG Uttarakhand University of Horticulture and Forestry, Ranichauri, Tehri Garhwal 249 199, India







DOI: https://doi.org/10.18520/cs%2Fv110%2Fi3%2F429-434