
Mining Issues in Traditional Indian Web Documents


Affiliations
1 Faculty of Computing, Chirala Engineering College, Chirala - 523157, Andhra Pradesh, India
 

Recent developments in information technology are concentrated in areas where information, content creation and knowledge integration are the driving forces. Having begun by adapting to the complexities of internet and mobile communications, these developments are becoming significant sources of knowledge and expertise, and this is where countries like India and China play a major role. Indian tradition is considered to be more than 5000 years old, and evidence of it survives in written, oral and physical forms, such as the Mahabharata as text or Mohenjo-Daro and Harappa as structures. This study presents issues in extracting information from traditional Indian documents and a method of evaluating content, since the language, script and form of the web documents vary significantly. The method works at the pixel level to keep the approach generic; the study presents results for some basic issues at the text level and shows how the approach can be extended to the word and document levels.

Keywords

Attribute Generation, Data Mining, Data Preparation, Information Extraction, Tradition, Voxel

Authors

Kolla Bhanu Prakash


DOI: https://doi.org/10.17485/ijst%2F2015%2Fv8i32%2F122692