
Mining of Text Data with the Application of Side Information-A Review


Affiliations
1 Computer Science and Engineering, W.C.O.E.M., Nagpur, India
     



In text mining, side information often accompanies text documents, while much of the data in those documents consists of unstructured text. Such side information includes user-access behavior from web logs, document origin information, links in the document, and other non-textual attributes embedded in the text. These attributes could play a significant role in the clustering process. However, some of this information may be noisy, which makes it harder to estimate the importance of the side information. In such conditions it is risky to use side information in the mining process, because it may either improve the mining process or add noise to it. Therefore, to maximize the benefit of using side information in mining text data, we need a principled way to perform the mining process.

Authors

Garima Singh
Computer Science and Engineering, W.C.O.E.M., Nagpur, India
Neha Tiwari
Computer Science and Engineering, W.C.O.E.M., Nagpur, India
