
A Novel Method for Clustering Words in Micro-Blogs Texts and its Application to Event Discovery


Affiliations
1 Department of CSE, Sanketika Institute of Technology and Management (SITAM), Visakhapatnam, India
     



This paper presents a novel method for clustering words in micro-blogs, based on the similarity of their temporal series. Our method, named SAX*, uses the Symbolic Aggregate ApproXimation algorithm to discretize the temporal series of each term into a small set of levels, yielding one string per term. We then define a subset of "interesting" strings, i.e. those representing patterns of collective attention. Sliding temporal windows are used to detect co-occurring clusters of tokens with the same or a similar string. To assess the performance of the method, we first tune the model parameters on a 2-month 1% Twitter stream, during which a number of worldwide events of differing type and duration (sports, politics, disasters, health, and celebrities) occurred. We then evaluate the quality of all discovered events in a 1-year stream by "googling" the most frequent cluster n-grams and manually assessing how many clusters correspond to news published in the same temporal slot. Finally, we perform a complexity evaluation and compare SAX* with three alternative methods for event discovery. Our evaluation shows that SAX* is at least one order of magnitude less complex than other temporal and non-temporal approaches to micro-blog clustering.
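The discretization step can be illustrated with a minimal sketch of standard Symbolic Aggregate ApproXimation, followed by a grouping of tokens that share the same string. This is not the authors' SAX* implementation: the function names `sax_string` and `cluster_by_sax`, the default alphabet, the handling of flat series, and the sample counts in the usage note are assumptions made for illustration, and the filtering of "interesting" strings and the sliding windows are left out.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import NormalDist, mean, pstdev


def sax_string(series, n_segments, alphabet="abc"):
    """Discretize a numeric series into a SAX string of n_segments symbols."""
    if not 1 <= n_segments <= len(series):
        raise ValueError("n_segments must be between 1 and len(series)")

    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        # Assumption: a flat series is mapped to the middle symbol throughout.
        return alphabet[len(alphabet) // 2] * n_segments

    # 1. Z-normalize so that series of different volume become comparable.
    z = [(x - mu) / sigma for x in series]

    # 2. Piecewise Aggregate Approximation: average each of n_segments chunks.
    n = len(z)
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    paa = [mean(z[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]

    # 3. Breakpoints that split a standard normal into equiprobable regions,
    #    one region per alphabet symbol.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # 4. Map each segment average to the symbol of the region it falls in.
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


def cluster_by_sax(token_series, n_segments, alphabet="abc"):
    """Group tokens whose series map to the same SAX string (one window)."""
    clusters = defaultdict(list)
    for token, series in token_series.items():
        clusters[sax_string(series, n_segments, alphabet)].append(token)
    return dict(clusters)
```

With a two-symbol alphabet and four segments, two hypothetical tokens whose counts spike in the same interval (say `[0, 0, 0, 0, 10, 10, 0, 0]` and `[1, 1, 1, 1, 20, 22, 1, 1]`) receive the same string and fall into one cluster, even though their absolute frequencies differ.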


Keywords

Event Detection, Temporal Mining, Symbolic Aggregate Approximation, Micro-Blog Analysis.

Authors

B. Ramana Babu
Department of CSE, Sanketika Institute of Technology and Management (SITAM), Visakhapatnam, India
