An Overview of the Use of Data Mining and Linguistics Techniques for Building Microblog-Based Early Detection Systems in the Healthcare Sector
Online Social Networks (OSNs) such as Facebook and Twitter are increasingly used to exchange and disseminate news and information in real time. Twitter in particular allows short messages, in the form of microblogs, to be disseminated instantly to followers. This survey reviews the literature to examine how OSNs, such as the microblogging tool Twitter, can help detect spreading epidemics. The paper highlights significant Natural Language Processing (NLP) challenges that arise in microblog-based early disease detection systems. For instance, microblogging data is an unstructured collection of short messages (140 characters on Twitter) containing noise and non-standard English. Hence, current research draws on linguistics to determine the semantics of the text and uses data mining techniques to extract information useful for detecting disease spread. Furthermore, the survey discusses applications and existing early disease detection systems based on OSNs and outlines directions for future research on improving such systems through a combination of linguistic methods, data mining techniques and recommendation systems.
Keywords
Data Mining, Social Networks, Healthcare.