Collaborative Approach for Trend Analysis Using Clustering Mechanisms and Big Data Technologies
The rapid growth of technology and social media produces an enormous amount of data and opens a wider window for researchers to work on it. One critical analysis of such data is tracking how trends in it change over time. These days, massive volumes of data are generated and processed using Hadoop and its ecosystem tools, which enable fast and efficient computation over large datasets. In this paper, we combine a few popular clustering algorithms with big data technologies to analyze the usage of mobile phones and networks in various locations. We loaded and processed the dataset in Apache Hive to examine the number of users and the most prominent systems in given areas, based on their location codes. Further, we compared the time taken to build the clustered model on our framework with the time taken on the Weka tool, and observed that Weka takes comparatively longer to process the dataset. This analysis would not only help in managing and segregating a considerable amount of data but would also help mobile service providers understand customer usage patterns and network problems that may persist in some regions.
Keywords
Big Data, Clustering Methods, Machine Learning, Hive.
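The per-location aggregation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the dataset schema or the Hive queries used, so the record layout (`user_id`, `location_code`, `network`), the sample rows, and the function name below are all hypothetical. The sketch uses plain Python rather than Hive and only shows the kind of grouping involved, not the authors' implementation.

```python
from collections import Counter, defaultdict

# Hypothetical usage records: (user_id, location_code, network).
# The real dataset's columns are not stated in the abstract.
records = [
    ("u1", "LOC01", "NetA"),
    ("u2", "LOC01", "NetA"),
    ("u3", "LOC01", "NetB"),
    ("u4", "LOC02", "NetB"),
    ("u1", "LOC01", "NetA"),  # same user seen again at the same location
]


def summarize_by_location(rows):
    """Count distinct users and find the most frequent network per location code."""
    users = defaultdict(set)         # location_code -> set of user ids
    networks = defaultdict(Counter)  # location_code -> network usage counts
    for user_id, location_code, network in rows:
        users[location_code].add(user_id)
        networks[location_code][network] += 1
    return {
        loc: {
            "distinct_users": len(users[loc]),
            "top_network": networks[loc].most_common(1)[0][0],
        }
        for loc in users
    }


summary = summarize_by_location(records)
```

In this toy data, `summary["LOC01"]` reports three distinct users with `NetA` as the most frequent network. In a Hive setting the same grouping would typically be expressed as a `GROUP BY` over the location code column, which is again an assumption about how the paper's queries were written.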