
Collaborative Approach for Trend Analysis Using Clustering Mechanisms and Big Data Technologies


Affiliations
1 Division of Computer Engineering, Netaji Subhas Institute of Technology, India
     



Abstract

The rapid growth of technology and social media produces enormous amounts of data and opens a wider window for researchers to work with it. One critical analysis of such data is to track its changing trends. Today, massive volumes of data are generated and processed using Hadoop and its ecosystem tools, which enable fast and efficient computation on large datasets. In this paper, we combine a few popular clustering algorithms with big data technologies to analyze the usage of mobile phones and networks in various locations. We loaded and processed the dataset in Apache Hive to examine the number of users and the most prominent systems in given areas, based on their location codes. We then compared the time taken to build the clustered model on our framework with the time taken on the Weka tool, and observed that Weka takes comparatively longer to process the dataset. This analysis would not only help in managing and segregating large amounts of data but would also help mobile service providers understand customers' usage patterns and the network problems that may persist in some regions.

Keywords

Big Data, Clustering Methods, Machine Learning, Hive.
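
The location-based aggregation mentioned in the abstract (number of users and most prominent system per location code) can be illustrated with a minimal Python sketch. This is not the authors' Hive workflow: the record layout, the field names (`user_id`, `location_code`, `system`), and the sample values are hypothetical, chosen only to show the grouping idea that a Hive query with a `GROUP BY` on the location code would express.

```python
from collections import Counter, defaultdict

# Hypothetical usage records: (user_id, location_code, system).
# These values are invented for illustration; they are not from the paper's dataset.
records = [
    ("u1", "110001", "SystemA"),
    ("u2", "110001", "SystemB"),
    ("u3", "110001", "SystemA"),
    ("u4", "400001", "SystemB"),
    ("u1", "110001", "SystemA"),  # the same user may appear more than once
]


def summarize_by_location(rows):
    """Return, per location code, the distinct user count and the most frequent system."""
    users = defaultdict(set)        # location_code -> set of user ids
    systems = defaultdict(Counter)  # location_code -> Counter of system names
    for user_id, location_code, system in rows:
        users[location_code].add(user_id)
        systems[location_code][system] += 1
    return {
        loc: {
            "users": len(users[loc]),
            "top_system": systems[loc].most_common(1)[0][0],
        }
        for loc in users
    }


summary = summarize_by_location(records)
for loc, stats in sorted(summary.items()):
    print(loc, stats["users"], stats["top_system"])
```

Counting distinct users with a set keeps repeated records for the same user from inflating the per-location total, while the system tally counts every record.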



Authors

Shefali Arora
Division of Computer Engineering, Netaji Subhas Institute of Technology, India
Ruchi Mittal
Division of Computer Engineering, Netaji Subhas Institute of Technology, India
M.P.S Bhatia
Division of Computer Engineering, Netaji Subhas Institute of Technology, India

