Survey on Big Data and Machine Intelligence Tools
Data is growing at an exponential pace today, posing challenges in analysis, handling, and sharing. Choosing the right machine learning tools for such huge datasets is difficult, and each tool has its own limitations. Traditional tools fail to process huge datasets in real time. This paper is intended for readers who want to learn about machine intelligence tools and how they relate to big data analytics. We give an overview of each available tool, along with its latest versions and releases. We begin with an introduction to big data, Hadoop, and machine intelligence techniques. We then turn to the machine intelligence tools and the application areas in which they can be used. We discuss the key features of each tool and provide a comparative study of all the tools. The paper thus aims to help users decide which tool to choose.