
A Survey on Big Data Security: Issues, Challenges and Techniques


Affiliations
1 M. S. Ramaiah Institute of Technology, Bangalore, Karnataka, India
2 Dept. of Computer Science Applications, M. S. Ramaiah Institute of Technology, Bangalore, Karnataka, India
     



As the volume of big data grows day by day, its security has become one of the major concerns in today’s technology. Recent advances have created huge demand for big data, which has led to data being outsourced and to third parties handling business applications. This, in turn, has made the security and privacy of big data a pressing concern. This paper provides a comprehensive review of existing and proposed security and privacy issues in big data environments. The review identifies five attributes of security and privacy: confidentiality, integrity, availability, privacy-preservability, and accountability. The main focus is on security issues related to big data, Hadoop, the MapReduce framework, and HDFS.

Keywords

Big Data, Blockchain, Map Reduce, Privacy, Security.




Authors

Kimmi Kumari
M. S. Ramaiah Institute of Technology, Bangalore, Karnataka, India
M. Mrunalini
Dept. of Computer Science Applications, M. S. Ramaiah Institute of Technology, Bangalore, Karnataka, India
