Cyber Infrastructure as a Service to Empower Multidisciplinary, Data-Driven Scientific Research
To support large-scale, multidisciplinary scientific research across all of its university campuses, carried out by research personnel spread across the entire state, the state of Nevada needs to build and leverage its own cyber infrastructure. Following the well-established as-a-service model, this statewide cyber infrastructure consists of data acquisition, data storage, advanced instruments, visualization, computing and information processing systems, and people, all seamlessly linked through a high-speed network, and is designed and operated to deliver the benefits of Cyber Infrastructure-as-a-Service (CaaS). The CaaS comprises three major service groups: (i) supporting infrastructural services, which comprise sensors, computing/storage/networking hardware, the operating system, management tools, virtualization, and the message passing interface (MPI); (ii) data transmission and storage services, which provide connectivity to various big data sources as well as cached and stored datasets in a distributed storage backend; and (iii) processing and visualization services, which give users access to the rich processing and visualization tools and packages essential to various scientific research workflows. Built on commodity hardware and open source software packages, the Southern Nevada Research Cloud (SNRC) and a data repository at a separate location constitute a low-cost solution for delivering all of these CaaS services. The service-oriented architecture and implementation of the SNRC are geared to hide as much of the detail of big data processing and cloud computing as possible from end users; scientists only need to learn and use an interactive web-based interface to conduct their collaborative, multidisciplinary, data-intensive research. The capability and ease of use of the SNRC are demonstrated through a use case that attempts to derive a solar radiation model from a large data set by regression analysis.
Keywords
Cyber Infrastructure-as-a-Service, Cloud Computing, Big Data, MapReduce, Data-Driven Scientific Research
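The abstract does not state the form of the solar radiation model or how the regression is distributed, so the following is only a minimal sketch of one common way a least-squares fit can be phrased in a map/reduce style: each data chunk is mapped to a small tuple of sums, the tuples are added together, and the line is solved from the combined totals. The function names (`map_chunk`, `combine`, `fit_line`), the single-predictor model, and the toy data are all hypothetical and are not taken from the SNRC implementation.

```python
from functools import reduce


def map_chunk(chunk):
    """Map step: summarize one chunk of (x, y) pairs as (n, Σx, Σy, Σx², Σxy)."""
    n = len(chunk)
    sx = sum(x for x, _ in chunk)
    sy = sum(y for _, y in chunk)
    sxx = sum(x * x for x, _ in chunk)
    sxy = sum(x * y for x, y in chunk)
    return (n, sx, sy, sxx, sxy)


def combine(a, b):
    """Reduce step: the per-chunk sums are additive, so add them element-wise."""
    return tuple(u + v for u, v in zip(a, b))


def fit_line(stats):
    """Solve the ordinary least-squares line y = slope * x + intercept."""
    n, sx, sy, sxx, sxy = stats
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("slope is undefined: no data or all x values identical")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Hypothetical toy data lying exactly on y = 2x + 1, split into three chunks
# to stand in for blocks of a distributed data set.
chunks = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0)],
    [(4.0, 9.0)],
]

total = reduce(combine, map(map_chunk, chunks))
slope, intercept = fit_line(total)
print(slope, intercept)
```

Because only five numbers per chunk cross the network, this pattern scales with the number of chunks rather than the number of rows; a real solar radiation model would likely use several predictors and a matrix form of the same idea.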