Value Proposition and ETL Process in Big Data Environment
For any retail company, managing inventory is of prime importance. Every store should hold enough of each item to meet demand, so stores must be restocked before items run out. Restocked items arrive from a fulfillment center, which distributes them to the various stores, also called distribution centers. Because distribution centers and fulfillment centers are generally far apart, there is a delay between the restock request and the arrival of the items at the distribution center. To prevent out-of-stock conditions, the request must be timed to account for this transit time. The quantity requested also affects the request time: only a few units of a large item can be sent at once, so several transits may be needed to restock the required number. Other conditions, such as general traffic and seasonal climate variations, can also affect transit time. All of these conditions must be taken into account when deciding when to request an item. The proposed system decides the request time and quantity of items, accounting for these variations, by training on years of data. This allows the system to work more efficiently and to prevent out-of-stock conditions, thereby increasing the company's sales.
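The timing argument above can be illustrated with a small arithmetic sketch. This is not the proposed system, which learns request time and quantity from historical data. It is only a fixed-rule illustration, and every name and number in it (`reorder_point`, `transits_needed`, `delay_days`, `safety_stock`, the sample values) is a hypothetical assumption introduced here.

```python
import math


def transits_needed(restock_quantity: int, units_per_transit: int) -> int:
    """Number of transits required when only a few units fit in one transit.

    Hypothetical helper: large items limit units_per_transit, so the
    required quantity may need several trips.
    """
    return math.ceil(restock_quantity / units_per_transit)


def reorder_point(daily_demand: int, transit_days: int,
                  delay_days: int = 0, safety_stock: int = 0) -> int:
    """Stock level at which a restock request should be placed.

    Assumes constant daily demand. delay_days stands in for traffic or
    seasonal delays; in the proposed system such effects would be learned
    from data rather than supplied by hand.
    """
    lead_time_days = transit_days + delay_days
    # Stock must cover the demand that occurs while the items are in transit.
    return daily_demand * lead_time_days + safety_stock


# Example with assumed values: 20 units sold per day, 3 days in transit,
# 1 extra day of expected delay, and a buffer of 10 units.
point = reorder_point(daily_demand=20, transit_days=3, delay_days=1, safety_stock=10)
trips = transits_needed(restock_quantity=50, units_per_transit=12)
print(point, trips)
```

With these assumed values, the request would be placed once stock falls to 90 units, and restocking 50 units would take 5 transits.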
Keywords
Big Data, ETL Process, HDFS, SparkML, SparkSQL, Value Proposition.