
Data Mining Model Performance of Sales Predictive Algorithms Based on Rapidminer Workflows


Affiliations
1 Dyrecta Lab, IT research Laboratory, via Vescovo Simplicio, 45, 70014 Conversano (BA), Italy
 

A dataset assembled from several data files, containing three years of sales information for a large chain of retail stores, was processed using RapidMiner workflows. A Deep Learning model was then built to implement a predictive algorithm suitable for sales forecasting. The model is based on an artificial neural network (ANN) that learns from historical sales data after pre-processing. The best model combines a multilayer neural network with an “optimized operator” that automatically finds the best parameter settings for the implemented algorithm. To identify the best-performing predictive model, other machine learning algorithms were also tested: the comparison covered Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Gradient Boosted Trees, Decision Trees, and Deep Learning algorithms. Comparing the correlation between real and predicted values, the average absolute error, and the average relative error showed that the ANN performed best, with Gradient Boosted Trees as the second-best alternative. The case study was developed within an industry project aimed at integrating high-performance data mining models able to predict sales using enterprise resource planning (ERP) and customer relationship management (CRM) tools.

Keywords

RapidMiner, Neural Network, Deep Learning, Gradient Boosted Trees, Data Mining Performance, Sales Prediction.
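
The three comparison criteria named in the abstract (correlation between real and predicted values, average absolute error, and average relative error) can be illustrated with a minimal Python sketch. This is not the authors' RapidMiner workflow: the function names, the `model_a`/`model_b` labels, and the toy numbers are assumptions made up for illustration, and the relative error is assumed to be the absolute error divided by the real value.

```python
import math


def pearson_correlation(actual, predicted):
    """Pearson correlation between real and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def mean_absolute_error(actual, predicted):
    """Average of |real - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_relative_error(actual, predicted):
    """Average of |real - predicted| / |real| (assumes no real value is zero)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def rank_models(actual, predictions_by_model):
    """Score each model on the three criteria and sort by absolute error."""
    scores = {
        name: {
            "correlation": pearson_correlation(actual, predicted),
            "absolute_error": mean_absolute_error(actual, predicted),
            "relative_error": mean_relative_error(actual, predicted),
        }
        for name, predicted in predictions_by_model.items()
    }
    return sorted(scores.items(), key=lambda item: item[1]["absolute_error"])


# Toy data, invented for illustration only (not results from the paper).
actual_sales = [100.0, 120.0, 80.0, 150.0]
toy_predictions = {
    "model_a": [102.0, 118.0, 83.0, 149.0],
    "model_b": [90.0, 130.0, 95.0, 140.0],
}

for name, metrics in rank_models(actual_sales, toy_predictions):
    print(name, metrics)
```

Sorting by absolute error is only one possible ranking rule; a model comparison like the one described would normally look at all three criteria together.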



Authors

Alessandro Massaro
Dyrecta Lab, IT research Laboratory, via Vescovo Simplicio, 45, 70014 Conversano (BA), Italy
Vincenzo Maritati
Dyrecta Lab, IT research Laboratory, via Vescovo Simplicio, 45, 70014 Conversano (BA), Italy
Angelo Galiano
Dyrecta Lab, IT research Laboratory, via Vescovo Simplicio, 45, 70014 Conversano (BA), Italy

