Prediction of the Shoppers Loyalty with Aggregated Data Streams

Vladimir Nikulin 1
  • 1 Department of Mathematical Methods in Economy, Vyatka State University, Kirov, Russia


Consumer brands often offer discounts to attract new shoppers to their products. The most valuable customers are those who return after this initial incentivized purchase. Given enough purchase history, it is possible to predict which shoppers, when presented with an offer, will buy a new item. When dealing with Big Data, and with data streams in particular, it is common practice to summarize or aggregate customers' transaction histories over periods of a few months. As a result, we compress the huge volume of data and transform the data stream into a standard rectangular format. Consequently, we can explore a variety of practically or theoretically motivated tasks. For example, we can rank the given set of customers according to their loyalty or intention to repurchase in the near future. This objective has an important practical application: it leads to preferential treatment of the right customers. We tested our model (with competitive results) online during the Kaggle-based Acquire Valued Shoppers Challenge in 2014.
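The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes that "rectangular format" means one fixed-length row per customer, and that each row holds purchase counts and total spend over a few look-back windows before the offer date. The function names (`aggregate_transactions`, `rank_customers`), the window lengths, and the scoring rule are all hypothetical and chosen only for illustration; the abstract does not specify the actual features or model.

```python
from datetime import date


def aggregate_transactions(transactions, offer_dates, window_days=(30, 90, 180)):
    """Collapse each customer's transaction stream into one fixed-length row.

    transactions: iterable of (customer_id, purchase_date, amount) tuples.
    offer_dates:  dict mapping customer_id -> date the offer was presented.
    Returns a dict customer_id -> [count_w1, spend_w1, count_w2, spend_w2, ...],
    one (count, spend) pair per look-back window (hypothetical feature layout).
    """
    rows = {cid: [0, 0.0] * len(window_days) for cid in offer_dates}
    for cid, day, amount in transactions:
        if cid not in offer_dates:
            continue
        age = (offer_dates[cid] - day).days
        if age <= 0:
            continue  # skip purchases on or after the offer date (avoids leakage)
        for i, window in enumerate(window_days):
            if age <= window:
                rows[cid][2 * i] += 1
                rows[cid][2 * i + 1] += amount
    return rows


def rank_customers(rows, score):
    """Order customer ids from highest to lowest score."""
    return sorted(rows, key=lambda cid: score(rows[cid]), reverse=True)


def recency_weighted_count(row):
    """Stand-in loyalty score (an assumption): recent purchases weigh more."""
    counts = row[0::2]  # purchase counts for the 30-, 90-, 180-day windows
    return 3 * counts[0] + 2 * counts[1] + counts[2]


# Toy data, invented for this example only.
transactions = [
    ("a", date(2014, 3, 1), 10.0),
    ("a", date(2014, 1, 5), 5.0),
    ("b", date(2013, 10, 1), 20.0),
    ("b", date(2014, 3, 25), 7.0),  # after the offer date, so it is ignored
]
offer_dates = {"a": date(2014, 3, 20), "b": date(2014, 3, 20)}

rows = aggregate_transactions(transactions, offer_dates)
ranking = rank_customers(rows, recency_weighted_count)
```

In practice the hand-written score would be replaced by a trained model that takes the aggregated rows as input; the point of the sketch is only that, once the stream is compressed into fixed-length rows, ranking becomes an ordinary tabular problem.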



