Pirkinių krepšelio analizė ir vartotojų pasitenkinimo prognozavimas

Marius Vadeika

Title	Pirkinių krepšelio analizė ir vartotojų pasitenkinimo prognozavimas
Translation of Title	Market basket analysis and prediction of customer satisfaction.
Authors	Vadeika, Marius
Pages	63
Keywords [eng]	association rules ; imbalanced classes data set ; decision trees ; machine learning algorithms
Abstract [eng]	Market basket analysis matters to companies that sell a wide range of products, because it informs product placement, recommendations, and marketing. For companies that aim for long-term relationships with their clients, it is also important to create an exclusive customer experience. The aim of this study is to find customer behavioral patterns by analyzing customer orders and to predict customer satisfaction by building models that are as accurate as possible. Two tasks, both using Kaggle competition data, are solved: association rule analysis (Instacart data) and imbalanced classification (Santander customer satisfaction data). For market basket analysis, association rules were discovered with the Apriori algorithm by setting support and confidence thresholds. Rules were found both for individual products and for product categories. It was concluded that finding interesting rules requires lower support values and higher lift and conviction values. Rule selection takes time because it is not clear in advance which support and confidence thresholds to choose. After the interesting rules were selected, it was concluded that association rules for product categories are more useful for product placement and abstract offers, whereas strong association rules for products can be useful for suggesting specific products to clients. When the Santander bank customer satisfaction data was studied with an anomaly detection algorithm, unsatisfied customers turned out not to differ much from regular customers, which makes them hard to predict. Feature selection was performed in the data preparation step, and 31% of the variables were removed without loss of classification accuracy. Resampling methods were then tested, and it was concluded that classification accuracy can be increased by reducing class imbalance. Customer satisfaction was predicted with three popular machine learning algorithms based on decision trees. From the three models a single majority-vote model was built, which correctly classified 42% of unsatisfied customers.
Dissertation Institution	Kauno technologijos universitetas.
Type	Master thesis
Language	Lithuanian
Publication date	2018

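The abstract names four rule-quality measures (support, confidence, lift and conviction) without defining them. The sketch below computes them for a single rule using their standard textbook definitions. It is an illustration only, not the thesis code: the function name `rule_metrics` and the five toy baskets are assumptions made up for this example, and no Instacart data is used.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return support, confidence, lift and conviction for antecedent -> consequent.

    `transactions` is a list of sets of item names (hypothetical toy data here).
    """
    a = frozenset(antecedent)
    c = frozenset(consequent)
    n = len(transactions)

    # Count the baskets that contain each side of the rule, and both sides together.
    count_a = sum(1 for t in transactions if a <= t)
    count_c = sum(1 for t in transactions if c <= t)
    count_ac = sum(1 for t in transactions if (a | c) <= t)

    if count_a == 0 or count_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")

    support = count_ac / n              # share of baskets containing the whole rule
    confidence = count_ac / count_a     # estimate of P(consequent | antecedent)
    support_c = count_c / n
    lift = confidence / support_c       # above 1 suggests a positive association
    # Conviction is unbounded when the rule never fails (confidence == 1).
    conviction = float("inf") if confidence == 1 else (1 - support_c) / (1 - confidence)

    return {
        "support": support,
        "confidence": confidence,
        "lift": lift,
        "conviction": conviction,
    }


# Five made-up baskets, used only to exercise the function.
baskets = [
    {"bananas", "milk"},
    {"bananas", "milk", "bread"},
    {"bananas"},
    {"milk", "bread"},
    {"bread"},
]

metrics = rule_metrics(baskets, {"bananas"}, {"milk"})
print(metrics)
```

An Apriori-style search would evaluate many candidate rules this way and keep those that pass the chosen support and confidence thresholds. Lift and conviction can then be used to rank the surviving rules by how interesting they are.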