Predicting E-commerce Consumer Behaviour Using Sparse Session Data
This thesis studies consumer behaviour in an e-commerce domain using a data set of sparse session data collected from an anonymous European e-commerce site. The goal is to predict whether a consumer session results in a purchase and, if so, which items are purchased. The data is supplied by the ACM Recommender System Challenge, a yearly challenge held by the ACM Recommender System Conference. Classification is used to predict whether a session ends in a purchase, as well as which items are bought. Several characteristics of the data are analysed to discover what separates a buy-session from the rest. In addition, the interactions with items are analysed to see which items a given buy-session is likely to purchase. The data is in a rather general format containing only a session ID, the ID of the item interacted with, a timestamp, and the category of the item, so the analysis can be applied to other e-commerce sites and domains. Observations from the analysis are used to extract features and to provide other valuable information for the classification. The following classification algorithms are evaluated: Random Forest, Logistic Regression, Decision Tree, Bayesian Network and Naive Bayes. It is shown that a session's behaviour can be predicted using classification. Which items the session interacted with and when the interactions occurred proved to be important factors. The findings may contribute towards improving implicit ratings in recommender systems, or provide useful information for recommender systems when only session data is available.
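As a rough illustration of how session-level features might be derived from the four fields described above, the following minimal sketch aggregates raw click records into one feature dictionary per session. The sample records, the function name `session_features`, and the particular features chosen are all hypothetical and are not taken from the thesis; the sketch only assumes the general record layout of session ID, item ID, timestamp, and category.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical click records: (session_id, item_id, ISO timestamp, category).
# These values are invented for illustration and are not from the data set.
clicks = [
    (1, 101, "2014-04-07T10:51:09", "0"),
    (1, 102, "2014-04-07T10:54:09", "0"),
    (1, 101, "2014-04-07T10:57:00", "0"),
    (2, 201, "2014-04-07T13:56:37", "S"),
]


def session_features(records):
    """Aggregate raw clicks into one feature dict per session."""
    by_session = defaultdict(list)
    for session_id, item_id, ts, category in records:
        by_session[session_id].append((datetime.fromisoformat(ts), item_id, category))

    features = {}
    for session_id, events in by_session.items():
        events.sort(key=lambda e: e[0])  # order clicks by time
        first, last = events[0][0], events[-1][0]
        items = [item for _, item, _ in events]
        features[session_id] = {
            "n_clicks": len(events),
            "n_unique_items": len(set(items)),
            "duration_s": (last - first).total_seconds(),
            "start_hour": first.hour,
            # Repeated views of one item, an assumed signal of interest.
            "max_clicks_on_one_item": max(items.count(i) for i in set(items)),
        }
    return features


features = session_features(clicks)
```

Feature dictionaries of this kind could then be paired with a buy/no-buy label per session and passed to any of the classifiers listed above; which features actually matter is a question the thesis itself addresses.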