Early identification of high-risk credit card customers based on behavioral data
Abstract
Credit card banking has long been one of the most profitable types of banking. The largest cost for credit card companies is customers not paying their debt. Consequently, accurately modeling the risk a customer poses can yield large savings for credit card companies.

This thesis aims to determine whether it is possible to identify high-risk credit card customers within the first months of the customer relationship. Using a credit card dataset consisting of customers' first 18 months of data, spanning January 2013 to April 2017, machine learning methods are used to develop classifiers that try to predict future delinquency. Where previous work has incorporated many months of data to predict delinquency, we use only data from the first and second month of the customer relationship to do the same. Through a number of experiments, several models are developed. In addition to predicting delinquency, the models are used to analyze the behavior driving delinquency and to model credit risk.

We find that the models cannot accurately identify high-risk customers based on only a few months of data. The models developed reveal that the factors driving delinquency are mostly intuitive. Using the developed models to predict the probability of delinquency, we find a strong correlation between the predicted probabilities and the realized frequencies of delinquency.