Microfinance and Machine Learning: A Study of Loan Classification and Risk Management

Wolfson, Ben
Senior thesis


Wang, Mengdi
Princeton University. Department of Operations Research and Financial Engineering
Class year
Summary note
The purpose of this paper is to analyze microfinancial data using machine learning techniques. We seek to understand whether modern data analysis is a useful method to apply to microfinancial data, which is frequently sparse and varied. We examine two data sets: the Lending Club data set of microfinance loans in the United States from 2013-2016 and a data set from FINCA Georgia. Specifically, we attempt to classify loans as defaulted or paid. We find Random Forest classifiers to be the most useful, and that lexical analysis can also prove helpful in classifying loans. We also find idiosyncrasies in the different data sets that explain the variety of classifier recommendations in the literature. Finally, we conclude that applying machine learning to microfinance data can be useful for detecting riskier loans.

Supplementary Information