Statistical modelling question
Discussion
First, I apologise if this is the wrong place to post this question. I did consider the "finance" section, but most of those questions pertained to personal finance.
I'm normally a "people manager" (although really I'm just a subject matter expert for a team) but I've been asked to look at a small fixed rate residential loan portfolio with a consistently very high churn rate. Unfortunately it's come to light that little consideration had been given to managing the customer churn aspect of the portfolio, which we know is poor by external measures (acquisition is excellent, retention is terrible). We have a range of historic data: all the usual loan data (e.g. amounts, repayment rates etc.) and a few peripherals, such as whether early exit fees have previously been calculated, whether discounts and incentives were offered previously, whether a client has been retained using incentives, monthly reporting on how many customers have refinanced to other providers etc.
Every month we generate a report of fixed rate agreements which will be expiring, but those customers are infrequently contacted. What I wanted to do was look towards building a simple model so that we could isolate residential loans coming due that have a higher probability of being refinanced with another financial services provider.
What sort of modelling do you think I should be looking into initially? While I studied a STEM subject at university, it was a few years ago now and I did not take higher level statistics courses, so your assistance and/or thoughts would be very much appreciated.
You could look at a multivariate analysis (partial least squares, principal component analysis, multivariate curve resolution) where you look to find a model that takes all your input data and predicts the refinancing probability. The value of this depends on whether there is indeed any relation, and on how large and accurate your historical data is. Alternatively, you could use a machine learning method, such as a neural network.
Neural networks seem to be 'de rigueur' at the moment. This document seems to give a good overview of stuff that should be relevant:
https://www.cc.gatech.edu/~isbell/tutorials/rbf-in...
The intro highlights some search terms that you could chuck into google scholar (i.e. different names for doing the same thing), and there seem to be some good examples that should help to apply RBF networks to your data.
Logistic regression is probably the best approach: it is well known and it models a binary outcome (churned or retained), which makes it a good fit for churn modelling.
If you work in finance, it’s also the approach commonly used to forecast delinquent loans, so you may have some SMEs in the business already.
There are more glamorous techniques, but logistic regression is probably the most understood in this context.
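To make the idea concrete, here is a rough sketch of logistic regression for churn scoring, written in plain Python so it runs without any extra packages. Everything in it is an assumption for illustration: the two features (`rate_gap`, the customer's rate minus the market rate, and `contacted`, whether they were contacted before expiry), the synthetic data and the loan names are all made up, and in practice you would more likely fit the model with a statistics package on your real historic data.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function: maps any real number into (0, 1).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, x):
    # w[0] is the intercept; w[1:] are the feature coefficients.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit coefficients by batch gradient descent on the average log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Synthetic history (hypothetical): churn is more likely when the rate gap
# is large and less likely when the customer was contacted before expiry.
random.seed(0)


def make_loan():
    rate_gap = random.uniform(-1.0, 2.0)   # percentage points
    contacted = random.choice([0, 1])      # 1 = contacted before expiry
    true_logit = -0.5 + 1.5 * rate_gap - 1.0 * contacted
    churned = 1 if random.random() < sigmoid(true_logit) else 0
    return [rate_gap, contacted], churned


history = [make_loan() for _ in range(500)]
X = [features for features, _ in history]
y = [churned for _, churned in history]

w = fit_logistic(X, y)

# Score this month's expiring agreements and rank them, highest risk first.
expiring = {"loan_A": [1.8, 0], "loan_B": [-0.5, 1]}
scores = {name: predict_proba(w, feats) for name, feats in expiring.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

The useful part for the monthly report is the last few lines: each expiring agreement gets a probability between 0 and 1, so the list can be sorted and the riskiest customers contacted first. The sign and size of each coefficient also give a rough indication of which factors push churn up or down, which is a large part of why this method is easy to explain internally.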
speedy_thrills said:
Logistic regression does look like a good start. After reading a few case studies, it looks like kNN is often the next step on from that.
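For comparison, kNN (k-nearest neighbours) can be sketched in a few lines: to score a loan, find the k most similar historic loans and see what share of them churned. This is only an illustration under stated assumptions. The function name `knn_churn_share`, the two features (rate gap and a contacted flag) and the handful of historic rows are all hypothetical, and real features would need scaling to comparable ranges before distances mean anything.

```python
import math


def knn_churn_share(history, query, k=3):
    """Share of the k nearest historic loans (Euclidean distance) that churned."""
    nearest = sorted(history, key=lambda row: math.dist(row[0], query))[:k]
    return sum(churned for _, churned in nearest) / k


# Made-up history: ([rate_gap, contacted], churned)
history = [
    ([0.1, 0.0], 0),
    ([0.2, 1.0], 0),
    ([0.3, 0.0], 0),
    ([1.2, 1.0], 0),
    ([1.5, 0.0], 1),
    ([1.6, 1.0], 1),
    ([1.8, 0.0], 1),
]

high_gap = knn_churn_share(history, [1.7, 0.0])  # neighbours mostly churned
low_gap = knn_churn_share(history, [0.2, 0.0])   # neighbours mostly stayed
print(high_gap, low_gap)
```

Unlike logistic regression, kNN gives no coefficients to interpret, so it tends to be harder to explain why a particular loan was flagged.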
Take your pick, there is a lot of software out there: https://en.wikipedia.org/wiki/List_of_statistical_...