A good credit score benefits customers in many ways, and it also lets banks assess their clients and extend credit to them accordingly. In this paper, we examine whether data mining techniques can predict and classify a customer’s credit score (good/bad), so that banks can reduce the future risk of lending to clients who cannot repay. We build general predictive models on a bank’s historical dataset; banks can use these models to improve the outcome of their overall credit system. For example, if the predictive classification models assign a customer a bad credit score, the bank can decline further credit to that customer and promptly review its other risky credits.
As countries continue to develop, it can be perilous for banks to extend credit to all of their customers without knowing whether they will be able to repay it on time. This is a major concern for banks today, because the total loss can add up to an enormous amount if customers do not repay their loans with interest on time, potentially leading to bankruptcy.
Jafarpour et al. focused on customer relationship management using an Iranian bank’s dataset. Their customer relationship management model interprets the relation between customers’ needs and banks through various channels, and they devised an equation that banks and loan firms can use to predict loan customers.
Hsu et al. applied a support vector machine (SVM) to classify a bank credit dataset and concluded that SVM accuracy increases with the number of data samples or when other feature selection techniques are applied, which makes it more useful for credit rating.
Turkson et al. applied supervised and unsupervised machine learning algorithms to a bank credit dataset to predict creditworthiness; some algorithms reached a prediction accuracy of up to 80%.
Moro et al. examined a Portuguese retail bank’s telemarketing dataset and used neural network models to predict telemarketing success.
To reduce the risk of loan default, we propose several data mining models. These models help identify a customer’s ability to repay credit loans on time by using credit scoring and classifying each customer as either ‘good credit’ or ‘bad credit’. A ‘good credit’ customer has a good score and no faulty or defaulted past credit records; a ‘bad credit’ customer has a bad score and may have faulty past records. Banks can then extend credit to good customers, which will ultimately contribute to their annual revenue.
We propose the K-Means algorithm, which has a loose relationship to the k-nearest neighbor classifier, a popular machine learning technique for classification that is often confused with k-means because of the k in the name. One can apply the 1-nearest neighbor classifier to the cluster centers obtained by k-means to classify new data into the existing clusters; this is known as the nearest centroid classifier. We also propose a collaborative filtering algorithm to separate good and bad customers based on their loan status, annual income, and current credit balance.
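The nearest centroid idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the system's actual implementation: it is written in Python for brevity, the function names `k_means` and `nearest_centroid` are hypothetical, and the customer records (annual income and current credit balance, both in thousands) are invented toy values rather than data from the bank's dataset.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid(point, centroids):
    """Index of the closest center (a 1-nearest neighbor search over the centers)."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def k_means(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns the list of k cluster centers."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assignment step: group every point with its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Update step: move each center to the mean of its members.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(old)  # leave an empty cluster's center as is
        if new_centroids == centroids:
            break  # assignments are stable, so the centers have converged
        centroids = new_centroids
    return centroids


# Hypothetical toy data: [annual income, current credit balance], in thousands.
customers = [
    [20.0, 18.0], [22.0, 20.0], [25.0, 19.0],  # lower income, higher balance
    [80.0, 5.0], [85.0, 4.0], [90.0, 6.0],     # higher income, lower balance
]

centers = k_means(customers, k=2)

# Classify a new applicant by assigning them to the nearest existing center.
new_customer = [83.0, 5.5]
cluster_id = nearest_centroid(new_customer, centers)
print("centers:", centers)
print("new customer falls in cluster", cluster_id)
```

The clusters produced this way carry no labels of their own. Deciding which cluster corresponds to ‘good credit’ and which to ‘bad credit’ would still require inspecting the historical loan status of the customers in each cluster. On real data, features with very different ranges would usually be scaled before clustering so that no single feature dominates the distance.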
Hence, K-Means and collaborative filtering are among the most useful algorithms for classifying non-categorical data.
After applying these classification techniques, we found that collaborative filtering was the best algorithm for classifying risky credit.
System : Pentium IV 2.4 GHz
Hard Disk : 40 GB
Floppy Drive : 1.44 MB
Monitor : 15" VGA Colour
Mouse : Logitech
RAM : 512 MB
Operating System : Windows XP
Coding Language : PHP
Database : MySQL
Yu Jin and Yudan Zhu, “A data-driven approach to predict default risk of loan for online Peer-to-Peer (P2P) lending,” School of Information, Zhejiang University of Finance and Economics, 310018 Hangzhou, China.