Research on applying machine learning to the banking industry has been under way since the 20th century, but for a long time it produced no major breakthrough. Only when big data technology removed the computer hardware constraints imposed by traditional data storage and processing did machine learning begin to play a practical role in banking business, such as credit risk management.
Machine learning sounds complicated and futuristic, but its working principle is fairly simple. Simply put, machine learning combines a large number of decision models, similar to decision trees, to create a more accurate model. By training these decision models rapidly and iteratively, machine learning can find “hidden” optimal solutions, especially patterns in unstructured data that statistical models often miss.
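The idea of combining many simple decision models can be illustrated with a minimal sketch. It assumes bagging (majority voting over one-split "decision stumps" trained on resampled data) as the combination method, which is only one of several possibilities and is not named in the text above. The toy data, feature meanings, and function names are all hypothetical.

```python
import random
from collections import Counter

# Hypothetical toy data: (credit utilization, late payments last year) -> defaulted (1) or not (0).
ROWS = [(0.10, 0), (0.20, 1), (0.30, 0), (0.40, 1),
        (0.85, 3), (0.90, 2), (0.95, 4), (0.80, 5)]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]


def fit_stump(rows, labels):
    """Fit a one-split decision tree: pick the (feature, threshold, left label)
    combination that makes the fewest errors on the given rows."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            for left_label in (0, 1):
                errors = sum(
                    (left_label if row[feature] <= threshold else 1 - left_label) != y
                    for row, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, left_label)
    return best[1:]  # (feature, threshold, left_label)


def predict_stump(stump, row):
    feature, threshold, left_label = stump
    return left_label if row[feature] <= threshold else 1 - left_label


def fit_bagged_stumps(rows, labels, n_models=25, seed=0):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return models


def predict_ensemble(models, row):
    """Combine the individual models by majority vote."""
    votes = Counter(predict_stump(model, row) for model in models)
    return votes.most_common(1)[0][0]


models = fit_bagged_stumps(ROWS, LABELS)
print(predict_ensemble(models, (0.05, 0)))  # a clearly low-risk profile
print(predict_ensemble(models, (0.99, 6)))  # a clearly high-risk profile
```

Each stump on its own is a crude rule, and the vote across many of them is what gives the combined model its stability. Production systems would typically rely on deeper trees and an established library rather than this hand-rolled version.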
Long-tail data often appears in bank investment portfolios. The long tail consists of customers whose individual investments are small but who are numerous overall. Banks generally know very little about this group, and these customers are also quite passive in receiving banking services. Compared with traditional statistical methods, machine learning has a stronger ability to explain long-tail data: it can analyze the behavior of such customers well, guiding business personnel toward potential profit opportunities in a targeted manner.
Consider a practical application of machine learning in a bank's credit card product line, where the bank's goal is to find the optimal credit limit for each customer. Simply put, the bank wants to know which factors it can use to decide whether to increase or decrease a customer's credit limit. Existing statistical models already had considerable predictive power, but when machine learning methods were used to retrain on the same data set, with some unstructured data such as policies and regulations added during training, the model's predictive power improved significantly, rising by a factor of 1.6. This improvement allows the bank to capture significant gains from customers whom the existing model rates as lower risk, while reducing credit limits where warranted and avoiding additional losses caused by credit limit increases.
So, what is stopping the banking industry from adopting machine learning methods more widely? Generally speaking, there are three problems. First, growth in the number of variables means that current banking systems require more funding for R&D and maintenance. Second, many machine learning models are black boxes, which makes their predictions uninterpretable and seriously conflicts with the stability norms of the banking industry. Finally, testing the accuracy of machine learning models is relatively complex, so using these methods poses certain challenges in the validation process.
Although machine learning methods have these problems, there are practical ways to mitigate them. One approach is to start modeling with all available variables and then quickly filter each variable according to its contribution to the model, leaving a manageable set of variables without affecting the model's prediction accuracy. Another is to prune some "branches" of the machine learning model to obtain a core set of linear rules that uses fewer variables but retains more than 80% of the predictive power of the original model.
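The first mitigation, filtering variables by their contribution, can be sketched as a simple backward elimination: start with every variable and drop each one whose removal does not reduce accuracy. This is an illustrative sketch only. The nearest-centroid classifier stands in for whatever model a bank would actually use, accuracy is measured on the training data for brevity, and the data and variable names are hypothetical.

```python
# Hypothetical toy data: only "utilization" carries signal; the other two
# columns have identical averages in both classes.
NAMES = ["utilization", "branch_code", "tenure_band"]
ROWS = [(0.1, 1.0, 5.0), (0.2, 2.0, 3.0), (0.3, 3.0, 4.0),
        (0.8, 2.0, 4.0), (0.9, 3.0, 5.0), (0.7, 1.0, 3.0)]
LABELS = [0, 0, 0, 1, 1, 1]


def fit_centroids(rows, labels, features):
    """Stand-in model: the per-class mean of each selected variable."""
    centroids = {}
    for label in sorted(set(labels)):
        members = [row for row, y in zip(rows, labels) if y == label]
        centroids[label] = [sum(row[f] for row in members) / len(members)
                            for f in features]
    return centroids


def accuracy(rows, labels, features):
    """Share of rows assigned to the correct class by the nearest centroid."""
    centroids = fit_centroids(rows, labels, features)

    def predict(row):
        return min(
            centroids,
            key=lambda c: sum((row[f] - m) ** 2
                              for f, m in zip(features, centroids[c])),
        )

    return sum(predict(row) == y for row, y in zip(rows, labels)) / len(rows)


def filter_variables(rows, labels, names, tolerance=0.0):
    """Drop every variable whose removal costs at most `tolerance` accuracy."""
    kept = list(range(len(names)))
    baseline = accuracy(rows, labels, kept)
    for f in list(kept):
        if len(kept) == 1:
            break  # always keep at least one variable
        trial = [k for k in kept if k != f]
        if accuracy(rows, labels, trial) >= baseline - tolerance:
            kept = trial
    return [names[k] for k in kept]


print(filter_variables(ROWS, LABELS, NAMES))  # ['utilization']
```

In practice the contribution of each variable would be measured on held-out data, and the `tolerance` parameter would express how much predictive power the bank is willing to trade for a smaller, more controllable variable set.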
Can the banking industry use more complex machine learning models to obtain more value? The answer is yes, and this is also a trend in the future development of the industry. According to Mr. Liu, a data analyst at CICC Data Business Center, “by utilizing the large amount of ‘small’ data that is ignored by existing models in the banking industry, plus common unstructured data in internal and external regulatory systems, machine learning can understand the needs of potential customers more deeply and help banks tap more customer value.”