
AI in Banking: How AI Is Transforming the Banking Industry

We take a grounded look at the applications of AI in the banking industry: credit risk modeling, fraud detection, customer churn, and customer service bots. Discussing the challenges in these areas reveals why AI's progress in the industry may be slower than expected.
Jun 2024  · 16 min read

“1 billion euros?” I had to make sure I heard that correctly.

I was blown away when I heard the amount the credit risk models would automatically sanction in a year—the models that I would be working on! I was even more shocked when I realized the AI systems managing this money were all powered by simple regression models. This taught me a valuable lesson—the optimal solution is not always the most complex.

Working for a large bank in Ireland, I saw many ways in which AI is applied. Besides sanctioning loans, they include fraud detection, churn modeling, and customer service bots. We’ll talk about all of these applications and the challenges they face.

These challenges, including regulation, model interpretability, ethical considerations, and resistance to change, mean the industry often favors simple solutions. Still, we’ll see there are applications where more advanced AI methods are needed. We end by discussing how some of these may be used in the future.

AI Applications in Banking

From calculating capital reserves to evaluating marketing strategies, statistical analysis informs nearly every decision at a bank. Many people, including data scientists, analysts, managers, business leads, and regulators, will often need to look at the analysis or reports based on it.

Predictions from machine learning and statistical models can be an important part of this analysis. However, using them in this way alone would not be considered AI.

AI is when a model is used to automate a decision. Its predictions are fed into a system. The system contains rules based on the prediction and other variables that lead to a decision. We call these “AI systems.”

The team I worked on was primarily involved in developing these to automatically sanction new loans. The most important model driving these systems is called a credit risk model.

Credit Risk Assessment

When you request a loan from a bank, you’ll undergo a credit risk assessment. This can involve calculating the expected loss in the case of a default or the probability that you'll default—that is, the probability you won't be able to repay the loan due to financial hardship. It’s this latter prediction that usually drives the decision to sanction (give you) a loan.

How we model credit risk

Banks have been doing credit risk assessments since before the invention of computers, primarily using a “credit scorecard.”

As seen below, a scorecard considers various features of a potential borrower. You get a score based on which group you fall into for each feature, and these scores are summed to give a final score. The higher your score, the less likely you are to default.

Example of a credit scorecard, showing the group scores for two features: annual income and debt-to-income ratio (source: author)
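The mechanics of a scorecard are simple enough to sketch in a few lines. In this illustrative example, the bands and point values are made up, not taken from any real scorecard:

```python
# Minimal scorecard sketch: each feature maps a value to a band score,
# and the final score is the sum across features. All bands and point
# values here are illustrative.

SCORECARD = {
    "annual_income": [        # (upper bound, points)
        (20_000, 10),
        (50_000, 25),
        (float("inf"), 40),
    ],
    "debt_to_income": [
        (0.2, 35),
        (0.4, 20),
        (float("inf"), 5),
    ],
}

def feature_points(feature, value):
    """Return the points for the band the value falls into."""
    for upper_bound, points in SCORECARD[feature]:
        if value < upper_bound:
            return points
    raise ValueError(f"no band found for {feature}={value}")

def total_score(applicant):
    """Sum the band points across all scorecard features."""
    return sum(feature_points(f, applicant[f]) for f in SCORECARD)

applicant = {"annual_income": 45_000, "debt_to_income": 0.15}
print(total_score(applicant))  # 25 + 35 = 60
```

The higher the total, the lower the assessed risk, exactly as with a paper scorecard.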

In the past, these were primarily created through “expert judgment.” This is another way of saying some person in a suit decided on the values. And, yes, this was as biased as it sounds. Now, banks have access to vast amounts of data and computational power to create statistically optimal scorecards. As we discuss later, this can also produce fairer results.

In terms of data, there is a lot to work with. Internally, a bank has access to all of your transaction history and past debt behavior. Externally, most countries will have public records on past debts and defaults across all credit institutions.

All of these sources can be used to make model features. Some banks may even go as far as using device data like your geolocation. Although, at that point, they may start running into some regulatory and ethical considerations.

The process of building these scorecards has changed a lot, but the final product looks the same. Model features are discretized into groups using the Weight of Evidence (WOE), chosen to maximize the difference in default rates between groups. We can then assign a score based on how much a group changes the probability of default. So, just as before, the final score can be used to assess how likely the borrower is to default on a loan.
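The WOE of a group compares the share of non-defaulters to the share of defaulters it contains. A minimal sketch, using toy data and assuming every group contains both defaulters and non-defaulters:

```python
import math
from collections import defaultdict

def weight_of_evidence(groups, defaults):
    """Compute the Weight of Evidence per feature group.

    groups:   list of group labels, one per borrower
    defaults: list of 0/1 default flags, aligned with groups
    WOE = ln( % of all non-defaulters in group / % of all defaulters in group )
    Assumes every group contains at least one of each outcome.
    """
    good = defaultdict(int)
    bad = defaultdict(int)
    for g, d in zip(groups, defaults):
        if d:
            bad[g] += 1
        else:
            good[g] += 1
    total_good = sum(good.values())
    total_bad = sum(bad.values())
    return {
        g: math.log((good[g] / total_good) / (bad[g] / total_bad))
        for g in set(groups)
    }

# Toy data: the "low" income band defaults more often than "high"
groups = ["low", "low", "low", "high", "high", "high"]
defaults = [1, 1, 0, 0, 0, 1]
woe = weight_of_evidence(groups, defaults)
```

A positive WOE marks a lower-risk group and a negative WOE a higher-risk one, which is what makes the measure a natural basis for scorecard points.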

How scorecards are used

These credit scorecards form the basis of an AI system used to automatically sanction loans. In its simplest form, it’s a strict cutoff. If the score is below this cutoff, then the loan is denied.

Additional rules can also be based on the bank’s strategy, regulation, or other features like the customer's age. As we discuss in the next section, more complex systems can use predictions from multiple models.
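The decision layer on top of the scorecard can be sketched as a handful of rules. The cutoff, the age rule, and the borderline "refer" band below are illustrative policy choices, not real bank rules:

```python
# Sketch of the decision layer of an automated sanctioning system.
# All thresholds are illustrative.

CUTOFF = 60          # below this score: decline
REFER_BAND = 10      # scores within this of the cutoff go to a human

def sanction_decision(score, age):
    if age < 18:
        return "decline"            # example rule independent of the score
    if score < CUTOFF:
        return "decline"
    if score < CUTOFF + REFER_BAND:
        return "refer_to_lender"    # borderline cases get human review
    return "approve"
```

In practice the rule set can be much larger, combining the bank's strategy, regulatory constraints, and predictions from multiple models.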

In practice, these AI systems are used to sanction personal or small business loans. Larger amounts like mortgages or corporate loans are not fully automated. These cases can be too unique or the amount of money too large to sanction without a human making the final decision. Often the decision will still be made with the help of a scorecard.

With all the advancements in data collection and automation, you may expect the underlying models to be complex. You’d be wrong.

Logistic regression is used to calculate probabilities of default and create scorecards. We discuss a few reasons for this in the challenges section. They have to do with the interpretability of the model, regulation, and fairness. However, the main reason is that logistic regression is accurate enough.
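A minimal sketch of a probability-of-default model with logistic regression, using scikit-learn on synthetic data. Real models are trained on WOE-transformed features and vastly more data; the feature names and coefficients here are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
income = rng.normal(50, 15, n)    # annual income in thousands (toy)
dti = rng.uniform(0, 0.8, n)      # debt-to-income ratio (toy)

# Toy ground truth: high debt-to-income and low income raise default risk
logit = -2 + 4 * dti - 0.02 * income
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([income, dti])
model = LogisticRegression().fit(X, y)

# Predicted probabilities of default for two hypothetical applicants
pd_low_risk = model.predict_proba([[80, 0.1]])[0, 1]
pd_high_risk = model.predict_proba([[25, 0.7]])[0, 1]
```

The fitted coefficients are directly readable as "this feature increases or decreases risk by this much", which is the interpretability the industry values.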

The default we’ve spoken about occurs when someone comes into financial trouble due to poor decisions or events outside of their control. The important point is that these people do not want to default. This means underlying relationships that drive credit risk move slowly. This gives us time to craft linear features that capture these relationships in an interpretable way. However, there are other types of defaults which are not as simple.


Fraud Detection and Prevention

In medicine, machine learning predicts complex things, such as whether a tumor is benign or malignant. Now, imagine if the tumor were conscious and could change its appearance to avoid detection. It would suddenly become a much harder problem to solve. This is what makes fraud detection so hard.

There are many types of fraud. In the context of automated loan sanctioning, fraud occurs when the borrower never intended to make repayments. Their goal is to get the loan and disappear with the money. The borrower could have overstated their monthly income, hidden the fact that they have other debt to repay, or even stolen someone’s identity to make the application. Whatever way they manage to do it, we would consider this default due to fraud.

Eight types of financial fraud (source: author)

How we predict fraud

The nature of credit risk default means we can aggregate transactions to get a general sense of the person's behavior. For fraud, individual transactions may be important. We also look at a more diverse set of data sources like device data and communication history.

Another consideration is that the shifting nature of fraud means we need to get models into production faster. All of this means linear models are often not up to the challenge.

This is why it’s common to use non-linear models like Random Forests and XGBoost. Neural networks can also help if we want to incorporate text data. With these models, less emphasis is put on feature engineering, saving valuable time and allowing us to model behaviors that aren’t completely understood.
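A sketch of a non-linear fraud model on synthetic transaction data. A random forest picks up the interaction in the toy ground truth (large amount and a new device) without hand-crafted features; the features and fraud pattern are invented for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 2000
amount = rng.exponential(100, n)        # transaction amount (toy)
new_device = rng.integers(0, 2, n)      # 1 = unseen device
night = rng.integers(0, 2, n)           # 1 = night-time transaction

# Toy ground truth: fraud is mostly large amounts from a new device,
# plus a little label noise
fraud = ((amount > 200) & (new_device == 1)).astype(int)
flip = rng.random(n) < 0.02
fraud = np.where(flip, 1 - fraud, fraud)

X = np.column_stack([amount, new_device, night])
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, fraud)

p_fraud = model.predict_proba([[500, 1, 0]])[0, 1]   # large amount, new device
p_normal = model.predict_proba([[50, 0, 0]])[0, 1]   # ordinary transaction
```

The forest learns the amount-device interaction on its own; a linear model would need that interaction engineered as an explicit feature.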

The downside is that these are all still predictive models. We want to catch new cases of fraud before they happen. With predictive models, we need a training dataset of labeled fraud cases. In other words, we first need to wait for fraud to happen!

This is why we can also assess potential fraud cases using unsupervised methods like clustering algorithms or single-class models like Isolation Forest. These can help identify customers with unique behaviors or, in more general terms, outliers.
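A minimal Isolation Forest sketch on synthetic customer summaries. No fraud labels are needed: the model flags points that are easy to "isolate" from the bulk of customers. The features and values are illustrative:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(2)
# Most customers: moderate monthly transaction counts and amounts (toy)
normal = rng.normal(loc=[30, 100], scale=[10, 30], size=(500, 2))
# One unusual customer: very few, very large transactions
outlier = np.array([[2, 5000]])
X = np.vstack([normal, outlier])

model = IsolationForest(random_state=0).fit(X)
labels = model.predict(X)   # -1 = outlier, 1 = inlier
```

Flagged customers are not necessarily fraudsters; the point is to surface unusual behavior for further review before any fraud label exists.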

Adversarial machine learning

With fraud, we also have to worry about attackers targeting the models themselves. This is why fraud models overlap with an important branch of artificial intelligence: adversarial machine learning.

This field aims to find hidden weaknesses in models that can be exploited by attackers. Someone could trick the model into making incorrect predictions or giving away sensitive information. Poisoning attacks involve injecting fake data into the training process to corrupt models without us knowing.

In terms of loan automation, the first kind of attack, tricking the model into an incorrect prediction, is the most prominent. Essentially, a fraudster will try to make themselves look like a good customer. They will aim to find ways to decrease their perceived credit risk, for example, by artificially inflating their income or lying on their application. Fraud models are used to cover these potential weaknesses in credit risk models. Yet, they themselves must also be assessed for potential adversarial attacks.
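The idea can be made concrete with a toy evasion attack on a linear credit model: the attacker inflates reported income until the decision flips. The coefficients and threshold are invented for illustration:

```python
# Sketch of an evasion attack on a linear credit model.
# All coefficients and thresholds are illustrative.

INCOME_COEF = 0.5      # points per €1,000 of reported income
DTI_COEF = -80.0       # penalty per unit of debt-to-income
CUTOFF = 20.0

def score(income_k, dti):
    return INCOME_COEF * income_k + DTI_COEF * dti

def approved(income_k, dti):
    return score(income_k, dti) >= CUTOFF

true_income, dti = 30, 0.4
assert not approved(true_income, dti)   # an honest application is declined

# The attacker overstates income until the model approves the loan
claimed = true_income
while not approved(claimed, dti):
    claimed += 1
print(claimed)  # 104
```

A fraud model running alongside could catch this by checking the claimed income against the customer's actual transaction history, which is one way the two models complement each other.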

Ultimately, the many types and nature of fraud mean we often have to rely on more complex models. In some cases, these models can flag potential cases of fraud, which can then be more thoroughly reviewed by a human.

Regarding AI, fraud models will often run alongside credit risk models in systems used to sanction loans automatically. They complement each other as they aim to predict default for different reasons.

Both fraud detection and credit risk assessment are about acquiring new customers. The right customers. Once they have them, banks face another problem. How do they keep them? The next two sections will explore some of the ways AI can help.

Customer Churn

Banks use similar AI-powered strategies to predict if a customer will leave. This could mean closing their bank account or changing mortgage providers. They could also reduce their business without leaving by removing an overdraft, cancelling a credit card, or transferring a large amount of funds to a different bank.

Banks want to step in before these events happen. They will try to persuade dissatisfied customers to stay by offering lower interest rates, discounts, or other special benefits. Much of this can be a fully automated process.

Banks use models to predict if a person or business will likely miss future payments (i.e., go into arrears) or default completely. Again, the bank will want to step in before this happens. They may offer financial advice or the opportunity to restructure their debt. Even a simple automated message reminding them of future payments can be effective.
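An automated intervention layer like this can be sketched as a few rules combining a churn model's probability with event flags. The action names and thresholds are illustrative policy choices:

```python
# Sketch of an automated retention trigger combining a churn model's
# output with simple event rules. All thresholds are illustrative.

def retention_action(churn_prob, cancelled_card, large_transfer_out):
    if cancelled_card or large_transfer_out:
        return "offer_benefits"      # reduced business: act immediately
    if churn_prob > 0.7:
        return "offer_lower_rate"    # high risk of leaving entirely
    if churn_prob > 0.4:
        return "send_reminder"       # cheap, low-touch intervention
    return "no_action"
```

The same pattern applies to arrears: the model output changes, but the system is still "prediction in, graded intervention out."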

Customer Service and Chatbots

Another area where automation can help is when customers actively seek support from the bank. They may need advice on setting up a savings account or making online payments. Many of these queries will be repetitive, and an AI-powered chatbot can handle them. It can provide answers faster and leave the more complex tasks to humans.

This is also an area where the most advanced AI can be applied. Many companies are updating their chatbots to use LLMs and generative AI. These powerful models can supercharge a bank's chatbot, allowing it to handle more complex queries and even provide personal advice. However, we should be wary of the latter, given the highly regulated environment in which banks operate.

Challenges of AI in Banking

The real challenges faced by AI in banking are often not technical. Problems around data management and model development have largely been solved. The issues facing the adoption of advanced AI methods come from the environment in which banks operate, established procedures, and a low tolerance to risk.

Regulation, explainability, and transparency

Since the 2008 financial crisis, well, let's say banks are not given as much leeway to make their own decisions. They face a much stricter regulatory environment, which determines the amount they need to keep in capital reserves, the level of risk they can take on, and even the types of technology they can implement. The latter means that only certain types of models can be applied to some tasks. For example, capital reserves may have to be calculated with a linear model and predefined features.

There is often a need to clarify and justify the process for making lending decisions. This means explaining how models make predictions to a regulator. Although this does not necessarily restrict the type of models we can use, it does provide some resistance. As mentioned, this is one of the reasons why linear models are used for credit risk modeling. Any improvement in performance from using non-linear models is outweighed by the increased burden of explaining them.

Consider Cynthia Rudin's insights on this topic:

Rather than trying to create models that are inherently interpretable, there has been a recent explosion of work on “Explainable ML,” where a second (posthoc) model is created to explain the first black box model. This is problematic. Explanations are often not reliable and can be misleading, as we discuss below. If we instead use models that are inherently interpretable, they provide their own explanations, which are faithful to what the model actually computes.

Cynthia RudinDuke University, Durham, NC, USA

There are other applications where it does make sense to use more advanced methods. Customer churn models and customer service bots can face less regulatory scrutiny.

For fraud prediction, the increased performance provides enough motivation to do the work involved in explaining black-box models. In this case, you'll have to use explainable AI methods like SHAP to explain your models. You'll likely also have to explain the explainable AI method itself.
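SHAP assigns each feature its Shapley value: its average marginal contribution to a prediction over all feature orderings, with missing features filled in from a baseline. For a handful of features, this can be computed exactly by brute force, which is what libraries like shap approximate efficiently. The toy model below is invented for illustration:

```python
import math
from itertools import combinations

def model(x):
    # Illustrative non-linear "fraud score" with an interaction
    # between amount (x[0]) and a new-device flag (x[1])
    return x[0] * x[1] + 0.1 * x[0]

def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating all feature coalitions."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value, others the baseline
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for subset in combinations(others, size):
                weight = (math.factorial(size) * math.factorial(n - size - 1)
                          / math.factorial(n))
                phi[i] += weight * (value(set(subset) | {i}) - value(set(subset)))
    return phi

x = [10.0, 1.0]          # large amount, new device
baseline = [1.0, 0.0]    # an "average" transaction
phi = shapley_values(model, x, baseline)
# Efficiency property: the contributions sum to model(x) - model(baseline)
```

The brute force is exponential in the number of features, which is exactly why practical tools need approximations, and why those approximations can themselves require explanation.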

If you want to learn more about SHAP and explainability, check out this introduction to SHAP values and this article on explainable AI.

Data privacy and security

Banks have access to intimate details of your life. Imagine if someone got hold of every transaction you ever made. Every store you visited.

Security needs to be tight to avoid data breaches. It’s also one of the main reasons banks would avoid new technology like LLMs. Often using these models requires sending data to a third party via their API. The security risks of this simply outweigh the potential benefits.

At the same time, banks need to be conscious of how they use their data. The regulation we’ve discussed until now has focused on credit risk. Banks can face other regulations, like GDPR, which restrict what data they collect and how they can use it.

Additionally, the EU AI Act will restrict how AI can be used. This can impact the other applications we discussed. For example, banks may not be allowed to use transaction data to market new products to you.

Ethical considerations

The decisions made by models in banking can have serious consequences. You could be denied a mortgage or a business may not get a loan it desperately needs to survive.

It’s crucial that these decisions are made fairly and do not discriminate against certain groups of people. Depending on the bank, this may be required by legislation or driven by its desire to protect its reputation.

Automation has the potential to reduce unfairness. By centralizing lending decisions and making them more transparent, we can mitigate the impact of individual human biases. It’s therefore even more important that the models powering those decisions are not unfair. One way to ensure this is an algorithmic fairness analysis, which evaluates model performance across protected variables like gender, race, and country of origin.
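One simple fairness check of this kind compares approval rates across a protected group (a demographic parity check). This sketch uses toy data and illustrative group labels; a real analysis would also compare error rates and other metrics:

```python
from collections import defaultdict

def approval_rates(decisions, groups):
    """Approval rate per protected group (demographic parity check)."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for d, g in zip(decisions, groups):
        total[g] += 1
        approved[g] += (d == "approve")
    return {g: approved[g] / total[g] for g in total}

decisions = ["approve", "decline", "approve", "approve", "decline", "decline"]
groups = ["A", "A", "A", "B", "B", "B"]
rates = approval_rates(decisions, groups)
# A: 2/3 approved, B: 1/3 approved -> a gap worth investigating
```

A gap like this is not proof of unfairness on its own, but it tells the analysts and governance layers exactly where to look before a model is deployed.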

Another approach to fairness commonly adopted by banks is allowing the customer to challenge an automated decision. They should also be allowed to request feedback on why a decision was made. This means that along with auto accept and auto decline, systems will have a third option—referred to lender. A human would assess why an automated decision was made and correct unfair decisions.

This is another reason simple linear models may be preferred across applications. As mentioned, these are intrinsically interpretable. Being easier to understand means they can be critically analyzed by a wider group of people. Any issues that can lead to potential bias can be identified before a model is deployed.

Resistance to change

Regulation, security concerns, and ethical considerations all present roadblocks to adopting advanced AI methods. Yet, the biggest obstacle can be the bank itself.

Banks, particularly large established banks, are very risk-averse. They have established ways of building models and scorecards. They even use the same historical format!

Any deviation from these methods would require a significant justification, and you'll need to convince many layers of governance. In short, big banks move slowly.

This is why start-ups and fintech companies are better positioned to adopt these methods. They don’t have procedures in place and are usually less risk-averse. However, they lack the large amounts of capital of big banks and the workforce needed to comply with the regulations we’ve spoken about. Over time, smaller players may be able to chip away at these advantages through the increased value that advanced AI brings.

The Future of AI in Banking

When we look to the future of AI in banking, we must recognize that advanced methods will have a significant role to play.

Generative AI is of particular importance. Although it may be risky to apply these models directly in customer-facing applications, they can be useful as tools for building those products and models. They are not used to predict credit risk but to help build credit risk models. This is particularly true when we consider the routine nature of the models we expect to build in banking.

Many banks will take this a step further. Banks, like all organizations, have a lot of internal documentation. This can include all corporate governance, historical analysis, and modeling procedures. RAG systems, powered by LLMs, can turn this treasure trove of information into a searchable database and internal chatbot. You can think of it as a helpful colleague who knows everything about the bank and is eager to help you get up to speed.
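The retrieval step at the heart of such a system can be sketched without any LLM at all. Here, simple word-overlap scoring stands in for the vector search a real system would use, and the document titles are invented:

```python
# Sketch of the retrieval step of a RAG system over internal documents.
# A real system would use embeddings for search and an LLM to draft the
# answer; word-overlap scoring stands in for the vector search here.

def tokenize(text):
    return set(text.lower().split())

def retrieve(query, documents, k=1):
    """Return the k documents sharing the most words with the query."""
    q = tokenize(query)
    scored = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return scored[:k]

docs = [
    "Procedure for building credit risk scorecards with WOE binning",
    "Corporate governance policy for model sign-off",
    "Historical analysis of mortgage arrears by region",
]
top = retrieve("procedure for building a credit scorecard", docs)
# The retrieved passage would then be passed to an LLM as context
```

Because the LLM only summarizes retrieved internal documents, this pattern keeps the model grounded in the bank's own procedures rather than its training data.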

Some companies may even stomach the risk involved with using LLMs in customer-facing products. These could automate things like financial advice to customers based on their banking data. However, hallucinations coupled with the seriousness of incorrect advice will likely prove too risky for established banks. At least with the current level of LLMs.


As a data scientist working in the banking industry, it’s vital to learn continuously but also to think critically about how new technology can be applied. We discussed many potential applications, including credit risk, fraud, and customer churn.

It’s important to be aware of new technology, as advanced AI methods can be applied to some of these areas. However, you'll often find that a strong understanding of the fundamentals of data science and machine learning is more important.

As the industry matures, we'll also need to take a more holistic view of AI. Responsible AI skills are critical in a world where machine learning is facing increased scrutiny and legislation. We need to understand how to interpret models, ensure they are fair and that sensitive data is protected. Many of these are new challenges banks face, and we'll need to adapt if we want to adopt AI.

If you want to learn more about responsible AI, check out this course on AI ethics.

Conor O'Sullivan

Conor is doing a PhD in machine learning for coastal monitoring. He also has experience working as a data scientist in the banking and semiconductor manufacturing industries.

