What are scoring models?

Every company wants to understand their clients’ financial behaviors and situations, especially before agreeing to something like issuing that client a loan or lending an expensive piece of equipment.

And many companies are increasingly realizing that open banking is the way to accomplish that goal.

Open banking tools make it much easier and more efficient for a company to access all kinds of financial behavioral information – from daily spending habits to outstanding debts.

But it’s not enough for companies to just access the data; they need to be able to understand it as well. Raw financial data can be difficult and time-consuming to pull apart, categorize and analyze.

That’s where scoring models come in.

Scoring models analyze collections of data to return a single metric indicating how creditworthy a potential client is, thus saving companies enormous amounts of time, as well as providing them with a trustworthy and tested data analysis service.

So how, exactly, do scoring models provide such an accurate service?

Let’s dive into the details.

From transaction data to score

When an open banking service provider like Kontomatik is asked to analyze banking data for a company, the information they receive includes financial transactions, amounts, titles, balance and more.

Taking that data, the provider first applies a machine learning algorithm trained on hundreds of thousands of bank transactions. The algorithm goes through the potential client’s financial transactions one by one, identifies keywords and phrases, and determines how to label each transaction.

Providers that use scoring models have many labels; Kontomatik, for instance, has over 70 labels that range from items like “grocery” to “rent” to “welfare” and more. A transaction is not limited to one label; it may carry several, depending on the type of transaction. Read a more detailed overview of transaction labeling here.
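
To make the multi-label idea concrete, here is a minimal sketch in Python. It is a simple keyword lookup standing in for the trained model described above, not Kontomatik’s actual labeler; the keyword table, the `label_transaction` function and the sample titles are all assumptions made for illustration.

```python
# Hypothetical keyword table; a production labeler would be a trained model,
# not a hand-written lookup.
KEYWORDS = {
    "grocery": ["supermarket", "grocery"],
    "rent": ["rent", "landlord"],
    "welfare": ["benefit", "welfare"],
}


def label_transaction(title: str) -> set[str]:
    """Return every label whose keywords appear in the transaction title."""
    lowered = title.lower()
    return {
        label
        for label, words in KEYWORDS.items()
        if any(word in lowered for word in words)
    }


# A single transaction can pick up more than one label.
print(label_transaction("Rent payment to landlord, March"))
print(label_transaction("Housing benefit for rent"))
```

The second title matches both the “rent” and the “welfare” keywords, so it comes back with two labels.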

Labels make the transactions easier to understand and analyze during the next step: scoring.

Scoring models take those transaction labels and analyze them in order to return a score – similar to a grade in school – that indicates how creditworthy that particular customer is. A good score (A or B) indicates that a client is very trustworthy when it comes to repaying their liabilities, and that they have a strong history of paying debts on time as well as making sound financial decisions.

A poor score (F) indicates that a client is riskier to lend to based on their financial history. This could mean they have a record of making poor financial decisions, have frivolous spending activity, or simply neglect to pay liabilities on time.

Though the decision of whether to work with the client lies entirely with the company, the scoring model helps a company decide what services – if any – to offer that customer.

How does the scoring process work?

Kontomatik’s scoring model is based on machine learning algorithms and uses both behavioral and aggregate features to determine a client’s creditworthiness.

First, the scoring model goes through the client’s labeled transactions to assess behavioral features such as changes in spending behavior; the frequency of risky spending (e.g. gambling) and thus the likelihood of future risky spending; the amount of undocumented and informal income; an increase in frivolous spending; and more.

The point of the behavioral assessment is to understand how a client may spend their money in the future based on their current and previous transactions and their history of risky financial decisions.

In addition to the behavioral features, a scoring model will assess aggregate features, which describe broader trends in financial behavior rather than the specific transactions that indicate behavior.

For instance, aggregate features might include the client’s average account balance, the number of loans they apply for in a given time, the presence of welfare payments and more.
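
As a rough illustration, the three aggregate features named above could be computed from labeled transactions like this. The data layout, the label names (such as `loan_application`) and the `aggregate_features` function are hypothetical; the real feature set is not described here.

```python
from statistics import mean

# Hypothetical labeled transactions: (labels, balance after the transaction).
transactions = [
    ({"salary"}, 2400.0),
    ({"rent"}, 1500.0),
    ({"loan_application"}, 1500.0),
    ({"welfare"}, 1800.0),
    ({"grocery"}, 1700.0),
]


def aggregate_features(txns):
    """Summarize a labeled transaction history into account-level features."""
    return {
        "average_balance": mean(balance for _, balance in txns),
        "loan_applications": sum("loan_application" in labels for labels, _ in txns),
        "has_welfare": any("welfare" in labels for labels, _ in txns),
    }


features = aggregate_features(transactions)
print(features)
```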

Factors get weighed against each other

These factors – both behavioral and aggregate – get weighed against each other in the scoring process, with some factors having a positive impact on the client’s score (for instance, if they have a good history of repaying their liabilities) and some decreasing the client’s score (for instance, if they regularly maintain a low bank balance).

Once all of these factors are weighed against each other, the process returns a single metric.
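
One common way to weigh factors against each other and arrive at a single number is a weighted sum passed through the logistic function. The sketch below shows that idea only; the feature names, weights and bias are invented for illustration, and nothing here should be read as Kontomatik’s actual model.

```python
import math

# Hypothetical weights: positive values raise the score, negative values lower it.
WEIGHTS = {
    "on_time_repayment_rate": 2.0,  # share of liabilities repaid on time (0..1)
    "low_balance_rate": -1.5,       # share of days with a low balance (0..1)
    "gambling_rate": -3.0,          # share of spending labeled as gambling (0..1)
}
BIAS = 0.0


def repayment_probability(features: dict[str, float]) -> float:
    """Combine weighted features into one probability via the logistic function."""
    z = BIAS + sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


reliable = {"on_time_repayment_rate": 0.95, "low_balance_rate": 0.1, "gambling_rate": 0.0}
risky = {"on_time_repayment_rate": 0.40, "low_balance_rate": 0.6, "gambling_rate": 0.2}

print(round(repayment_probability(reliable), 2))
print(round(repayment_probability(risky), 2))
```

With these made-up weights, a strong repayment history pushes the result up, while frequent low balances and gambling pull it down, so the first client scores higher than the second.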

The scoring process is constantly undergoing checks and improvements to ensure it is up-to-date and as accurate as possible.

In fact, Kontomatik gathers feedback from lenders about new observations about clients on a regular basis in order to enrich the model with more data, thus improving its accuracy.

What gets returned?

As discussed above, the main information that gets returned from the scoring process is the client’s probability of repayment (in an easy-to-understand percentage). It also returns a population percentile. The population percentile is a good way to compare potential clients.

For example, if a potential client receives a percentile score of 0.7, that means their creditworthiness is higher than or equal to that of 70% of the population in our training dataset.
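
The percentile can be sketched as the share of the training population whose score is at or below the client’s. The `population_percentile` function and the ten sample scores are hypothetical; a real training dataset would be far larger.

```python
from bisect import bisect_right


def population_percentile(score: float, training_scores: list[float]) -> float:
    """Fraction of the training population whose score is at or below `score`."""
    ordered = sorted(training_scores)
    return bisect_right(ordered, score) / len(ordered)


# Hypothetical repayment probabilities for a tiny training population.
training_scores = [0.35, 0.42, 0.50, 0.58, 0.61, 0.66, 0.70, 0.81, 0.88, 0.93]

print(population_percentile(0.70, training_scores))
```

Seven of the ten sample scores are at or below 0.70, so this client lands at the 0.7 percentile.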

But there’s much more a company can learn from the scoring process beyond just that percentage.

“We are able to explain every aspect of the score that gets returned from our scoring model, including each model’s decision, which features are most important and why, and how they contributed to a lower or higher score for your potential client”

- Piotr Podlewski, Head of Data Science for Kontomatik

Along with the percentage, the model returns a scoring tier, which is a letter grade on a scale of A to F, indicating how reliable a client may be when it comes to meeting their liabilities. This letter is based on the analysis of those aggregate and behavioral features the model identified in a potential client’s banking history.
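
A letter tier can be thought of as a set of cutoffs on the underlying score. The sketch below assumes the tier is derived from the repayment probability alone and uses invented cutoff values; the actual boundaries and inputs are not stated here.

```python
# Hypothetical cutoffs: (minimum repayment probability, tier), best tier first.
TIER_CUTOFFS = [(0.90, "A"), (0.80, "B"), (0.70, "C"), (0.60, "D"), (0.50, "E")]


def scoring_tier(probability: float) -> str:
    """Map a repayment probability to a letter tier from A (best) to F (worst)."""
    for minimum, tier in TIER_CUTOFFS:
        if probability >= minimum:
            return tier
    return "F"  # anything below the lowest cutoff


for p in (0.95, 0.72, 0.30):
    print(p, scoring_tier(p))
```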

All in all, the scoring solution results in a clear answer about a client’s creditworthiness, as well as a detailed understanding of every factor that led to that answer.

Customizing the scoring model

One of the biggest benefits of scoring models is that they are easy to customize. This means a company can decide which features in a client’s financial history they want to take into account, which to focus on, which to weigh as more important factors than others, and which minimum requirements a client must meet.

➔ Defining the parameters

First, a company will need to define their parameters: what are they looking for in a client? Which bare minimum requirements does a client need to meet to be considered for, as an example, a loan?

These can be requirements such as no bailiff transactions, or a monthly income that’s higher than monthly spending.
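
Minimum requirements like these behave as hard pass/fail checks that run before any score is considered. Here is a minimal sketch of the two examples above; the `bailiff` label name and the function signature are assumptions for illustration.

```python
def meets_minimum_requirements(
    labels_seen: set[str], monthly_income: float, monthly_spending: float
) -> bool:
    """Hard checks a client must pass before the score is even considered."""
    no_bailiff = "bailiff" not in labels_seen  # hypothetical label name
    income_covers_spending = monthly_income > monthly_spending
    return no_bailiff and income_covers_spending


print(meets_minimum_requirements({"grocery", "rent"}, 3200.0, 2900.0))
print(meets_minimum_requirements({"grocery", "bailiff"}, 3200.0, 2900.0))
```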

➔ Training dataset

Companies are also encouraged to take part in the decision process to adjust a training dataset.

For instance, they can suggest which clients should be included in the dataset, which range of loan amounts should be used in the model, and more.

➔ Adjusting the threshold

Companies will also need to adjust the threshold of acceptance to their risk policy, i.e. will they grant a loan to someone who receives an E on the scoring tier? Or to someone with a percentile score of 0.4?
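
In code, a risk policy of this kind comes down to two adjustable limits. The sketch below is illustrative only: the `accept` function and its default limits (worst acceptable tier D, minimum percentile 0.4) are hypothetical, and each company would set its own.

```python
TIER_ORDER = "ABCDEF"  # best to worst


def accept(tier: str, percentile: float,
           worst_tier: str = "D", min_percentile: float = 0.4) -> bool:
    """Accept a client only if both the tier and the percentile clear the policy."""
    tier_ok = TIER_ORDER.index(tier) <= TIER_ORDER.index(worst_tier)
    return tier_ok and percentile >= min_percentile


print(accept("C", 0.55))                  # clears the default policy
print(accept("E", 0.55))                  # tier is below the default limit
print(accept("E", 0.55, worst_tier="E"))  # a more lenient policy accepts it
```

Loosening or tightening `worst_tier` and `min_percentile` is how the same scores can lead to different decisions at different lenders.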

How does it work at Kontomatik?

In Kontomatik’s scoring model customization process, we can actually work with companies that are unclear about what requirements to choose and where to place the minimum threshold, making the process easier.

Then, with those requirements in place, a scoring system is put together and tested against a target group of clients, with the model being adapted and improved if necessary.

Once that process is complete, companies have their own scoring system, tailored to their exact conditions and parameters, which they can use to assess clients going forward.

Ultimately, scoring models can condense complex financial behaviors into one simple answer to the question plaguing lending companies: how trustworthy is your client?
