#Hashtags in finance. What are they and how does transaction labeling work?

When people discuss sharing financial information under open banking, much of the focus is generally on user experience and access: how financial service providers get access to banking information, how that information is kept secure, who has access with the client’s consent, and more.

But there are other key questions that often don’t get enough attention – questions like what happens after an account information service provider gains access to the client’s banking information (as always, with their consent)?

How does a mountain of data that consists of months upon months of financial transactions turn into a clear, exhaustive analysis of a client’s financial habits?

Simply put, how do experts make sense of the banking information they receive?

It all starts with transaction labeling.

Why do we need transaction labeling?

When a client agrees to share their banking data with a third party – for instance, a loan provider – the process is relatively straightforward on their end.

The client will use two methods of identity verification before agreeing to allow their bank to share their financial history with the third party, which will use it to assess things like their financial health and creditworthiness.

It all happens in a matter of clicks.

But there’s a lot that goes on behind the scenes. After getting consent from the client, a third party will often enlist the help of an Account Information Service Provider, or AISP (like Kontomatik), to gather the account data from the client’s bank via an API.

The information that the AISP receives from the client’s financial institution is a wealth of data containing everything from monthly deposits to financial transactions like rent payment, grocery shopping and more. (It’s worth noting that this information is anonymized when it’s received by the AISP.)

While having all that data is valuable, it’s hard to conclude anything about financial habits or creditworthiness initially. That’s because when the data is received, it’s a raw series of transactions, meaning numbers and payments that are uncategorized and therefore difficult to analyze.

In short, the information needs to be sorted before it can be understood.

So experts have developed machine learning tools that use algorithms to apply labels to each individual transaction. The labels, which include categories like “rent payment”, “grocery bill” and “salary”, can help turn the raw data into something that can be read, understood and, most importantly, analyzed easily.

How does labeling work?

Before we get into the details about the machine learning process, let’s examine what the actual labeling system looks like.

Kontomatik has over 70 labels that include repayment, loan, salary, grocery, welfare, and many more.

Each piece of data or bank transaction can often get assigned multiple labels or categories, kind of like hashtags on Twitter. For instance, a payment to a lender could be categorized as both a repayment and a loan.
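To make the hashtag comparison concrete, here is a minimal sketch of a transaction that carries more than one label. The `Transaction` class, its field names and the sample data are all assumptions made for illustration; they are not Kontomatik’s actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    """One bank transaction; every field name here is illustrative."""
    description: str
    amount: float  # negative means money leaving the account
    labels: set[str] = field(default_factory=set)


def with_label(transactions, label):
    """Return the transactions that carry the given label."""
    return [t for t in transactions if label in t.labels]


# A payment to a lender can carry two labels at once, like two hashtags.
payment = Transaction("LOAN INSTALMENT ACME FINANCE", -250.0)
payment.labels.update({"repayment", "loan"})

salary = Transaction("ACME CORP PAYROLL", 3200.0, {"salary"})
history = [payment, salary]

loan_related = with_label(history, "loan")
print([t.description for t in loan_related])
```

Storing labels as a set means the same transaction shows up whether an analyst filters by “loan” or by “repayment”, much like a post that appears under each of its hashtags.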

After getting access to the client’s account history, machine learning tools will go through the data and use a special algorithm to identify keywords in each transaction, which the system will then assign weights to.

The weight of the different keywords refers to the probability that that transaction falls under a certain label or category. The machine learning system compares the different weighted keywords against each other to determine which label the transaction should receive.

For instance, if the machine learning algorithm identifies several keywords in one transaction, it will assign weights to those keywords to show how strongly each one points to either a grocery bill or a restaurant bill. When the system then assesses the keywords as a whole, it may find that their combined weight leans more heavily toward a grocery bill than a restaurant bill, and so it will assign the transaction a “grocery” label.
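The grocery-versus-restaurant example can be sketched in a few lines of Python. This is a simplified illustration, not Kontomatik’s algorithm: the keywords, the weights and the function names are made up, and the weights are simply added together, whereas a real system would learn them from labeled data and might combine them differently.

```python
from collections import defaultdict

# Hypothetical weights: how strongly each keyword points to each label.
KEYWORD_WEIGHTS = {
    "market": {"grocery": 0.6, "restaurant": 0.1},
    "fresh": {"grocery": 0.3, "restaurant": 0.2},
    "kitchen": {"grocery": 0.1, "restaurant": 0.5},
}


def score_labels(description):
    """Sum the weights of every known keyword found in the description."""
    scores = defaultdict(float)
    for word in description.lower().split():
        for label, weight in KEYWORD_WEIGHTS.get(word, {}).items():
            scores[label] += weight
    return dict(scores)


def best_label(description):
    """Pick the label with the highest combined weight, or None if no keyword matched."""
    scores = score_labels(description)
    return max(scores, key=scores.get) if scores else None


# "kitchen" leans toward restaurant, but "fresh" and "market" together
# outweigh it, so the transaction as a whole is labeled as grocery.
print(best_label("FRESH MARKET KITCHEN 0042"))
```

Looking at each keyword in isolation would give mixed signals here; it is the comparison of the combined weights that settles the label.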

The keywords serve another purpose

They also act as a kind of path or explanation: if analysts are ever curious about why the machine learning tools assigned a transaction a specific label, the keywords and their weights show how the system got there.
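As a rough sketch of what such an explanation could look like, the snippet below lists the keywords that contributed to a given label, strongest first. The weights and the `explain` function are hypothetical and exist only to illustrate the idea.

```python
# Hypothetical keyword weights, defined here so the snippet runs on its own.
KEYWORD_WEIGHTS = {
    "market": {"grocery": 0.6, "restaurant": 0.1},
    "kitchen": {"grocery": 0.1, "restaurant": 0.5},
}


def explain(description, label):
    """List the keywords that contributed to a label, strongest first."""
    contributions = []
    for word in description.lower().split():
        weight = KEYWORD_WEIGHTS.get(word, {}).get(label, 0.0)
        if weight > 0:
            contributions.append((word, weight))
    return sorted(contributions, key=lambda pair: pair[1], reverse=True)


# Show which keywords pushed this transaction toward "grocery".
for keyword, weight in explain("CORNER MARKET KITCHEN", "grocery"):
    print(f"{keyword}: {weight}")
```

An analyst reading this output can see at a glance which words drove the decision and which ones had no influence at all.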

This may sound time-consuming, but that’s the beauty of machine learning; with a set system, the keyword identification and labeling process happens very quickly.

It also reduces the risk of human bias or error in the identification process.

How are the labels used?

After that intensive labeling process, the data set from a user’s bank account is much clearer.

Now, the transactions are all broken down into specific categories like “grocery”, “internet bill”, “rent”, “restaurant” and more.

But how does that labeled data help analyze a person’s financial history and health?

This is where a scoring algorithm comes into play.

Kontomatik’s algorithm uses the labeled transactions and compares them to each other, determining how frequently the client spends money on things like groceries, or gambling, or shopping. In addition to the frequency of certain purchases, the algorithm also assesses changes in financial behavior like an increase in frivolous spending, or a decrease in regular debt repayments.
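The two signals described here, how often a label appears and how spending on it changes over time, can be illustrated with a small sketch. The sample data, the month format and the function names are assumptions; the actual scoring algorithm is not described in enough detail to reproduce.

```python
from collections import Counter

# Hypothetical labeled transactions: (month, label, amount spent).
labeled = [
    ("2024-01", "grocery", 80.0),
    ("2024-01", "gambling", 20.0),
    ("2024-01", "repayment", 250.0),
    ("2024-02", "grocery", 75.0),
    ("2024-02", "gambling", 90.0),
    ("2024-02", "gambling", 60.0),
]


def label_frequency(transactions, month):
    """Count how many transactions carry each label in a given month."""
    return Counter(label for m, label, _ in transactions if m == month)


def spend_change(transactions, label, earlier, later):
    """Difference in total spend on one label between two months."""
    def total(month):
        return sum(amt for m, lab, amt in transactions if m == month and lab == label)
    return total(later) - total(earlier)


print(label_frequency(labeled, "2024-02"))
# A positive change in gambling and a negative change in repayments
# are the kind of shifts a scoring step could pick up on.
print(spend_change(labeled, "gambling", "2024-01", "2024-02"))
print(spend_change(labeled, "repayment", "2024-01", "2024-02"))
```

None of this would be possible on the raw data: both functions depend entirely on every transaction already having a label.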

After a thorough analysis of the data, the algorithm returns a single probability score that indicates the client’s creditworthiness, as well as a letter grade (on a scale from A to F) that helps companies quickly understand whether a particular client is likely to repay their loans based on their spending habits.
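One simple way to turn a probability into a letter grade is to use fixed bands, as in the sketch below. It assumes the score is a probability of repayment between 0 and 1; the cut-off values, and the choice to include an “E” grade between D and F, are invented for this example and do not reflect Kontomatik’s real thresholds.

```python
# Hypothetical grade bands: (minimum probability, grade), highest first.
GRADE_BANDS = [
    (0.90, "A"),
    (0.75, "B"),
    (0.60, "C"),
    (0.45, "D"),
    (0.30, "E"),
]


def letter_grade(repayment_probability):
    """Map a repayment probability in [0, 1] to a letter from A to F."""
    if not 0.0 <= repayment_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for threshold, grade in GRADE_BANDS:
        if repayment_probability >= threshold:
            return grade
    return "F"  # anything below the lowest band


for probability in (0.95, 0.62, 0.10):
    print(probability, letter_grade(probability))
```

The probability carries the detail, while the letter gives a lender something they can act on without reading the underlying numbers.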

The result is a clear, easy-to-understand answer to the question that plagues many lenders and other service providers: should we work with this client based on their financial history?

And the answer all starts with labeling.

