Using machine learning and stream computing to detect financial fraud
How IBM Research can help companies save billions annually
Fraud costs the financial industry approximately $80 billion annually [1]; U.S. credit and debit card issuers alone lost $2.4 billion [2]. For individual victims of fraud, the experience can be costly and even lead to identity theft, which can take years to resolve. But by using machine learning and stream computing to create virtual "data detectives," IBM researchers are working to reduce the risk.
Existing fraud detection systems operate on a set of rules, such as flagging ATM withdrawals over a certain amount or credit card purchases that take place outside a cardholder's home country. While this method helps stop a large number of fraudulent cases, a team of researchers in the Machine Learning Technologies group at IBM Research - Haifa is taking fraud prevention and detection to a new level with the IBM Detecting Fraud in Financial Transactions solution.
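To make the rule-based approach concrete, here is a minimal sketch of the two example rules mentioned above. The `Transaction` fields, the `ATM_LIMIT` value, and the flag names are assumptions chosen for illustration; they do not come from any actual bank or IBM system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Transaction:
    kind: str          # hypothetical values: "atm_withdrawal" or "card_purchase"
    amount: float      # transaction amount
    country: str       # country where the transaction took place
    home_country: str  # cardholder's home country


ATM_LIMIT = 1000.0  # hypothetical threshold; real limits are set by each institution


def rule_flags(tx: Transaction) -> List[str]:
    """Return the names of the fixed rules that this transaction triggers."""
    flags = []
    # Rule 1: ATM withdrawals over a certain amount.
    if tx.kind == "atm_withdrawal" and tx.amount > ATM_LIMIT:
        flags.append("atm_over_limit")
    # Rule 2: card purchases made outside the cardholder's home country.
    if tx.kind == "card_purchase" and tx.country != tx.home_country:
        flags.append("purchase_abroad")
    return flags


large_withdrawal = Transaction("atm_withdrawal", 1500.0, "US", "US")
purchase_abroad = Transaction("card_purchase", 40.0, "FR", "US")
ordinary_purchase = Transaction("card_purchase", 40.0, "US", "US")
```

Each rule looks at one transaction in isolation, which is why rules of this kind can miss patterns that only show up across many transactions.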
A new kind of fraud detection
Rather than singling out specific types of transactions, the solution analyzes historical transaction data to build a model that can detect fraudulent patterns. This model is then used to process and analyze large numbers of financial transactions in real time, as they happen, an approach known as stream computing.
Each transaction is given a fraud score, which represents the probability that the transaction is fraudulent. The model is first customized to the client's data and then updated periodically to cover new fraud patterns. The underlying analytics rely on statistical analysis and machine learning methods, some of the same techniques that IBM's Watson question-answering computer system used in training for its match on the US quiz show Jeopardy! These techniques make it possible to find unusual fraud patterns that human experts would miss.
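The train-then-score flow described above can be sketched in a few lines. This is only an illustration: the article does not say which model the IBM solution uses, so a small logistic regression stands in for it here, and the two features (a scaled amount and a foreign-purchase indicator), the toy `HISTORY` data, and all function names are assumptions.

```python
import math
from typing import Iterable, Iterator, List, Sequence, Tuple

Features = Sequence[float]
Model = Tuple[List[float], float]  # (weights, bias)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(model: Model, x: Features) -> float:
    """Fraud score: estimated probability that the transaction is fraudulent."""
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train(history: Sequence[Tuple[Features, int]],
          epochs: int = 2000, lr: float = 0.5) -> Model:
    """Fit a logistic regression to labeled history with batch gradient descent."""
    n = len(history[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, label in history:
            error = predict((weights, bias), x) - label
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - lr * g / len(history) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(history)
    return weights, bias


def score_stream(model: Model,
                 stream: Iterable[Tuple[str, Features]]) -> Iterator[Tuple[str, float]]:
    """Score transactions one at a time as they arrive."""
    for tx_id, x in stream:
        yield tx_id, predict(model, x)


# Toy history: (scaled_amount, is_foreign) -> 1 if the transaction was fraudulent.
HISTORY = [
    ((0.10, 0.0), 0), ((0.20, 0.0), 0), ((0.30, 0.0), 0), ((0.20, 1.0), 0),
    ((0.90, 1.0), 1), ((0.80, 1.0), 1), ((0.70, 0.0), 1), ((0.95, 1.0), 1),
]

model = train(HISTORY)
incoming = iter([("tx-1", (0.1, 0.0)), ("tx-2", (0.9, 1.0))])
scores = dict(score_stream(model, incoming))
```

Because `score_stream` is a generator, each transaction is scored as soon as it arrives rather than after a batch has accumulated. Updating the model periodically would amount to calling `train` again on newer history and swapping in the result.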
Take, for example, a purchase made at an online clothing retailer. If many of the past transactions at the retailer were fraudulent, then future purchases there also have a higher probability of being fraudulent. The system is able to pick up on these historical points of data and synthesize them into possible scenarios for future fraud attempts.
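One simple way to turn the retailer example into a number a model can use is a per-merchant historical fraud rate. The sketch below is an assumption about how such a feature might be computed, not a description of the IBM system; the smoothing parameters `prior_rate` and `prior_weight` and the merchant names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def merchant_fraud_rates(history: Iterable[Tuple[str, bool]],
                         prior_rate: float = 0.01,
                         prior_weight: float = 20.0) -> Dict[str, float]:
    """Smoothed share of past transactions at each merchant that were fraudulent.

    The prior pulls merchants with little history toward `prior_rate`, so a
    single fraudulent purchase does not label a small merchant as high risk.
    """
    counts = defaultdict(lambda: [0, 0])  # merchant -> [fraud count, total count]
    for merchant, is_fraud in history:
        counts[merchant][0] += int(is_fraud)
        counts[merchant][1] += 1
    return {
        merchant: (fraud + prior_rate * prior_weight) / (total + prior_weight)
        for merchant, (fraud, total) in counts.items()
    }


# Hypothetical history: (merchant, was_fraudulent)
history = (
    [("clothing_shop", True)] * 30
    + [("clothing_shop", False)] * 10
    + [("book_shop", False)] * 40
)
rates = merchant_fraud_rates(history)
```

A new purchase at `clothing_shop` would then carry a much higher historical rate than one at `book_shop`, and that rate can be fed to the scoring model alongside the other features.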
In addition to preventing fraud, the system cuts down on false alarms by analyzing the connection between transactions that are suspected to be fraudulent and actual fraud. When applied to a leading global bank, it led to a 40% decrease in false alarms on transactions from the bank's e-payments system.
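For readers who want to see how a figure like this is calculated, the following sketch counts false alarms (transactions that were flagged but not confirmed as fraud) under two systems and computes the relative reduction. The transaction IDs and sets are invented so that the arithmetic works out to 40%; they are not the bank's data.

```python
from typing import Set


def false_alarms(flagged: Set[str], confirmed_fraud: Set[str]) -> int:
    """Number of flagged transactions that turned out not to be fraud."""
    return len(flagged - confirmed_fraud)


def relative_reduction(before: int, after: int) -> float:
    """Fractional decrease from `before` to `after`."""
    if before == 0:
        raise ValueError("no false alarms to reduce")
    return (before - after) / before


# Hypothetical example data.
confirmed = {"t3", "t7"}
old_flags = {"t1", "t2", "t3", "t4", "t5", "t6", "t7"}  # 5 false alarms
new_flags = {"t1", "t3", "t4", "t6", "t7"}              # 3 false alarms

before = false_alarms(old_flags, confirmed)
after = false_alarms(new_flags, confirmed)
reduction = relative_reduction(before, after)  # (5 - 3) / 5
```

In this toy case both systems still catch the two confirmed frauds, so the reduction in false alarms does not come at the cost of missed fraud.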
"The triple combination of prevention, catching more incidents of actual fraud, and reducing the number of false positives results in maximum savings with minimal hassle," notes Yaara Goldschmidt, manager, Machine Learning Technologies group. "In essence, we are able to apply complicated logic that is outside the realm of human analysis to huge quantities of streaming data."
These machine learning technologies are currently used in a number of client engagements and can detect and prevent fraud in a variety of financial transactions, including credit cards, ATMs, and e-payments. The system is built into the client's infrastructure. A machine learning model is created using existing data and retrained as conditions change, forming an integrated system that allows the client to combat fraud before it happens.
"By identifying legal transactions that have a high probability of being followed by a fraudulent transaction, a bank can take proactive measures: warn a card owner or require extra measures for approving a purchase," explains Dan Gutfreund, project technical lead.
While machine learning and stream computing technologies can't predict the future, they allow financial institutions to make smarter decisions and work to prevent fraud before it happens.
[2] Coalition Against Insurance Fraud (CAIF)