Optimizing Data Evaluation Metrics for Fraud Detection Using Machine Learning

The use of technology has benefited society in more ways than one ever thought possible. Unfortunately, as society’s knowledge of technology has advanced, so has its knowledge of ways to use technology to manipulate others. This has led to a simultaneous advancement in the world of fraud. Machine learning techniques can offer a possible solution to help decrease these advancements. This research explores how the use of various machine learning techniques can aid in detecting fraudulent activity across two different types of fraudulent datasets, and the accuracy, precision, recall, and F1 were recorded for each method. Each machine learning model was also tested across five different training and testing splits in order to discover which split and technique would lead to the most optimal results.

A Comprehensive Survey on Machine Learning Techniques and User Authentication Approaches for Credit Card Fraud Detection

With the increase of credit card usage, the volume of credit card misuse also has significantly increased, which may cause appreciable financial losses for both credit card holders and financial organizations issuing credit cards. As a result, financial organizations are working hard on developing and deploying credit card fraud detection methods, in order to adapt to ever-evolving, increasingly sophisticated defrauding strategies and identifying illicit transactions as quickly as possible to protect themselves and their customers. Compounding on the complex nature of such adverse strategies, credit card fraudulent activities are rare events compared to the number of legitimate transactions. Hence, the challenge to develop fraud detection that are accurate and efficient is substantially intensified and, as a consequence, credit card fraud detection has lately become a very active area of research. In this work, we provide a survey of current techniques most relevant to the problem of credit card fraud detection. We carry out our survey in two main parts. In the first part, we focus on studies utilizing classical machine learning models, which mostly employ traditional transnational features to make fraud predictions. These models typically rely on some static physical characteristics, such as what the user knows (knowledge-based method), or what he/she has access to (object-based method). In the second part of our survey, we review more advanced techniques of user authentication, which use behavioral biometrics to identify an individual based on his/her unique behavior while he/she is interacting with his/her electronic devices. These approaches rely on how people behave (instead of what they do), which cannot be easily forged. By providing an overview of current approaches and the results reported in the literature, this survey aims to drive the future research agenda for the community in order to develop more accurate, reliable and scalable models of credit card fraud detection.

The Application of Fuzzy Set Theory to Mobile Internet Advertisement Fraud Detection

This paper presents the application of fuzzy set theory to implement of mobile advertisement anti-fraud systems. Mobile anti-fraud is a method aiming to identify mobile advertisement fraudsters. One of the main problems of mobile anti-fraud is the lack of evidence to prove a user to be a fraudster. In this paper, we implement an application by using fuzzy set theory to demonstrate how to detect cheaters. The advantage of our method is that the hardship in detecting fraudsters in small data samples has been avoided. We achieved this by giving each user a suspicious degree showing how likely the user is cheating and decide whether a group of users (like all users of a certain APP) together to be fraudsters according to the average suspicious degree. This makes the process more accurate as the data of a single user is too small to be predictable.

A Rule-based Approach for Anomaly Detection in Subscriber Usage Pattern

In this report we present a rule-based approach to detect anomalous telephone calls. The method described here uses subscriber usage CDR (call detail record) data sampled over two observation periods: study period and test period. The study period contains call records of customers- non-anomalous behaviour. Customers are first grouped according to their similar usage behaviour (like, average number of local calls per week, etc). For customers in each group, we develop a probabilistic model to describe their usage. Next, we use maximum likelihood estimation (MLE) to estimate the parameters of the calling behaviour. Then we determine thresholds by calculating acceptable change within a group. MLE is used on the data in the test period to estimate the parameters of the calling behaviour. These parameters are compared against thresholds. Any deviation beyond the threshold is used to raise an alarm. This method has the advantage of identifying local anomalies as compared to techniques which identify global anomalies. The method is tested for 90 days of study data and 10 days of test data of telecom customers. For medium to large deviations in the data in test window, the method is able to identify 90% of anomalous usage with less than 1% false alarm rate.