Optimizing Data Evaluation Metrics for Fraud Detection Using Machine Learning

Technology has benefited society in more ways than was once thought possible. Unfortunately, as society’s knowledge of technology has advanced, so has its knowledge of ways to use technology to manipulate others, and fraud has advanced alongside it. Machine learning techniques offer a possible way to help counter this trend. This research explores how various machine learning techniques can aid in detecting fraudulent activity across two different types of fraud datasets. Accuracy, precision, recall, and F1 score were recorded for each method. Each machine learning model was also tested across five different training and testing splits to discover which combination of split and technique would lead to the best results.
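
The evaluation procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's implementation: the data are synthetic, the single-feature threshold rule stands in for the machine learning models, and the five training fractions (0.5 to 0.9) are hypothetical, since the actual splits are not listed here. The function names (`evaluate`, `make_toy_data`, `fit_threshold`, `run_splits`) are likewise invented for the example.

```python
import random


def evaluate(y_true, y_pred):
    """Return accuracy, precision, recall and F1, treating label 1 as fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def make_toy_data(n=1000, fraud_rate=0.1, seed=0):
    """Synthetic (feature, label) rows; ASSUMPTION: not the study's datasets."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = 1 if rng.random() < fraud_rate else 0
        feature = rng.gauss(3.0 if label else 0.0, 1.0)
        rows.append((feature, label))
    return rows


def fit_threshold(train_rows):
    """Toy stand-in model: midpoint between the two class means."""
    fraud = [x for x, y in train_rows if y == 1]
    legit = [x for x, y in train_rows if y == 0]
    if not fraud or not legit:
        raise ValueError("training split must contain both classes")
    return (sum(fraud) / len(fraud) + sum(legit) / len(legit)) / 2


def run_splits(rows, train_fractions, seed=0):
    """Fit and score the toy model once per training fraction."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    results = {}
    for frac in train_fractions:
        cut = int(len(shuffled) * frac)
        train, test = shuffled[:cut], shuffled[cut:]
        threshold = fit_threshold(train)
        y_true = [y for _, y in test]
        y_pred = [1 if x > threshold else 0 for x, _ in test]
        results[frac] = evaluate(y_true, y_pred)
    return results


if __name__ == "__main__":
    # ASSUMPTION: five hypothetical training fractions, chosen for illustration.
    fractions = [0.5, 0.6, 0.7, 0.8, 0.9]
    for frac, scores in run_splits(make_toy_data(), fractions).items():
        print(frac, {name: round(value, 3) for name, value in scores.items()})
```

Recording all four metrics matters because they can disagree: a model that never flags fraud on a test set with one fraudulent case in four still scores 0.75 accuracy while its precision, recall, and F1 are all zero.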




