RESEARCH JOURNAL OF PURE SCIENCE AND TECHNOLOGY (RJPST)

E-ISSN 2579-0536
P-ISSN 2695-2696
VOL. 7 NO. 4 2024
DOI: 10.56201/rjpst.v7.no4.2024.pg1.12


Comparative Analysis of Random Forest and Logistic Regression Models for Detecting Fraud in Bank Transactions Based on Performance Metrics

Mohammed, Usman, Professor G. M. Wajiga, Auwal Nata’ala, Bilyaminu Muhammad Abdullahi


Abstract


This study develops and evaluates machine learning models for detecting fraudulent bank transactions. Features such as transaction type, amount, balance, and date are extracted from transaction data, and each transaction is labeled as genuine or fraudulent based on balance consistency and transaction limits. The dataset is split into training and testing sets, and two models, Random Forest and Logistic Regression, are trained on standardized features. The models are evaluated on accuracy, precision, recall, and F1-score. Results indicate that Random Forest achieves higher accuracy than Logistic Regression, owing to its ability to handle complex relationships within the data, while Logistic Regression offers valuable probabilistic insights. Data imbalance and feature extraction quality are addressed with the Synthetic Minority Over-sampling Technique (SMOTE) and advanced preprocessing methods. Prediction probabilities are visualized with Matplotlib to aid interpretation. Future work includes enhancing feature extraction, expanding the dataset, and exploring more advanced models to further improve performance. The study demonstrates the potential of combining multiple validation techniques and machine learning models with a user-friendly interface to create a robust solution for detecting fraudulent bank transactions, thereby enhancing financial security.
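The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the `evaluate` helper, the `Metrics` container, and the toy labels are hypothetical, invented only to show how the metrics are derived from confusion-matrix counts, under the assumption that the fraudulent class is encoded as 1.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary task.

    `positive` is the label treated as the fraudulent class (assumed to be 1).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive (fraud) class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Metrics(accuracy, precision, recall, f1)


# Toy labels (1 = fraudulent, 0 = genuine), invented purely for illustration.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
m = evaluate(y_true, y_pred)
print(m)
```

On this imbalanced toy sample, accuracy is 0.8 while precision, recall, and F1 for the fraud class are each 0.5, which suggests why accuracy alone can overstate performance when fraudulent transactions are rare.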


Keywords:

Fraud detection, Machine learning models, Random Forest, Logistic Regression, Bank transactions.



