Machine Learning–Based Predictive Model for Subscription Fraud Detection in the Telecommunication Sector of Adamawa State, Nigeria
Umar Mohammed Pakra & Asabe Sandra AHMADU
Abstract
This study presents the design and implementation of a machine learning–based fraud detection system for the telecommunications sector in Adamawa State, Nigeria. The existing system stored raw subscriber data, including call detail records (CDRs) and user profiles, but had no predictive capability for identifying fraudulent activity before losses occurred. To address this limitation, a new model was developed that incorporates data extraction, preprocessing, dataset creation, and evaluation of both subscriber behavior and subscriber profiles. Random Forest and AdaBoost classifiers, implemented in Python, were applied to detect anomalies and patterns indicative of fraud. A dataset of 1,000 subscriber records, drawn by stratified random sampling, was used for training and validation. The models were evaluated using accuracy, precision, recall, F1 score, and ROC curves; Random Forest achieved an AUC of 0.91 and AdaBoost an AUC of 0.89. Feature importance analysis showed that call duration, SMS frequency, payment plan, and international call charges were among the most important predictors of fraud. The system flagged high-risk subscribers, demonstrating its potential for real-time fraud prediction and alert generation. The findings support machine learning as a viable proactive fraud detection tool that can offer telecom operators improved accuracy, timely intervention, and enhanced customer trust. Recommendations include expanding the set of behavioral variables, incorporating further algorithms such as XGBoost and LightGBM, and deploying the system in live telecom environments so that it can adapt continuously to evolving fraud strategies.
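The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the study's code: the labels, scores, and 0.5 decision threshold below are made-up values chosen only for illustration, and the function names are hypothetical. The sketch computes accuracy, precision, recall, and F1 from confusion-matrix counts, and computes AUC as the probability that a randomly chosen fraudulent subscriber receives a higher score than a randomly chosen legitimate one.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating label 1 as fraud and 0 as legitimate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for the fraud (positive) class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of fraud and legitimate scores; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative values only (not data from the study).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]                      # 1 = fraud, 0 = legitimate
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4, 0.3, 0.05]     # model's fraud probabilities
threshold = 0.5                                         # assumed decision threshold
y_pred = [1 if s >= threshold else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, round(auc, 4))
```

Unlike the other four metrics, AUC does not depend on the threshold, which is why it is a convenient single figure for comparing two classifiers such as Random Forest and AdaBoost. In practice a library such as scikit-learn provides equivalent functions (for example `accuracy_score` and `roc_auc_score`).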