Center for Open Access in Science (COAS)
OPEN JOURNAL FOR INFORMATION TECHNOLOGY (OJIT)

ISSN (Online) 2620-0627 * ojit@centerprode.com

2023 - Volume 6 - Number 2


A Hybrid Model for Detecting Insurance Fraud Using K-Means and Support Vector Machine Algorithms

Brian Ndirangu Muthura * ORCID: 0009-0000-0559-4301
Kenyatta University, School of Engineering and Technology, Nairobi, KENYA

Abraham Matheka * ORCID: 0009-0000-0559-4301
Kenyatta University, School of Engineering and Technology, Nairobi, KENYA

Open Journal for Information Technology, 2023, 6(2), 143-156 * https://doi.org/10.32591/coas.ojit.0602.05143m
Received: 8 August 2023 ▪ Revised: 4 October 2023 ▪ Accepted: 15 November 2023

LICENCE: Creative Commons Attribution 4.0 International License.


ABSTRACT:
Private stakeholders and governments across the globe are striving to improve the quality of, and access to, healthcare services for citizens. This push, coupled with growing social awareness and rising living standards, has increased the number of medical policyholders in the insurance industry. Even so, the healthcare sector grapples with rising costs year after year, leading to revised premiums and higher costs for policyholders. One of the main contributing factors is fraudulent claims raised by service providers and policyholders, which expose insurance firms to unprecedented risks and losses. To mitigate these losses, the insurance industry has set up fraud detection systems, which come in two flavors: rule-based systems and expert claims analysis. Rule-based systems evaluate conditions such as missing details or the location of the claim vis-à-vis the location of the policyholder to assess the validity of a claim. Expert claims analysis, on the other hand, relies on human experts who use statistical analyses and artificial rules to detect fraudulent claims. Both methods fail to detect patterns or anomalies in claims, which is central to efficient fraud detection. Data mining and machine learning techniques are therefore being leveraged to detect fraud. This automation presents enormous opportunities for identifying hidden patterns for further analysis by insurance firms. This research analyzes a hybrid approach to detecting medical insurance fraud that uses both K-Means (unsupervised) and Support Vector Machine (supervised) machine learning algorithms.
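
The abstract does not say how the two algorithms are combined, so the following is only a minimal sketch of one plausible arrangement, assuming scikit-learn and NumPy are available: K-Means is fitted without labels, each claim's distances to the cluster centroids are appended as extra features, and an SVM is then trained on the augmented data. The feature names (claim amount, number of visits), the synthetic data, the choice of three clusters, and the helper names `augment`, `fit_hybrid`, and `predict_hybrid` are all illustrative assumptions, not details taken from the article.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def augment(X_scaled, kmeans):
    """Append each row's distance to every K-Means centroid as extra columns."""
    return np.hstack([X_scaled, kmeans.transform(X_scaled)])


def fit_hybrid(X, y, n_clusters=3, seed=0):
    """Unsupervised step (K-Means) followed by a supervised step (SVM)."""
    scaler = StandardScaler().fit(X)
    X_scaled = scaler.transform(X)
    # K-Means never sees the fraud labels.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X_scaled)
    # class_weight="balanced" compensates for fraud being the rare class.
    svm = SVC(kernel="rbf", class_weight="balanced")
    svm.fit(augment(X_scaled, kmeans), y)
    return scaler, kmeans, svm


def predict_hybrid(model, X):
    """Apply the same scaling and cluster features, then classify."""
    scaler, kmeans, svm = model
    return svm.predict(augment(scaler.transform(X), kmeans))


# Synthetic stand-in for claims data; columns are hypothetical:
# [claim amount, number of visits]. Label 1 marks a fraudulent claim.
rng = np.random.default_rng(0)
legit = rng.normal(loc=[100.0, 2.0], scale=[20.0, 1.0], size=(200, 2))
fraud = rng.normal(loc=[400.0, 9.0], scale=[30.0, 1.0], size=(20, 2))
X = np.vstack([legit, fraud])
y = np.array([0] * 200 + [1] * 20)

model = fit_hybrid(X, y)
pred = predict_hybrid(model, X)
train_accuracy = float((pred == y).mean())
```

Here the clustering step serves only as a feature generator. Other hybrid designs are possible, for example training a separate SVM per cluster or using cluster membership to pre-filter claims, and the article itself should be consulted for the arrangement the authors evaluated. Accuracy on the training data, as computed above, is not a meaningful performance estimate; a held-out test set would be needed for that.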

KEY WORDS: fraud detection, machine learning, K-Means, support vector machines, hybrid algorithms.

CORRESPONDING AUTHOR:
Brian Ndirangu Muthura, Kenyatta University, School of Engineering and Technology, Nairobi, KENYA.


© Center for Open Access in Science