Ensemble synthesized minority oversampling-based generative adversarial networks and random forest algorithm for credit card fraud detection.

Bibliographic Details
Main Authors: Ghaleb, Fuad A., Saeed, Faisal, Al-Sarem, Mohammed, Qasem, Sultan Noman, Al-Hadhrami, Tawfik
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2023
Online Access: http://eprints.utm.my/104903/1/FuadAGhaleb2023_EnsembleSynthesizedMinorityOversamplingBased.pdf
http://eprints.utm.my/104903/
http://dx.doi.org/10.1109/ACCESS.2023.3306621
Description
Summary: The recent rapid increase in credit card fraud has caused huge monetary losses for individuals and financial institutions. Most credit card fraud is conducted online by illegally obtaining payment credentials through data breaches, phishing, or scamming. Many solutions have been suggested to address the credit card fraud problem for online transactions. However, high class imbalance is the major challenge that existing solutions face in constructing an effective detection model. Most of the existing techniques used for class imbalance overestimate the distribution of the minority class, resulting in highly overlapped or noisy and unrepresentative features, which cause either overfitting or imprecise learning. In this study, a credit card fraud detection model (CCFDM) is proposed based on ensemble learning and a generative adversarial network (GAN) assisted by Ensemble Synthesized Minority Oversampling techniques (ESMOTE-GAN). Multiple subsets were extracted using under-sampling, and SMOTE was applied to generate less skewed sets to prevent the GAN from modeling the noise. These subsets were used to train diverse sets of GAN models to generate the synthesized subsets. A set of Random Forest classifiers was then trained based on the proposed ESMOTE-GAN technique. The probabilistic outputs of the trained classifiers were combined using a weighted voting scheme for decision-making. The results show that the proposed model achieved 1.9% and 3.2% improvements in overall performance and detection rate, respectively, with a 0% false alarm rate. Due to the massive number of transactions, even a tiny false positive rate can overwhelm the analysis team. Thus, the proposed model improves detection performance and reduces the cost of manual analysis.
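
The data flow described in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: it covers only the under-sampling, SMOTE-style interpolation, and weighted voting steps, and it omits the GAN and Random Forest training entirely. All function names, parameters (`n_subsets`, `ratio`, `k`, `threshold`), the toy data, and the voting weights are assumptions chosen for illustration.

```python
import random

def undersample_subsets(majority, minority, n_subsets, ratio, rng):
    """Draw n_subsets random majority samples, each paired with all minority rows.

    `ratio` (hypothetical) is the majority-to-minority size ratio kept per subset.
    """
    size = min(len(majority), int(ratio * len(minority)))
    return [(rng.sample(majority, size), list(minority)) for _ in range(n_subsets)]

def smote(minority, n_new, k, rng):
    """SMOTE-style oversampling: interpolate between a minority point and
    one of its k nearest minority neighbours (squared Euclidean distance)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic

def weighted_vote(probabilities, weights, threshold=0.5):
    """Combine per-classifier fraud probabilities with a weighted average."""
    score = sum(w * p for w, p in zip(weights, probabilities)) / sum(weights)
    return score, score >= threshold

# Toy data (made up): 200 legitimate and 10 fraudulent two-feature transactions.
rng = random.Random(0)
majority = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
minority = [[rng.gauss(3, 0.5), rng.gauss(3, 0.5)] for _ in range(10)]

# Step 1: several under-sampled subsets, each less skewed than the full data.
subsets = undersample_subsets(majority, minority, n_subsets=3, ratio=5, rng=rng)

# Step 2: SMOTE tops up the minority class in each subset until it is balanced.
balanced = [
    (maj, mino + smote(mino, len(maj) - len(mino), k=3, rng=rng))
    for maj, mino in subsets
]

# Step 3 (training omitted): assume three classifiers, one per subset, returned
# these fraud probabilities for a single transaction; the weights are hypothetical.
score, is_fraud = weighted_vote([0.9, 0.6, 0.2], weights=[0.5, 0.3, 0.2])
```

In the pipeline the summary describes, each less skewed subset would additionally be used to train a GAN that generates the synthesized data, and a Random Forest classifier would supply each probability passed to the voting step. How the voting weights are derived is not stated in the summary, so the values above are placeholders.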