An Improved Generative Adversarial Network with Feature Filtering for Imbalanced Data

Abstract: The generative adversarial network (GAN) is a promising method for addressing the data imbalance problem. However, most existing GANs, which are usually inspired by computer vision techniques, do not carefully account for the significance and redundancy of features, and may therefore produce rough samples that overlap or are incorrect. To address this problem, a novel GAN called improved GAN with feature filtering (IGAN-FF) is proposed, which establishes a new loss function for model training by replacing the traditional Euclidean distance with the Mahalanobis distance and adding an $\ell_{1,2}$-norm regularization term. The merits of the proposed IGAN-FF are as follows: 1) the Mahalanobis distance evaluates different attributes fairly without neglecting trivial or small-scale but significant ones, and it mitigates the disturbance caused by correlation between features; 2) embedding the $\ell_{1,2}$-norm regularization term into the loss function contributes to feature filtering by guaranteeing sparsity and helps reduce the risk of overfitting. Finally, experiments on 16 well-known imbalanced datasets demonstrate that the proposed IGAN-FF performs better on most evaluation metrics than 11 other state-of-the-art methods.


Introduction
With the rapid growth of data types and data volumes in the era of information explosion, data processing has emerged as a vital research topic in the field of data analytics. Classification is an important task in machine learning and has led to significant advances in areas such as medical diagnosis [1,2], fault detection [3–5], and financial fraud detection [6–8]. It is worth mentioning that the success of these approaches depends heavily on balanced class distributions. This is not always the case in classification problems, as the numbers of samples can differ significantly between classes because some classes occur with low frequency. As a result, the larger the imbalance ratio, the greater the difficulty of classification. A significant amount of work has been dedicated to the challenging issue of imbalanced data classification, and a body of promising results has been reported in the literature [9–13]. These results can be roughly divided into two categories: model-oriented methods and data-oriented methods. The former usually involve developing efficient models directly by scrutinizing the intrinsic characteristics of the data, such as imbalance ratios, without changing the amount of data. Nevertheless, the classification performance of these approaches turns out to be unsatisfactory when classes overlap or the imbalance is extreme. As far as the data-oriented approach is concerned, the primary concept is to regulate the sample size by means of sampling so as to achieve a balance between the different classes.
Among the data-oriented methods, oversampling and undersampling are two typical tools. However, given that undersampling leads to a loss of information and in turn affects the classification performance, oversampling has become an increasingly prominent approach to classification problems with imbalanced datasets. This method is considered effective as it not only restores data balance but also preserves the inherent characteristics of the original data. Nonetheless, most oversampling methods, such as the synthetic minority oversampling technique (SMOTE) [14] and its variants, generate new samples by interpolating between an existing sample and its K-nearest neighbors, which often leads to ambiguous sample features that are inconsistent with the original distribution. Recently, many researchers have studied how to account for the data distribution during oversampling. In [15], K-means SMOTE considers the data distribution by introducing the idea of clustering into the oversampling strategy of assigning weights. Similarly, MWMOTE [16] assigns appropriate weights to each minority cluster and does not generate instances beyond the range of the clusters, which ensures the safety of the synthetic instances. Dai et al. [17] synthesized new samples based on the weighted distribution of two factors, namely the inter-class distance and the cluster capacity. Although these methods take the data distribution into account in some sense, it is still difficult to ensure that the synthesized samples strictly conform to the original distribution for datasets with complex distributions, which directly gives rise to one of the motivations of this paper.
Recently, methods based on generative adversarial networks (GANs) [18] have been applied to sample expansion and data augmentation. Owing to their powerful feature extraction and characterization capabilities, GAN-based methods bring the generated samples closer to the real data distribution. Douzas et al. [19] proposed the use of conditional generative adversarial networks (CGANs) as an oversampling method for binary class-imbalanced data. Moreover, Gao et al. [20] investigated the Wasserstein generative adversarial network (WGAN) with gradient penalty to generate data, which increased the accuracy of the classifiers on the experimental dataset. Although the above methods are effective in obtaining samples that match the distribution of the original data, most synthesized samples tend to be considered desirable only within a strictly specific space, which omits the effects of trivial but essential features. The main reason for this problem is that GANs are trained based on the Euclidean distance, which makes it difficult to capture effects such as correlation and inconsistency in magnitude between features. Additionally, GAN-based sampling algorithms are usually very complex and require many parameters to be tuned, which not only leads to the characterization of redundant feature information and thus affects the quality of the synthesized samples, but also leads to overfitting during model training. Such an issue motivates us to develop a new oversampling method to bridge this gap.
Stimulated by the above discussions, our attention is devoted to investigating the problem of diversity and parameter optimization for GAN synthetic samples. This problem is nontrivial because of the following two obstacles: 1) how to assess different attributes fairly and mitigate the interference caused by correlations between features? and 2) how to simultaneously avoid the impact of redundant features on synthetic samples and prevent model overfitting? To cope with these obstacles, we make the following contributions in this paper.
• Utilizing the Mahalanobis distance enables impartial evaluation of different attributes without ignoring any trivial/minor but crucial characteristics and alleviates the interference caused by correlation between features.
• Embedding the $\ell_{1,2}$-norm regularization term into the loss function ensures the sparsity of the data, which greatly facilitates feature filtering and reduces the risk of overfitting.
• Our proposed methods achieve satisfactory outcomes in handling the imbalanced classification, as demonstrated by a comparison with various oversampling approaches and experimental assessments on several public datasets.
The rest of this paper is organized as follows. Section 2 reviews the related work. Section 3 presents the details of the main results. The experimental results and analysis are presented in Section 4. Finally, the conclusions are drawn in Section 5.

Related Work
In this section, some generative adversarial network-based learning methods for imbalanced data are reviewed. In addition, the regularization terms and distance measures commonly used in optimization functions are discussed, and the motivation for this research is derived from the current shortcomings of GANs.

Generative Adversarial Network
Imbalanced learning techniques [21,22] intend to tackle the problem of imbalanced data, in which at least one class contains far fewer samples than the other classes. In general, the minority class tends to have a large impact on many real-world problems, such as cancer detection in medical diagnosis and fault diagnosis in industrial systems.
Existing methods for imbalanced learning mainly encompass: 1) sampling-based methods, which handle imbalanced classification by oversampling [23,24] the minority class or undersampling [25] the majority class, with representative methods such as SMOTE [26] generating data from existing minority class samples; 2) cost-sensitive methods [27,28], which employ different cost matrices to compute the cost of any particular data sample; 3) kernel-based methods, which use classifiers such as support vector machines (SVMs) [29] to maximize the separation margin; and 4) GAN-based approaches [30,31], which, like our proposed method, use the generator to balance the class distribution by generating minority class samples. Nevertheless, most current GAN methods are designed for image problems, and little work has used GAN-based methods for classification on imbalanced data.

The generative adversarial network is an unsupervised deep learning model proposed by Goodfellow et al. [18]. A GAN is composed of a generator and a discriminator, where the generator produces fake samples from random noise, and the role of the discriminator is to determine whether the generated synthetic samples are real or fake. Once the discriminator judges the synthetic samples to be sufficiently similar to the real data, these samples are added to the minority class to constitute a balanced dataset $X'$. However, the Euclidean distance used in GAN makes it difficult to deal with problems such as inconsistency of magnitude, so that the synthesized samples are considered ideal only within a narrow spatial range, which in turn causes them to overlap easily and to neglect essential features. In this regard, how to improve the diversity of the synthesized samples is a research motivation of this paper.

Regularization in Optimization Functions
In optimization problems, a regularization term is usually added to the objective function to balance model complexity and performance. To date, the most popular norms used in the objective function of optimization problems are the $\ell_1$-norm [32] and the $\ell_2$-norm [33]. For the former, due to the inherent property of the $\ell_1$-norm, the solution to an $\ell_1$-norm optimization is sparse. Hence the $\ell_1$-norm is also called the sparse rule operator, and it can achieve feature sparsity and thus filter out some redundant features. Analogously, the $\ell_2$-norm is the most prevalent norm in optimizing objective functions. For example, overfitting is a recurrent issue when training on a dataset with a small sample size. Applying an $\ell_2$-norm to the objective function addresses overfitting and improves the generalization ability of the model.
As discussed above, models for real-world problems often require both the $\ell_1$-norm for filtering redundant features and the $\ell_2$-norm for preventing the model from overfitting to an excessively complicated training set. Therefore, a mixed norm that incorporates both the $\ell_1$-norm and the $\ell_2$-norm is a natural choice for this paper. In addition, there is no relevant research on employing generative adversarial network models embedded with a mixed norm [34] to deal with the imbalance problem, which is another research motivation for this paper.

Main Results
In this section, in order to tackle imbalanced classification problems, we propose a GAN-oriented imbalanced learning method, called IGAN-FF, which incorporates an $\ell_{1,2}$-norm regularizer and the Mahalanobis distance. The generator with the $\ell_{1,2}$-norm regularizer is effective not only in improving the robustness of the model and removing redundant features but also in preventing overfitting. In addition, the discriminator is trained to discriminate between real samples and fake (i.e., generated) samples, and also between minority samples and majority samples on the synthetic balanced dataset. It should be noted that, throughout this paper, the Mahalanobis distance replaces the traditional Euclidean distance.

Optimization Based on Mahalanobis Distance without Regularizer
GANs are extensively used as they can generate high-quality data that matches the distribution of the original data given sufficient training. As shown in (1), the essential idea of this model is to let the generator $G$ and the discriminator $D$ play a minimax game. There is a competition between $G$ and $D$. The generator $G$ tries to learn the distribution of the input data from the random variable $z$ and produces a new sample $G(z)$. The discriminator $D$ is responsible for distinguishing between $G(z)$ and real samples, and the training process of GAN is to run the generator and discriminator repeatedly until the discriminator cannot distinguish between $G(z)$ and real data. In contrast to common generative models, GANs can generate high-quality data and do not need a predefined data distribution for the input samples. Therefore, we choose to rebalance the imbalanced data by using GAN to generate more samples of minority classes. The objective function is

$$\min_{G} \max_{D} V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_{z}(z)}[\log(1 - D(G(z)))] \tag{1}$$

where $V(D, G)$ is the value function, $\mathbb{E}$ denotes the expectation over the indicated distribution, $x$ is real data obeying the distribution $p_{data}(x)$, and $z$ is a noise variable obeying the distribution $p_{z}(z)$.

It is difficult for GAN to deal with inconsistency in magnitude when the Euclidean distance is used to measure the relationship between samples and features, which limits the spatial range of the synthesized samples and ignores potentially valuable attributes. In this regard, the Mahalanobis distance is used in this paper to impartially assess the different attributes because it has the merit of handling samples with inconsistent dimensions. Using the Mahalanobis distance for the similarity measurement can effectively remove the interference of correlation between sample features, greatly eliminate the influence of dimension on algorithm processing, and also help detect outliers. The Mahalanobis distance [35] of two samples (denoted as $D(x, y)$) can be calculated by

$$D(x, y) = \sqrt{(x - y)^{T} \operatorname{cov}(x, y)^{-1} (x - y)} \tag{2}$$

where $x$ and $y$ are two different samples, and $\operatorname{cov}(x, y)$ and $\operatorname{cov}(x, y)^{-1}$ are the covariance matrix of these two samples and its inverse, respectively. From this, we can see that the covariance matrix must be invertible. If there is a serious correlation in the variable space, the covariance matrix is singular, and (2) cannot be evaluated. Therefore, when the number of minority class samples is less than the number of features, we need to reduce the dimensionality of the samples.

Principal component analysis (PCA) [36,37] is a multivariate statistical method used to reduce the dimension of variables. It projects high-dimensional data into a low-dimensional principal component space. Therefore, the calculation of the Mahalanobis distance is transferred to the principal component space, and the data dimension is compressed to effectively avoid the problem of matrix singularity. On the premise that $\operatorname{cov}(x, y)^{-1}$ exists, as many principal components as possible are retained in order to preserve the information of the real data.
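To make the role of the covariance matrix concrete, the following is a minimal sketch of the distance in (2) in plain Python. It assumes that the covariance matrix is estimated from a set of samples (the function names `covariance_matrix`, `solve`, `mahalanobis` and `euclidean` are illustrative and not taken from the paper), and it solves a linear system instead of forming the inverse explicitly. The sample values in the demonstration are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of a list of samples (one sample per row)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(d)] for a in range(d)]


def solve(matrix, rhs):
    """Solve matrix @ v = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            # A singular covariance matrix has no inverse, so (2) is undefined.
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    v = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * v[j] for j in range(i + 1, n))
        v[i] = (aug[i][n] - s) / aug[i][i]
    return v


def mahalanobis(x, y, cov):
    """sqrt((x - y)^T cov^{-1} (x - y)), computed without an explicit inverse."""
    diff = [a - b for a, b in zip(x, y)]
    weighted = solve(cov, diff)
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


if __name__ == "__main__":
    # Hypothetical samples: feature 1 is large-scale, feature 2 is small-scale.
    samples = [[1000.0, 0.10], [2000.0, 0.35], [3000.0, 0.20], [4000.0, 0.45]]
    cov = covariance_matrix(samples)
    print("Euclidean:  ", euclidean(samples[0], samples[1]))
    print("Mahalanobis:", mahalanobis(samples[0], samples[1], cov))
```

In this sketch the Euclidean distance is dominated by the large-scale feature, whereas the Mahalanobis distance rescales both features by their covariance. With two perfectly correlated samples, such as `[[1.0, 2.0], [3.0, 6.0]]`, the estimated covariance is singular and `mahalanobis` raises `ValueError`, which is the situation the PCA step is meant to avoid.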
$\ell_{1,2}$-Norm Regularizer
The $\ell_1$-norm is widely used in optimization problems and is considered a very useful tool for solving sparsity problems [32]. For example, given a dataset with a large number of features for training, a typical issue is the effect of redundant features on the training results, i.e., the trained model may implicitly characterize noisy and irrelevant features in the data, which reduces the training effectiveness. One way to avoid such problems is to include a regularization term in the loss function. In this regard, by using the regularization term represented by the $\ell_1$-norm in (3), the final solution can be made sparse by making the value of the $\ell_1$-norm as small as possible:

$$\|\Theta\|_{1} = \sum_{i} \sum_{j} |\theta_{ij}| \tag{3}$$

where $\Theta$ denotes a matrix and $\theta_{ij}$ denotes the entry in the $i$th row and $j$th column of the matrix $\Theta$. Due to the inherent property of the $\ell_1$-norm, the solution to an $\ell_1$-norm optimization is sparse. Hence the $\ell_1$-norm is also called the sparse rule operator, and it can achieve feature sparsity and thus filter out some redundant features. For instance, when classifying a user's movie preferences, the user may have 100 features, only a dozen of which are useful for classification, and most of the features, such as height and weight, may be irrelevant and can be filtered out by using the $\ell_1$-norm.
Analogously, the $\ell_2$-norm is the most prevalent norm in optimizing objective functions [33]. For example, overfitting is a recurrent issue when training on a dataset with a small sample size. To address overfitting and improve the generalization ability of the model, an $\ell_2$-norm term, given in (4), can be added to the objective function. Considering that removing redundant features and preventing model overfitting are often needed simultaneously in real problems, a mixed norm that combines the $\ell_1$-norm and the $\ell_2$-norm comes naturally to mind. Until now, there have been only a few studies on norm regularization for GAN training [38], and none using the so-called mixed $\ell_{1,2}$-norm. In this paper, we tackle the problem of parameterizing a GAN with a suitable $\ell_{1,2}$-norm regularizer that achieves parameter sparsity and prevents model overfitting by mixing the $\ell_1$-norm and $\ell_2$-norm regularizers. Specifically, the $\ell_{1,2}$-norm is defined in (5), where $\Theta$ is the set of training weights of the generator or discriminator and $\theta_{i}$ denotes the $i$th entry of $\Theta$.
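The different effects of the two penalties can be illustrated with a small plain-Python sketch. It uses the standard entrywise $\ell_1$- and $\ell_2$-norms and their one-dimensional proximal (shrinkage) steps. The function `mixed_penalty` is only a hypothetical weighted combination chosen for illustration; it is an assumption and should not be read as the paper's definition (5). All function names and the weighting parameter `alpha` are likewise illustrative.

```python
import math


def l1_penalty(weights):
    """Entrywise l1-norm: sum of absolute values of all weights."""
    return sum(abs(w) for row in weights for w in row)


def l2_penalty(weights):
    """Entrywise l2-norm: square root of the sum of squared weights."""
    return math.sqrt(sum(w * w for row in weights for w in row))


def mixed_penalty(weights, alpha=0.5):
    """Hypothetical mix of both penalties (an assumption, not the paper's (5))."""
    return alpha * l1_penalty(weights) + (1.0 - alpha) * l2_penalty(weights)


def soft_threshold(w, t):
    """Proximal step for t * |w|: weights with |w| <= t become exactly zero."""
    return math.copysign(max(abs(w) - t, 0.0), w)


def l2_shrink(w, t):
    """Proximal step for (t / 2) * w**2: weights are rescaled but never zeroed."""
    return w / (1.0 + t)


if __name__ == "__main__":
    weights = [0.80, -0.05, 0.02, -0.60]   # hypothetical weights of one layer
    print("l1 step:", [soft_threshold(w, 0.1) for w in weights])
    print("l2 step:", [l2_shrink(w, 0.1) for w in weights])
```

The $\ell_1$ step sets the two small weights to zero, which is the feature-filtering effect described above, while the $\ell_2$ step only shrinks every weight proportionally, which limits weight growth without producing sparsity.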

Improved GAN Model with Regularizer
With the above discussion, we design a GAN that incorporates the Mahalanobis distance and the $\ell_{1,2}$-norm regularizer to solve the problems of inconsistent magnitude between samples, feature redundancy and model overfitting. To be specific, the generator $G$ is a multilayer perceptron (MLP) trained to generate realistic data from a random noise $z$. The loss function of $G$ is given in (6) and consists of four terms. The first and second terms ($L_{rf}$ and $L_{mi}$) are the confusing discriminator losses over the generated minority samples, in which $q_i \in \{(\text{real}, \text{minority}), (\text{real}, \text{majority}), (\text{fake}, \text{minority})\}$ denotes the sample labels and $\hat{y}_i$ represents the output (prediction probability) of the discriminator. The third term aims at making the generated minority samples close to the real minority samples; it involves the number of synthetic samples and the number of minority class samples, and $|\cdot|$ denotes the absolute value. The last term is the $\ell_{1,2}$-norm regularizer applied to the set of training weights of the generator, scaled by a regularization coefficient.
Similarly, the discriminator $D$ is an MLP trained to distinguish real data and generated data from the input $x$. The loss function of $D$ is given in (7) and consists of four terms. The first term is the cross-entropy loss for discriminating whether a sample is produced by the generator or is a real sample of the original dataset; it involves the number of majority class samples, and the remaining symbols are defined as in (6). The second term is also a cross-entropy loss, for discriminating whether a sample belongs to the minority class or the majority class. The third term aims at keeping samples of different classes far away from each other. The last term is a regularizer applied to the set of training weights of the discriminator, scaled by a regularization coefficient.
Finally, the adversarial training objective function of IGAN-FF is given in (8). The goal of the generator is to generate fake minority samples that simulate the real minority sample distribution and thereby confuse the discriminator. The goal of the discriminator is to correctly classify the real training samples and the fake samples produced by the generator, and also the minority samples and the majority samples. So far, we have discussed the proposed generative adversarial network algorithm embedded with the $\ell_{1,2}$-norm and the Mahalanobis distance for imbalanced data, i.e., IGAN-FF. The pseudo-code for rebalancing imbalanced data is given in Algorithm 1. The proposed method has three advantages: 1) a GAN incorporating the Mahalanobis distance can impartially evaluate the different attributes of a sample without ignoring any trivial but crucial attributes, and can mitigate the interference caused by correlation between attributes; 2) the generator with the $\ell_{1,2}$-norm regularizer is effective not only in improving the robustness of the model and removing redundant features but also in preventing overfitting; and 3) the discriminator can effectively distinguish not only between generated samples and real samples but also between majority samples and minority samples.
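The rebalancing step itself, i.e., adding generated minority samples until the classes are balanced, can be sketched as follows. This is only an outline under stated assumptions and not a transcription of Algorithm 1: the target of exactly equal class sizes is assumed, `rebalance` and `make_placeholder_generator` are hypothetical names, and the placeholder generator simply perturbs existing minority samples with Gaussian noise so that the sketch runs without a trained network. In practice, `generate` would draw a sample from the trained IGAN-FF generator.

```python
import random


def rebalance(majority, minority, generate):
    """Append synthetic minority samples until both classes have equal size.

    `generate` is any zero-argument callable returning one synthetic sample.
    Returns the augmented minority set and the list of synthetic samples.
    """
    needed = max(len(majority) - len(minority), 0)
    synthetic = [generate() for _ in range(needed)]
    return minority + synthetic, synthetic


def make_placeholder_generator(minority, noise=0.05, seed=0):
    """Stand-in for a trained generator (NOT a GAN): jitters real samples."""
    rng = random.Random(seed)

    def generate():
        base = rng.choice(minority)
        return [value + rng.gauss(0.0, noise) for value in base]

    return generate


if __name__ == "__main__":
    # Hypothetical toy data: 10 majority samples and 3 minority samples.
    majority = [[float(i), 0.0] for i in range(10)]
    minority = [[0.5, 1.0], [0.6, 1.1], [0.4, 0.9]]
    generator = make_placeholder_generator(minority)
    balanced_minority, synthetic = rebalance(majority, minority, generator)
    print(len(majority), len(balanced_minority), len(synthetic))
```

Keeping the generator behind a plain callable means that the same rebalancing routine can be reused with any of the compared oversampling methods.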

Datasets Analysis
The experimental datasets in this paper are 16 commonly used datasets taken from the KEEL and UCI [39] machine learning repositories, chosen to investigate the performance of the proposed method in various scenarios. Since the research task of this paper is to test the learning capability of the algorithm in binary classification, the labels of several original multi-class datasets have been modified. The details of these datasets are shown in Table 1. The column "Minority" indicates the class that is treated as the minority class in the experiment, and the last column "IR%" indicates the ratio of the number of samples in the minority class to the number of samples in the majority class.
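As an illustration of the label modification and of the "IR%" column, the following sketch binarizes a multi-class label list and computes the minority-to-majority ratio as a percentage. The function names and the toy labels are hypothetical; the one-vs-rest relabeling is an assumption about how the multi-class datasets were converted.

```python
from collections import Counter


def binarize(labels, minority_label):
    """One-vs-rest relabeling: 1 for the chosen minority class, 0 otherwise."""
    return [1 if y == minority_label else 0 for y in labels]


def imbalance_ratio_percent(binary_labels):
    """IR% = 100 * (number of minority samples) / (number of majority samples)."""
    counts = Counter(binary_labels)
    if counts[0] == 0:
        raise ValueError("no majority samples")
    return 100.0 * counts[1] / counts[0]


if __name__ == "__main__":
    # Hypothetical three-class labels; class "a" is treated as the minority.
    labels = ["a", "b", "c", "a", "b", "b", "c", "c", "c", "c"]
    print(imbalance_ratio_percent(binarize(labels, "a")))
```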

Evaluation Metrics
Accuracy is a widely used evaluation criterion in classification, which reflects the proportion of samples classified correctly. However, accuracy may not reflect the performance on the minority samples under imbalanced learning conditions. Hence, for imbalanced data, multiple indices are used to evaluate the effectiveness of the proposed method, including Precision, Recall, F-Score and G-Means [40,41]. To give a better explanation, a confusion matrix is defined in Table 2. In Table 2, the positive class represents the minority class and the negative class stands for the remainder. TP and FN are the numbers of correctly and incorrectly predicted samples in the positive class, respectively. TN and FP are the numbers of correctly and incorrectly predicted samples in the negative class, respectively.
Based on the above confusion matrix, the indices borrowed from [17,42] that are used to evaluate the proposed method are given in (9)–(13). In the experiments, we hope to obtain sufficiently small values of FP and FN; as FP and FN approach 0, the above five evaluation metrics approach 1. Besides, the area under the curve (AUC) of the receiver operating characteristic (ROC) is an effective metric for evaluating performance, especially for imbalanced data [43,44]. The closer the AUC is to 1, the better the classification performance [12,45].
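For reference, the five confusion-matrix metrics can be computed as in the sketch below. The accuracy formula follows the one stated in this paper; the remaining formulas are the commonly used definitions (F-Score as the harmonic mean of Precision and Recall, G-Means as the geometric mean of the true positive and true negative rates) and are assumed here, so they may differ in detail from (9)–(13). The function name and the counts in the demonstration are hypothetical.

```python
import math


def imbalance_metrics(tp, fn, fp, tn):
    """Metrics from confusion-matrix counts (minority class = positive)."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0        # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0   # true negative rate
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    g_means = math.sqrt(recall * specificity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "g_means": g_means,
    }


if __name__ == "__main__":
    # A classifier that predicts "majority" for all 100 samples (10 minority):
    # accuracy looks good, while Recall, F-Score and G-Means are all zero.
    print(imbalance_metrics(tp=0, fn=10, fp=0, tn=90))
```

The demonstration shows why accuracy alone is misleading for imbalanced data: ignoring the minority class entirely still yields an accuracy of 0.9.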

General Setting
All experiments are conducted on a PC server with a 2.5 GHz CPU and 16 GB memory, and all tested models are implemented in Python to check their stability for real usage. A 70%–30% train-test split and 5-fold cross-validation are employed to obtain unbiased results.
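One possible reading of this protocol is sketched below: the data are first split 70%–30% into training and test indices, and 5-fold cross-validation is then run on the training part. The stratification by class, the fixed random seeds, and the function names are assumptions made for the sketch; the text does not specify how the two steps are combined.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split sample indices into train/test, keeping class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def k_fold(indices, k=5, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield training, validation


if __name__ == "__main__":
    labels = [0] * 90 + [1] * 10          # hypothetical imbalanced labels
    train, test = stratified_split(labels)
    for training, validation in k_fold(train):
        print(len(training), len(validation))
```

Stratifying the split matters for imbalanced data because a purely random split can leave very few minority samples in the test set.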
In addition, the parameters of IGAN-FF are set as follows. In terms of network structure, the experiments use a three-layer fully connected neural network to construct the generator and the discriminator. The numbers of units in the hidden layers of the generator and discriminator are set to 32-256 and 16-128, respectively. The batch size is set to 32, the number of epochs is 5000, and the rectified linear unit (ReLU) is chosen as the activation function. In the output layer, the sigmoid function is used as the activation function.
To optimize the losses of the generator and discriminator, the Root Mean Square Propagation (RMSProp) optimizer is applied with a learning rate of 0.001. Figure 1 illustrates the training process of IGAN-FF, where the purple and blue lines indicate the changes in the loss values of the generator and the discriminator, respectively. From the figure, we can see that the losses of the generator and the discriminator fluctuate markedly before 10000 iterations, and that they tend to stabilize once the number of training iterations exceeds 20000. Under this condition, the trained IGAN-FF model can output synthetic data that are more similar to the real original samples.

Simulation Analysis and Discussion
According to the evaluation indicators described in (9)–(13), the results of the comparison experiments are shown in Tables 3–5, where the best result for each indicator is highlighted in bold. Table 3 compares IGAN-FF with 11 other approaches on 16 public imbalanced datasets [39] in terms of F-Score, with a DNN as the classifier. As can be seen from the table, IGAN-FF performs satisfactorily on the 16 datasets: it obtains the best performance on 11 datasets and the second-best results on 2 datasets. However, the classification results of IGAN-FF are not desirable for the datasets Libra, Segment and Judicial. A possible reason is that the minority samples in these three datasets are too scarce to exhibit a statistical distribution. In addition, these three datasets are highly imbalanced, with imbalance rates of 7.1%, 16.7%, and 7.04%, respectively. This may lead to a large hyperplane bias of the classifiers, and thus the applicability of IGAN-FF to such datasets could be improved. Along the same lines, Table 4 compares IGAN-FF with the 11 other approaches in terms of G-Means. Note from Table 4 that IGAN-FF achieves 10 best, 3 second-best and 3 third-best results on the 16 datasets, with an average ranking of 1.625, which is 1.063 better than that of WGAN, the method with the second-best average ranking. It can be inferred from the results in the table that the four deep learning-based algorithms outperform the four oversampling-based methods in terms of both mean value and average ranking, while the four ensemble learning-based algorithms are significantly inferior to the other two groups. The cause may be that although ensemble learning algorithms can handle imbalanced classification problems, their capability for dealing with extremely imbalanced classification problems is still limited.
In Table 5, we compare the AUC results of the different methods. Although IGAN-FF obtains only 9 optimal and 5 sub-optimal results on the 16 datasets, its average ranking is 1.813, which is better than that of the second-ranked WGAN by 1.625. It is worth mentioning that the average rankings of the four oversampling algorithms, four ensemble algorithms, and four deep learning algorithms are 6.750, 8.360, and 4.188, respectively. Similarly, their average AUC values are 0.904, 0.875, and 0.925, respectively. The proposed IGAN-FF outperforms the highest-ranked oversampling method, MWMOTE, by 3.43% and the highest-ranked ensemble algorithm, RF, by 6.85%. These comparison results demonstrate the effectiveness of the proposed algorithm.
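The average ranking used in this comparison can be computed as sketched below: on each dataset the methods are ranked by their score (rank 1 is the best), and the ranks are then averaged over the datasets, so a lower average ranking is better. The handling of ties (tied methods share the mean of their positions) is an assumption, the function names are illustrative, and the scores in the demonstration are made up and are not results from Tables 3–5.

```python
def rank_scores(scores):
    """Rank methods on one dataset; rank 1 = highest score, ties share a rank."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
            j += 1
        shared = (i + 1 + j + 1) / 2      # mean of the tied positions
        for name, _ in ordered[i:j + 1]:
            ranks[name] = shared
        i = j + 1
    return ranks


def average_ranks(results):
    """Average each method's rank over all datasets in `results`."""
    totals = {}
    for scores in results.values():
        for name, rank in rank_scores(scores).items():
            totals[name] = totals.get(name, 0.0) + rank
    return {name: total / len(results) for name, total in totals.items()}


if __name__ == "__main__":
    # Made-up AUC values for three hypothetical methods on two datasets.
    results = {
        "dataset_1": {"method_A": 0.90, "method_B": 0.80, "method_C": 0.70},
        "dataset_2": {"method_A": 0.60, "method_B": 0.90, "method_C": 0.60},
    }
    print(average_ranks(results))
```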
In what follows, Figure 2 illustrates the radar charts of the different approaches on the 16 datasets. From the figure, it is easy to see that the proposed algorithm IGAN-FF is the most robust with respect to the three metrics (F-Score, G-Means, AUC). It is also worth noting that when the number of minority samples is too limited, the minority class samples may not exhibit a data distribution or statistical significance. In that case the discriminator is not effective during the synthesis process, which makes it difficult to determine the sample distribution and to perform the synthesis correctly. Since the method proposed in this paper is based on generative adversarial networks, the number of minority class samples cannot be too small. Next, to summarize Tables 3–5, the average values and average rankings of each algorithm in terms of the different metrics are displayed in Figure 3. From the figure, it is easy to see that the average value of the proposed method is the best in terms of F-Score, G-Means and AUC, which verifies the robustness and effectiveness of the proposed method. Nevertheless, across the three evaluation metrics, unsatisfactory results can be found for the datasets Pageblock and Segment. The reason may be related to an inaccurate transformation of the datasets' unstructured data into structured data, which affects the subsequent classification results. Moreover, the datasets Ecoli4 and Libra also yield poor results, which may occur because the samples of the minority class are excessively sparse and a statistical distribution may not even be available. As a result, the discriminator is incapable of accurately determining whether the synthesized samples are close to the original data, which results in the synthesis of a large number of samples that do not match the distribution of the original data. In view of the above analysis, the method proposed in this paper is more suitable for datasets with a sufficient number of minority class samples; that is to say, the algorithm gives satisfactory results only when the minority class samples exhibit a statistical distribution.
Finally, to shed further light on whether IGAN-FF is significantly distinct from the other algorithms and to ensure that the improvement is statistically significant, the Mann-Whitney Wilcoxon test [50] is employed to validate the conjecture that the proposed method outperforms the other methods, and the $p$-values of all algorithms in terms of F-Score, G-Means and AUC are reported in Table 6. In Table 6, the $p$-values of F-Score, G-Means and AUC of the proposed method IGAN-FF against the other compared methods on the 16 datasets are listed in three rows. Considering the stochastic properties of the computations, a hypothesis test is conducted to show whether there is a significant difference between two algorithms. Without loss of generality, we set the null hypothesis that there is no significant difference between the two algorithms. At the chosen significance level, most of the $p$-values are substantially smaller than that level and are therefore statistically significant. By analyzing the results in Table 6, it can be summarized that the proposed algorithm IGAN-FF obtains 29 significant results out of the total of 33 comparisons of F-Score, G-Means and AUC. However, there are four comparisons for which the results are not significant, which may be caused by the fact that the three data-based algorithms and one GAN-based method involved take the data distribution into account to some extent, resulting in a higher quality of synthesized samples and a better classification performance. In summary, the proposed algorithm IGAN-FF is superior to the other comparative methods, and the improvement is significant.

Conclusion
In this paper, we proposed a GAN-oriented imbalanced learning method, called IGAN-FF, which incorporates an $\ell_{1,2}$-norm regularizer and the Mahalanobis distance. To evaluate the different attributes fairly without neglecting any trivial or small-scale but important attributes, and to attenuate the interference caused by correlation between features, a GAN incorporating the Mahalanobis distance has been adopted. Subsequently, the $\ell_{1,2}$-norm regularization term has been embedded into the loss function, which ensures the sparsity of the data, greatly facilitates feature filtering, and reduces the risk of overfitting. Finally, experiments on several public datasets have shown the effectiveness and general applicability of the proposed method. The experimental results indicate that the proposed IGAN-FF strategy is an effective way to handle imbalanced data. It also has good practical applicability because 1) the datasets adopted in the experiments cover various fields, such as traffic systems, judicial systems, and mail detection, and 2) datasets with different scales and imbalance ratios have been employed in the comparison experiments with other methods. In future work, we may try to develop a new deep learning method for data imputation without sacrificing much computational cost. Besides, semi-supervised classification for imbalanced and incomplete data can also be a direction for our future work.
In future work, we also plan to extend the obtained results to datasets whose minority class is too small to exhibit a statistical distribution.
$$\text{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN}$$

Figure 2. Comparison of classification robustness of different methods (a-p).

Figure 3. (a) Average and (b) ranking of different algorithms for each evaluation metric on 16 datasets.

Table 1. Information of the datasets

Table 2. Confusion matrix

Table 3. Results of F-Score for IGAN-FF on 16 imbalanced datasets

Table 4. Results of G-Means for IGAN-FF on 16 imbalanced datasets

Table 6. Mann-Whitney Wilcoxon test results of F-Score, G-Means and AUC between IGAN-FF and each of the compared methods