Class imbalance is a persistent obstacle for machine learning-driven intrusion detection, as rare but high-impact cyberattacks occur far less frequently than benign traffic in training data. In many real-world cybersecurity datasets, this imbalance becomes extreme, with certain attack types containing only a handful of samples, effectively placing the problem in a few-shot learning regime. This paper presents a controlled benchmarking study of Generative Adversarial Network (GAN) objectives for synthesizing minority-class cyberattack data. Using the UWF-ZeekData22 network traffic dataset, each MITRE ATT&CK tactic is framed as a separate binary detection task, and tactic-specific GANs are trained solely on minority samples to generate synthetic attack records. Four widely used GAN variants, namely Vanilla GAN, Conditional GAN (cGAN), Wasserstein GAN (WGAN), and Wasserstein GAN with Gradient Penalty (WGAN-GP), are compared under unified training steps and fixed augmentation conditions. The utility of the generated data is assessed by evaluating downstream detection performance using five traditional classifiers: Logistic Regression, Support Vector Machine, k-Nearest Neighbors, Decision Tree, and Random Forest. The results indicate that GAN augmentation generally strengthens minority-class detection across tactics and models, reducing false negatives and improving recall consistency, while not systematically harming majority-class performance. However, the effectiveness of each GAN objective varies significantly with data sparsity. Specifically, simpler adversarial objectives often outperform more complex architectures by preserving discriminative feature structure, while heavily regularized models may overly smooth minority-class distributions and reduce separability. Wasserstein-based objectives provide improved training stability, but additional regularization does not consistently translate to better detection performance.
Overall, the results demonstrate that in extreme-imbalance settings, GAN effectiveness is governed more by data sparsity and structure preservation than by architectural complexity. These findings establish class-specific generative augmentation as a practical strategy for intrusion detection and provide empirical guidance for selecting appropriate GAN objectives for tabular cybersecurity data under highly imbalanced conditions.
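The per-tactic framing and augmentation step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the toy feature rows, and the tactic labels are all hypothetical, and `jitter_generator` is only a stand-in for a trained tactic-specific GAN (it adds Gaussian noise to real minority rows and involves no adversarial training). The sketch assumes synthetic rows are added to the training split only.

```python
import random
from typing import Callable, List, Tuple

Row = List[float]


def frame_binary_task(features: List[Row], tactics: List[str],
                      target_tactic: str) -> Tuple[List[Row], List[int]]:
    """Label rows of `target_tactic` as 1 (attack) and all other rows as 0."""
    labels = [1 if t == target_tactic else 0 for t in tactics]
    return [list(row) for row in features], labels


def jitter_generator(minority: List[Row], n: int,
                     scale: float = 0.05, seed: int = 0) -> List[Row]:
    """Hypothetical placeholder for a trained GAN generator (NOT a GAN).

    Resamples real minority rows and perturbs each value with Gaussian noise.
    """
    if not minority:
        raise ValueError("no minority samples to learn from")
    rng = random.Random(seed)
    return [[v + rng.gauss(0.0, scale) for v in rng.choice(minority)]
            for _ in range(n)]


def augment_minority(X_train: List[Row], y_train: List[int],
                     generate: Callable[[List[Row], int], List[Row]],
                     n_synthetic: int) -> Tuple[List[Row], List[int]]:
    """Append `n_synthetic` generated minority rows (label 1) to a training split."""
    minority = [x for x, y in zip(X_train, y_train) if y == 1]
    synthetic = generate(minority, n_synthetic)
    return list(X_train) + synthetic, list(y_train) + [1] * len(synthetic)


def recall(y_true: List[int], y_pred: List[int]) -> float:
    """Minority-class recall: TP / (TP + FN); 0.0 if there are no positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy data: two numeric features per record, with illustrative tactic labels.
features = [[0.1, 0.2], [0.9, 0.8], [0.15, 0.25], [0.95, 0.85], [0.2, 0.1]]
tactics = ["none", "Discovery", "none", "Discovery", "Reconnaissance"]

X, y = frame_binary_task(features, tactics, "Discovery")
X_aug, y_aug = augment_minority(X, y, jitter_generator, n_synthetic=4)
print(len(X), sum(y), len(X_aug), sum(y_aug))
```

In a fuller pipeline, `jitter_generator` would be replaced by sampling from a generator trained on the minority rows of one tactic, and `recall` would be computed on held-out real data for each classifier with and without augmentation.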
Files and links (1)
Class-Specific GAN Augmentation for Imbalanced Intrusion Detection: A Comparative Study Using the UWF-ZeekData22 Dataset
Published (Version of record), Open Access, CC BY 4.0
Details
Title
Class-Specific GAN Augmentation for Imbalanced Intrusion Detection
Publication Details
Future Internet, Vol. 18(4), p. 200
Resource Type
Journal article
Publisher
MDPI
Number of pages
37
Grant note
Askew Institute at the University of West Florida
This research was also partially supported by the Askew Institute at the University of West Florida.