Bioequivalence (BE) studies are clinical trials that are essential for demonstrating therapeutic equivalence between generic and reference medicinal products. However, small sample sizes and wide confidence intervals (CIs) limit their effectiveness. In this study, we present a hybrid data augmentation framework that combines real clinical pharmacokinetic (PK) data with synthetic PK data generated by Wasserstein Generative Adversarial Networks (WGANs). Three randomized, single-dose, 2 × 2 crossover BE datasets (lisinopril, amlodipine, and aceclofenac) were used to train WGAN models that generate virtual subject data directly from the clinical data. Hybrid datasets were generated by pooling real and synthetic data in fixed proportions while keeping the number of real subjects constant. The hybrid datasets were analysed with respect to baseline BE metrics, including the geometric mean ratio (GMR), the 90% CI, and within-subject variability, for the PK parameters AUC, Cmax, and the recently proposed average slope (AS). Hybrid datasets achieved a significant reduction in 90% CI widths, with an average reduction of up to 6.7% across all drugs and parameters. For example, for aceclofenac and Cmax, the hybrid 50–150 model reduced the CI width by approximately 59.8%, from 18.86% to 7.59%. These results suggest that WGAN-based hybrid datasets can improve the statistical robustness and reproducibility of BE assessments when used as supporting evidence. Whilst clinical data must remain the foundation for regulatory decisions, hybrid data may be useful for study design and minimization, sensitivity assessment, and uncertainty reduction, particularly where large-scale recruitment is not feasible or is ethically untenable.
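The relative reduction quoted for aceclofenac Cmax can be checked with a short calculation. The sketch below assumes that the reported reduction is the relative change in 90% CI width; the function name `relative_width_reduction` is hypothetical and is not taken from the study.

```python
def relative_width_reduction(width_real: float, width_hybrid: float) -> float:
    """Relative reduction (in %) of the 90% CI width, hybrid vs. real-only data.

    Assumption: the reduction is defined relative to the real-only width.
    """
    if width_real <= 0:
        raise ValueError("width_real must be positive")
    return (width_real - width_hybrid) / width_real * 100.0


# Aceclofenac, Cmax: CI width of 18.86% (real data) vs. 7.59% (hybrid 50-150).
reduction = relative_width_reduction(18.86, 7.59)
print(f"{reduction:.1f}%")
```

Rounded to one decimal place, this gives the approximate 59.8% stated above.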



