The comparison of treatment and control groups is a cornerstone of biomedical, psychological, and public health research. Practitioners often rely on standard two-sample tests, such as Student’s t-test for location and the F-ratio test for scale, to evaluate the effects of interventions. When data adhere to the assumption of normality, these parametric tests are among the most powerful available. However, this assumption is frequently violated in practice, particularly with small sample sizes where distributional properties are difficult to verify. In such scenarios, nonparametric tests are the recommended alternative due to their robustness and minimal distributional requirements (Mukherjee and Marozzi, 2019; Kumar et al., 2024).
Among nonparametric procedures, the Wilcoxon-Mann-Whitney (WMW) U-test is widely used for comparing the locations of two independent samples (Lin et al., 2021), while the Ansari-Bradley test is a common choice for assessing differences in scale (Mukherjee and Marozzi, 2019; Murakami and Neuhäuser, 2025; Hussain, 2025). A significant methodological challenge arises when a treatment effect manifests not as a pure location or scale shift but as a combination of both, a scenario frequently encountered in practice (Murakami and Neuhäuser, 2025). For instance, in a clinical trial for chronic obstructive pulmonary disease, the therapeutic effect is characterized by simultaneous changes in both location and scale parameters (Neuhäuser, 2001). To address this, omnibus tests sensitive to any departure from the null hypothesis of identical distributions have been developed. A prominent distribution-free solution is the Lepage test (Lepage, 1971), which combines the squared standardized WMW U and Ansari-Bradley C statistics into a single quadratic form.
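As an illustration of the quadratic form just described, the following is a minimal Python sketch of the classical Lepage statistic, not the enhanced variants evaluated in this study. It assumes NumPy and SciPy are available; the function names `lepage_statistic` and `_standardized_square` are hypothetical. The null mean and variance of each rank statistic are computed from the general permutation moments of a linear rank statistic, and the p-value uses the large-sample chi-square approximation with two degrees of freedom, which may be inaccurate for small samples.

```python
import numpy as np
from scipy.stats import chi2, rankdata


def _standardized_square(scores, m):
    """Squared standardized linear rank statistic for the first m observations.

    Uses the permutation moments E[T] = m * mean(a) and
    Var[T] = m * n / (N * (N - 1)) * sum((a - mean(a))**2).
    """
    total = scores.size
    n = total - m
    t = scores[:m].sum()
    mean_t = m * scores.mean()
    var_t = m * n / (total * (total - 1)) * ((scores - scores.mean()) ** 2).sum()
    return (t - mean_t) ** 2 / var_t


def lepage_statistic(x, y):
    """Classical Lepage statistic with a chi-square(2) approximate p-value."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = x.size
    ranks = rankdata(np.concatenate([x, y]))  # midranks are used if ties occur
    total = ranks.size
    # Ansari-Bradley scores: large near the center of the pooled sample, small in the tails
    ab_scores = np.minimum(ranks, total + 1 - ranks)
    stat = _standardized_square(ranks, m) + _standardized_square(ab_scores, m)
    return stat, chi2.sf(stat, df=2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # arbitrary seed, for illustration only
    control = rng.lognormal(mean=0.0, sigma=0.5, size=20)
    treated = rng.lognormal(mean=0.4, sigma=0.9, size=20)
    print(lepage_statistic(control, treated))
```

Because both components are squared, the statistic does not depend on which sample is labelled first, and it grows with a departure in location, in scale, or in both.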
A critical, yet often overlooked, aspect of this problem is the prevalence of right-skewed data in many scientific fields. Outcome variables such as reaction times, biomarker concentrations, survival duration, and cost data frequently exhibit positive skewness (Fagerland, 2009). This distributional characteristic poses a particular challenge for conventional tests, whose power can be substantially diminished when the underlying populations are skewed and heteroscedastic. The standard Lepage test, while distribution-free, relies on the strong null hypothesis of identical population distributions (Lepage, 1971). This assumption is often untenable in the presence of heteroscedasticity, which itself may be an indicator of a meaningful treatment effect.
This limitation has motivated the development of tests for the weak null hypothesis, which concerns the equality of a relative effect size rather than the equality of the entire distribution (Fong and Huang, 2019). The WMW U-test is not directly applicable for testing this weak null, because the sampling distribution of its test statistic is no longer distribution-free under it (Chung and Romano, 2013, 2016). To address this, Fligner and Policello (1981) developed an alternative to the U-test that relies on fewer assumptions about the distributional forms of the populations. A parallel issue compromises the Ansari-Bradley test for scale differences: its validity depends on the assumption of equal medians, a condition that is often untenable and frequently incompatible with the location-scale alternatives under investigation. Modern robust methods (Kossler and Mukherjee, 2020) and permutation-based approaches (Chung and Romano, 2016) offer additional avenues for addressing heteroscedasticity and distributional asymmetry. However, a systematic evaluation of how these modern components perform when integrated into the Lepage framework, particularly for right-skewed data, remains lacking in the literature.
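To make the Fligner-Policello idea concrete, the sketch below computes the statistic in its commonly stated placement-based form, in which the variance is estimated from the data instead of being fixed by the strong null. This is an illustrative assumption about the form of the statistic, not the implementation used in this study, and it does not include the Fong-Huang refinement. The function name `fligner_policello` is hypothetical, the p-value uses the standard normal approximation, and the degenerate case of completely separated samples (zero denominator) is not handled.

```python
import numpy as np
from scipy.stats import norm


def fligner_policello(x, y):
    """Placement-based Fligner-Policello statistic with a two-sided normal p-value."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Placement of each x_i among the y's: count of y_j below x_i, ties counted as 1/2
    p = (y[None, :] < x[:, None]).sum(axis=1) + 0.5 * (y[None, :] == x[:, None]).sum(axis=1)
    # Placement of each y_j among the x's
    q = (x[None, :] < y[:, None]).sum(axis=1) + 0.5 * (x[None, :] == y[:, None]).sum(axis=1)
    v1 = ((p - p.mean()) ** 2).sum()
    v2 = ((q - q.mean()) ** 2).sum()
    # Data-driven scale: no equal-variance or identical-shape assumption is imposed
    denom = 2.0 * np.sqrt(v1 + v2 + p.mean() * q.mean())
    z = (q.sum() - p.sum()) / denom
    return z, 2.0 * norm.sf(abs(z))


if __name__ == "__main__":
    print(fligner_policello([1, 3, 5], [2, 4, 6]))
```

With this sign convention, positive values indicate that the second sample tends to be larger than the first.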
This study provides a comprehensive evaluation of enhanced Lepage-type test statistics that combine modern robust procedures with the classical Lepage framework. Rather than proposing fundamentally new methodology, our contribution lies in the systematic combination and evaluation of existing robust components for the specific challenge of right-skewed data. We integrate the Fligner-Policello test (Fligner and Policello, 1981) and its refined Fong-Huang variance estimator (Fong and Huang, 2019) for the location component, alongside a novel empirical variance estimator for the Ansari-Bradley statistic that relaxes the assumption of equal medians. By embedding these robustified components into the Lepage framework, we construct tests that are valid under weaker assumptions and exhibit enhanced sensitivity to the distributional shapes frequently encountered in practical applications.
The primary objectives of this study are threefold: first, to systematically evaluate the performance of various combinations of robust components within the Lepage framework; second, to provide practical guidance for test selection with right-skewed data across different sample size scenarios; and third, to demonstrate the practical utility of these methods through both extensive simulations and real biomedical applications. Our focus remains on right-skewed distributions commonly encountered in biomedical research, while acknowledging the need for further investigation with other distributional shapes.