Menu
Publications
2025
2024
2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002
2001
Editor-in-Chief

Nikiforov
Vladimir O.
D.Sc., Prof.
Partners
doi: 10.17586/2226-1494-2024-24-4-654-660
Multilevel splitting for rare events estimation in permutation tests
Read the full article

Article in Russian
For citation:
Abstract
For citation:
Sukhov V.D., Korotkevich G.V., Sergushichev A.A. Multilevel splitting for rare events estimation in permutation tests. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2024, vol. 24, no. 4, pp. 654–660 (in Russian). doi: 10.17586/2226-1494-2024-24-4-654-660
Abstract
Permutation tests are widely employed in statistical analysis, especially when the assumptions of parametric tests are violated, or the data distribution is unknown. However, classical permutation tests encounter challenges when attempting to estimate the probabilities of rare events with high relative accuracy, leading to difficulties in applying corrections for multiple hypothesis testing. In this study, we propose an original method for estimating arbitrarily small P-values in permutation tests, which is based on multilevel splitting for Monte Carlo method. The proposed method involves splitting the original permutation space into non-overlapping levels based on the statistic values. This approach allows the problem of estimating the original probability of a rare event to be reduced to estimating ordinary conditional probabilities for each level. Utilizing such an approach enables efficient estimation of the desired P-values while maintaining a balance between computation time and the level of relative error. The efficacy of the method is demonstrated in its application to the task of estimating arbitrary P-values in the two-sample Kolmogorov-Smirnov test. Comparing the method results with true P-values has shown practical convergence of the method. Moreover, examples of the superiority of the proposed method over alternative asymptotic approaches have been provided. Thus, the proposed method shows significant potential for application across a broad spectrum of scientific fields, such as systems biology, immunology, and others. Furthermore, the method can be adapted for use in various statistical analysis scenarios that require handling probabilities of rare events in permutation tests.
Keywords: statistical hypothesis test, P-value, Monte Carlo method, permutation test, rare events
References
References
- Good P. Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses. Springer Science & Business Media, 2013.
- Pesarin F., Salmaso L. Permutation Tests for Complex Data: Theory, Applications and Software. John Wiley & Sons, 2010, 448 p.
- Hammersley J. Monte Carlo Methods. Springer Science & Business Media, 2013, 178 p.
- Kalos M.H., Whitlock P.A. Monte Carlo Methods. John Wiley & Sons, 2009, 215 p.
- Trendelkamp-Schroer B., Noé F. Efficient estimation of rare-event kinetics. Physical Review X, 2016, vol. 6, no. 1, pp. 011009. https://doi.org/10.1103/physrevx.6.011009
- Lestang T., Ragone F., Bréhier C.-E., Herbert C., Bouchet F. Computing return times or return periods with rare event algorithms. Journal of Statistical Mechanics: Theory and Experiment, 2018, vol. 2018, no. 4, pp. 043213. https://doi.org/10.1088/1742-5468/aab856
- Caron V., Guyader A., Zuniga M.M., Tuffin B. Some recent results in rare event estimation. ESAIM: Proceedings, 2014, vol. 44, pp. 239–259. https://doi.org/10.1051/proc/201444015
- L'Ecuyer P., Demers V., Tuffin B. Splitting for rare-event simulation. Proc. of the 2006 Winter Simulation Conference, 2006, pp. 137–148. https://doi.org/10.1109/wsc.2006.323046
- Glasserman P., Heidelberger P., Shahabuddin P., Zajic T. Multilevel splitting for estimating rare event probabilities. Operations Research, 1999, vol. 47, no. 4, pp. 585–600. https://doi.org/10.1287/opre.47.4.585
- Botev Z.I., Kroese D.P. An efficient algorithm for rare-event probability estimation, combinatorial optimization, and counting. Methodology and Computing in Applied Probability, 2008, vol. 10, no. 4, pp. 471–505. https://doi.org/10.1007/s11009-008-9073-7
- Metropolis N., Rosenbluth A.W., Rosenbluth M.N., Teller A.H., Teller E. Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 1953, vol. 21, no. 6, pp. 1087–1092. https://doi.org/10.1063/1.1699114
- Hastings W.K. Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 1970, vol. 57, no. 1, pp. 97–109. https://doi.org/10.2307/2334940
- Kolmogorov A. Sulla determinazione empirica di una legge di distribuzione. Giornale dell’Istituto Italiano degli Attuari, 1933, vol. 4, pp. 83–91.
- Smirnoff N. Sur les Écarts de la courbe de distribution empirique. Matematicheskii Sbornik, 1939, vol. 48, no. 1, pp. 3–26.
- Virtanen P., Gommers R., Oliphant T.E., Haberland M., Reddy T., Cournapeau D., Burovski E., Peterson P., Weckesser W., Bright J. et. al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nature Methods, 2020, vol. 17, no. 3, pp. 261–272. https://doi.org/10.1038/s41592-019-0686-2