doi: 10.17586/2226-1494-2025-25-5-999-1001
Probabilistic matrix clustering with feature priors for unbiased control selection
For citation:
Usoltsev D.A. Probabilistic matrix clustering with feature priors for unbiased control selection. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2025, vol. 25, no. 5, pp. 999–1001 (in Russian). doi: 10.17586/2226-1494-2025-25-5-999-1001
Abstract
We propose a probabilistic matrix-clustering method that leverages a prior distribution of features and dimensionality reduction (Singular Value Decomposition, SVD). The approach identifies, within a large control pool, a cluster statistically comparable to the test cohort, thereby reducing systematic bias in downstream comparative analyses. We show that the method correctly selects control groups in scenarios where standard nearest-neighbor matching produces false positives. The method has been used to construct control groups in studies based on the Russian Biobank at the Almazov National Medical Research Centre (Ministry of Health of the Russian Federation).
Keywords: matrix clustering, SVD, prior (feature) distribution, Mahalanobis distance, χ² test
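
The abstract names the ingredients (SVD-based dimensionality reduction, Mahalanobis distance, a χ² test) but does not give the algorithm itself. The following is only a minimal sketch of how those ingredients can be combined to pick controls from a pool; it is a simplified stand-in, not the author's probabilistic clustering method, and it does not model feature priors. The function name `select_controls`, the choice of two SVD components, the 95% coverage level, and the synthetic data are all assumptions made for illustration. With two components, the χ² quantile has the closed form -2 ln(1 - p), which avoids a dependency beyond NumPy.

```python
import numpy as np


def select_controls(cases, pool, coverage=0.95):
    """Hypothetical helper: keep pool rows close to the case cohort.

    Projects both matrices onto the top 2 right singular vectors of the
    standardized combined data, then keeps pool rows whose squared
    Mahalanobis distance to the case cohort is below the chi-square
    quantile with 2 degrees of freedom.
    """
    combined = np.vstack([cases, pool])
    mean = combined.mean(axis=0)
    std = combined.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features

    z_cases = (cases - mean) / std
    z_pool = (pool - mean) / std

    # Rows of vt are the right singular vectors (feature-space directions).
    _, _, vt = np.linalg.svd(np.vstack([z_cases, z_pool]), full_matrices=False)
    basis = vt[:2].T  # assumption: two components are enough for the sketch

    p_cases = z_cases @ basis
    p_pool = z_pool @ basis

    # Center and covariance of the case cohort in the reduced space.
    center = p_cases.mean(axis=0)
    inv_cov = np.linalg.pinv(np.cov(p_cases, rowvar=False))

    # Squared Mahalanobis distance of every pool row to the case cohort.
    diff = p_pool - center
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

    # Chi-square quantile for 2 degrees of freedom (closed form).
    threshold = -2.0 * np.log(1.0 - coverage)
    return np.flatnonzero(d2 <= threshold), d2


# Synthetic illustration (made-up data, not from the article):
# the first 300 pool rows resemble the cases, the last 300 are shifted.
rng = np.random.default_rng(0)
cases = rng.normal(0.0, 1.0, size=(200, 5))
pool = np.vstack([
    rng.normal(0.0, 1.0, size=(300, 5)),
    rng.normal(4.0, 1.0, size=(300, 5)),
])
selected, d2 = select_controls(cases, pool)
```

In this toy setup the shifted half of the pool lies far from the case cohort in the reduced space, so the threshold is expected to exclude it. Real cohorts would need a considered choice of the number of components and of the coverage level.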

