doi: 10.17586/2226-1494-2020-20-6-888-892


INDEPENDENT COMPONENT ANALYSIS FOR INITIAL APPROXIMATION DETERMINATION IN IDENTIFICATION OF ACTIVE MODULES IN BIOLOGICAL GRAPHS

A. N. Gainullina, V. D. Sukhov, A. A. Shalyto, A. A. Sergushichev


Article in Russian

For citation:

Gainullina A.N., Sukhov V.D., Shalyto A.A., Sergushichev A.A. Independent component analysis for initial approximation determination in identification of active modules in biological graphs. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2020, vol. 20, no. 6, pp. 888-892 (in Russian). doi: 10.17586/2226-1494-2020-20-6-888-892



Abstract

Subject of Research. Identifying active modules in biological graphs, such as gene graphs, is an important approach to interpreting experimental biological data. One way to solve this problem is an algorithm for joint clustering in network and correlation spaces. The algorithm finds groups of genes that are both close to one another in the gene graph and highly pairwise correlated according to the matrix of gene expression values. The algorithm is iterative, and one of its key parameters is the initial approximation, which affects both the run time and the quality of the results. We consider the problem of choosing an initial approximation for this algorithm and propose a procedure based on independent component analysis (ICA).

Method. In the first step of the proposed procedure, independent component analysis is applied to the centered matrix of expression values. Then, for each component, the genes specific to that component at a given level of statistical significance are identified. The gene groups obtained for all independent components are used as the initial approximation.

Main Results. The ICA-based procedure reduces the number of gene groups in the initial approximation without loss of accuracy. This, in turn, reduces the running time of the clustering algorithm by an order of magnitude while maintaining the quality of the results.

Practical Relevance. Accelerating the algorithm for joint clustering in network and correlation spaces without loss of result quality makes it considerably more convenient and simplifies its application to the interpretation of transcriptome data in bioinformatics and computational biology.
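The gene-selection step of the procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the ICA decomposition has already been computed by some ICA library and is available as one list of per-gene scores per component. The significance criterion used here (a two-sided normal-tail p-value on standardized scores) and the names `center_rows`, `significant_genes`, `initial_groups`, and `alpha` are hypothetical choices for illustration; the abstract does not specify the exact test.

```python
from statistics import NormalDist, mean, pstdev


def center_rows(matrix):
    """Center an expression matrix by subtracting each gene's (row's) mean."""
    centered = []
    for row in matrix:
        row_mean = mean(row)
        centered.append([value - row_mean for value in row])
    return centered


def significant_genes(scores, genes, alpha=0.05):
    """Return genes whose score in one component is significant.

    Assumed criterion (not taken from the paper): standardize the scores
    and keep genes with a two-sided normal-tail p-value below alpha.
    """
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return []  # a constant component singles out no genes
    standard_normal = NormalDist()
    selected = []
    for gene, score in zip(genes, scores):
        z = (score - mu) / sigma
        p_value = 2 * (1 - standard_normal.cdf(abs(z)))
        if p_value < alpha:
            selected.append(gene)
    return selected


def initial_groups(component_scores, genes, alpha=0.05):
    """Build one gene group per component; empty groups are dropped."""
    groups = [significant_genes(s, genes, alpha) for s in component_scores]
    return [group for group in groups if group]
```

For example, calling `initial_groups(component_scores, genes)` with one score list per independent component returns the candidate gene groups that would seed the clustering algorithm. Lowering `alpha` gives smaller, more specific groups.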


Keywords: clustering, correlation, independent component analysis, graphs, gene expression

Acknowledgements. This work was supported by the Government of the Russian Federation, Investigation Research Grant 08-08.




This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License