doi: 10.17586/2226-1494-2018-18-6-1084-1090


REINFORCED SEQ2SEQ ADVERSARIAL AUTOENCODER FOR DE NOVO MOLECULAR DESIGN

E. O. Putin


Article in Russian

For citation:

Putin E.O. Reinforced seq2seq adversarial autoencoder for de novo molecular design. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2018, vol. 18, no. 6, pp. 1084–1090 (in Russian). doi: 10.17586/2226-1494-2018-18-6-1084-1090



Abstract

Subject of Research. Modern deep learning models for generating target small organic molecules are studied. The experiments were carried out on two datasets: 250,000 drug-like molecular compounds from the ZINC database and 23,000 kinase molecular structures collected manually from the open-access ChEMBL database. Method. We propose a deep neural network model based on the concepts of adversarial learning and reinforcement learning. The model controls the molecular validity of the generated structures by means of a recurrent seq2seq autoencoder and an external generator. The external generator gives the model flexibility in the choice of architecture and also allows conditions to be supplied as input to the generation. Main Results. Comparative experiments show that the proposed model outperforms its closest competitors, in both pre-training and post-training experiments, at generating valid and unique molecular structures. Additional chemical analysis of the generated structures shows that the proposed model produces structures of higher quality than the competing models. Practical Relevance. The proposed model can be used by medicinal chemists as an intelligent assistant in the development of new drugs.
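The abstract compares models by how many valid and unique structures they generate. The following is a minimal Python sketch of how such fractions could be computed over a batch of generated SMILES strings; it is an illustration, not the evaluation code used in the article. The names `toy_is_valid` and `validity_and_uniqueness` are hypothetical, the validity check is a deliberately crude stand-in for a real cheminformatics parser (it only checks branch parentheses and single-digit ring-closure labels, and ignores bracket atoms and `%nn` closures), and counting uniqueness among valid strings only is an assumption about the metric's definition.

```python
from typing import Callable, Iterable, Tuple


def toy_is_valid(smiles: str) -> bool:
    """Hypothetical stand-in for a real SMILES parser.

    Checks only that branch parentheses are balanced and that every
    single-digit ring-closure label appears an even number of times.
    """
    if not smiles:
        return False
    depth = 0
    open_rings = set()
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # closing a branch that was never opened
                return False
        elif ch.isdigit():
            # First occurrence opens a ring, second occurrence closes it.
            open_rings ^= {ch}
    return depth == 0 and not open_rings


def validity_and_uniqueness(
    samples: Iterable[str],
    is_valid: Callable[[str], bool] = toy_is_valid,
) -> Tuple[float, float]:
    """Return (fraction of valid samples, fraction of distinct strings among valid ones)."""
    samples = list(samples)
    valid = [s for s in samples if is_valid(s)]
    validity = len(valid) / len(samples) if samples else 0.0
    # Assumption: uniqueness is measured over valid strings only and by exact
    # string match; a real pipeline would compare canonical forms.
    uniqueness = len(set(valid)) / len(valid) if valid else 0.0
    return validity, uniqueness


generated = ["CCO", "CCO", "c1ccccc1", "C1CC", "CC(C"]
validity, uniqueness = validity_and_uniqueness(generated)
```

In this toy batch three of the five strings pass the check and two of those three are distinct. In practice the `is_valid` argument would be replaced by a check backed by a cheminformatics toolkit, since the toy check accepts many strings that are not chemically meaningful.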



Acknowledgements. This work was financially supported by the Government of the Russian Federation, Grant 074-U01, and the Russian Foundation for Basic Research, Grant 16-37-60115 mol_a_dk.



This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Copyright 2001-2024 ©
Scientific and Technical Journal
of Information Technologies, Mechanics and Optics.
All rights reserved.
