doi: 10.17586/2226-1494-2020-20-4-532-538
MODERN APPROACHES TO MULTICLASS INTENT CLASSIFICATION BASED ON PRE-TRAINED TRANSFORMERS
Article in English
For citation:
Solomin A.A., Ivanova Yu.A. Modern approaches to multiclass intent classification based on pre-trained transformers. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2020, vol. 20, no. 4, pp. 532–538 (in English). doi: 10.17586/2226-1494-2020-20-4-532-538
Abstract
Subject of Research. The paper considers modern approaches to the multiclass intent classification problem. A user intent is the purpose behind an incoming user request when interacting with voice assistants and chatbots, and the classification algorithm determines which class a request belongs to. Modern techniques such as transfer learning and transformers can significantly improve multiclass classification results. Method. The study uses a comparative model analysis technique. Each model is integrated into a common pipeline for data preparation, data cleaning, and model training, while accounting for its specific requirements. The following models, applied in real projects, have been selected for comparison: Logistic Regression + TF-IDF, Logistic Regression + FastText, LSTM + FastText, Conv1D + FastText, BERT, and XLM. The sequence of models corresponds to their historical origin, but in practice these models are used regardless of their creation date, depending on their effectiveness on the problem being solved. Main Results. The effectiveness of the multiclass classification models is studied on real data, and comparison results for modern practical approaches are reported. In particular, XLM confirms the superiority of transformers over other approaches. A hypothesis is offered as to why transformers show such a gap. The advantages and disadvantages of the modern approaches are described. Practical Relevance. The results of this study can be used in projects that require automatic intent classification, either as part of a complex system (voice assistant, chatbot, or other system) or as an independent component. The pipeline designed during the study can be applied to compare and select the most effective model for specific data sets, both in research and in production.
Keywords: natural language processing, text classification, transfer learning, transformers
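To illustrate the kind of comparison pipeline the abstract describes, the sketch below implements the simplest of the compared models, Logistic Regression + TF-IDF, behind a shared fit/predict interface. This is not the authors' code: the toy intent dataset and class names are invented for demonstration, and the other models (FastText embeddings, LSTM, Conv1D, BERT, XLM) would be swapped in behind the same interface.

```python
# Illustrative sketch (assumed, not the authors' implementation):
# a TF-IDF + Logistic Regression baseline for multiclass intent classification.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy intent dataset: each utterance is labelled with an intent class.
utterances = [
    "what is the weather today", "will it rain tomorrow",
    "play some jazz music", "turn on my playlist",
    "set an alarm for seven", "wake me up at six",
]
intents = ["weather", "weather", "music", "music", "alarm", "alarm"]

# A common Pipeline keeps preprocessing and the classifier behind one
# fit/predict interface, so alternative models can be compared uniformly.
clf = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("logreg", LogisticRegression(max_iter=1000)),
])
clf.fit(utterances, intents)

print(clf.predict(["play music please"])[0])
```

In a real comparison, each candidate model would be trained and evaluated on the same cleaned data split, and the most effective one selected per data set.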
Acknowledgements. The reported study was funded by the RFBR according to the research project No.18-08-00977 А. The work was partially supported by the Innovation Promotion Fund under the “UMNIK” program.
References
1. Ruder S., Peters M.E., Swayamdipta S., Wolf T. Transfer learning in natural language processing. Proc. of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2019), 2019, pp. 15–18. doi: 10.18653/v1/N19-5004
2. Devlin J., Chang M.-W., Lee K., Toutanova K. Bert: Pre-training of deep bidirectional transformers for language understanding. Proc. Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2019), 2019, pp. 4171–4186. doi: 10.18653/v1/N19-1423
3. Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A.N., Kaiser Ł., Polosukhin I. Attention is all you need. Proc. 31st Annual Conference on Neural Information Processing Systems (NIPS 2017), 2017, pp. 5999–6009.
4. Wang A., Singh A., Michael J., Hill F., Levy O., Bowman S.R. GLUE: A multi-task benchmark and analysis platform for natural language understanding. Proc. 7th International Conference on Learning Representations (ICLR 2019), 2019.
5. Wang A., Pruksachatkun Y., Nangia N., Singh A., Michael J., Hill F., Levy O., Bowman S.R. SuperGLUE: A stickier benchmark for general-purpose language understanding systems. Advances in Neural Information Processing Systems, 2019, vol. 32.
6. Chollet F. On the measure of intelligence. Available at: https://arxiv.org/pdf/1911.01547.pdf (accessed: 20.04.20)
7. Conneau A., Lample G. Cross-lingual language model pretraining. Advances in Neural Information Processing Systems, 2019, vol. 32.
8. Mikolov T., Sutskever I., Chen K., Corrado G., Dean J. Distributed representations of words and phrases and their compositionality. arXiv:1310.4546, 2013.
9. Kim Y. Convolutional neural networks for sentence classification. Proc. of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), 2014, pp.1746–1751. doi: 10.3115/v1/D14-1181
10. Mikolov T., Karafiát M., Burget L., Cernocky J., Khudanpur S. Recurrent neural network based language model. Proc. 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010), 2010, pp. 1045–1048.
11. Vasilev I. Advanced Deep Learning with Python: Design and implement advanced next-generation AI solutions using TensorFlow and PyTorch. Packt Publishing Ltd, 2019, pp. 260–264.