DEEP LEARNING MODEL FOR BILINGUAL SENTIMENT 
CLASSIFICATION OF SHORT TEXTS

Abdullin  Yelaman  B., Ivanov Vladimir V

2017 , VOLUME 17, NUMBER 1 ( January–February )

ISSN 2226-1494 (print), ISSN 2500-0373 (online)

Publications

Editor-in-Chief

Nikiforov
Vladimir O.
D.Sc., Prof.

Partners

doi: 10.17586/2226-1494-2017-17-1-129-136

DEEP LEARNING MODEL FOR BILINGUAL SENTIMENT CLASSIFICATION OF SHORT TEXTS

Y. B. Abdullin, V. V. Ivanov

Read the full article

Article in English

For citation: Abdullin Y.B., Ivanov V.V. Deep learning model for bilingual sentiment classification of short texts. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2017, vol. 17, no. 1, pp. 129–136. doi: 10.17586/2226-1494-2017-17-1-129-136

Abstract

Sentiment analysis of short texts such as Twitter messages and comments in news portals is challenging due to the lack of contextual information. We propose a deep neural network model that uses bilingual word embeddings to effectively solve sentiment classification problem for a given pair of languages. We apply our approach to two corpora of two different language pairs: English-Russian and Russian-Kazakh. We show how to train a classifier in one language and predict in another. Our approach achieves 73% accuracy for English and 74% accuracy for Russian. For Kazakh sentiment analysis, we propose a baseline method, that achieves 60% accuracy; and a method to learn bilingual embeddings from a large unlabeled corpus using a bilingual word pairs.

Keywords: sentiment analysis, bilingual word embeddings, recurrent neural networks, deep learning, Kazakh language

Acknowledgements. This work is supported by the Russian Science Foundation (project 15-11-10019 ”Text mining models and methods for analysis of the needs, preferences and consumer behaviour”. The authors thank the Everware team (https://github.com/orgs/everware/people) for access to their platform. The authors would like to thank Yerlan Seitkazinov, Zarina Sadykova and Aliya Sitdikova for manual annotation of the Kazakh sentiment corpus.

References

1. Jansen B.J., Zhang M., Sobel K., Chowdury A. Twitter power: Tweets as electronic word of mouth. Journal of the American Society for Information Science and Technology, 2009. vol. 60, no. 11, pp. 2169–2188. doi: 10.1002/asi.21149

2. Chew C., Eysenbach G. Pandemics in the age of twitter: content analysis of tweets during the 2009 H1N1 outbreak. PloS One, 2010, vol. 5, no. 11, art. e14118. doi: 10.1371/journal.pone.0014118

3. Paul M.J., Dredze M. You are what you tweet: analyzing twitter for public health. ICWSM, 2011, vol. 20, pp. 265–272.

4. Bengio Y., Schwenk H., Senecal J.-S., Morin F., Gauvain J.-L. Neural probabilistic language models. Innovations in Machine Learning, 2006, vol. 194, pp. 137–186. doi: 10.1007/3-540-33486-6_6

5. Collobert R., Weston J. A unified architecture for natural language processing: deep neural networks with multitask learning. Proc. 25^th Int. Conf. on Machine Learning, 2008, pp. 160–167. doi: 10.1145/1390156.1390177

6. Mikolov T., Chen K., Corrado G., Dean J. Efficient estimation of word representations in vector space. Proceedings of Workshop at ICLR, 2013.

7. Pennington J., Socher R., Manning C.D. Glove: global vectors for word representation. Proc. Conf. on Empirical Methods in Natural Language Processing EMNLP, 2014, vol. 14, pp. 1532–1543. doi: 10.3115/v1/d14-1162

8. Zou W.Y., Socher R., Cer D.M., Manning C.D. Bilingual word embeddings for phrase-based machine translation. EMNLP, 2013, pp. 1393–1398.

9. Manning C.D., Raghavan P., Schütze H. et al. Introduction to Information Retrieval. Cambridge University Press, 2008, vol. 1, no. 1.

10. dos Santos C.N., Gatti M. Deep convolutional neural networks for sentiment analysis of short texts. Proc. 25^th Int. Conf. on Computational Linguistics. Dublin, Ireland, 2014, pp. 69–78.

11. Vulic I., Moens M.-F. Bilingual word embeddings from non- parallel document-aligned data applied to bilingual lexicon induction. Proc. 53^rd Annual Meeting of the Association for Computational Linguistics AC, 2015. doi: 10.3115/v1/p15-2118

12. Lu A., Wang W., Bansal M., Gimpel K., Livescu K. Deep multilingual correlation for improved word embeddings. Proc. Annual Conference of the North American Chapter of the ACL, NAACL. Denver, Colorado, 2015, pp. 250–256. doi: 10.3115/v1/n15-1028

13. Mohammad S.M., Kiritchenko S., Zhu X. Nrc-canada: Building the state-of-the-art in sentiment analysis of tweets. arXiv preprint, 2013, arXiv:1308.6242.

14. Mikolov T., Le Q.V., Sutskever I. Exploiting similarities among languages for machine translation. arXiv preprint, 2013. arXiv:1309.4168.

15. Hochreiter S., Schmidhuber J. Long short-term memory. Neural Computation, 1997, vol. 9, no. 8, pp. 1735–1780. doi: 10.1162/neco.1997.9.8.1735

16. Graves A. Supervised Sequence Labelling with Recurrent Neural Networks. Springer, 2012, 146 p.doi: 10.1007/978-3-642-24797-2

17. Olah C. Understanding LSTM networks. 2015. Available at: http://colah.github.io/posts/2015-08-Understanding-LSTMs (accessed: 30.11.16).

18. Cho K., van Merriënboer B., Bahdanau D., Bengio Y. On the properties of neural machine translation: encoder-decoder approaches. Proc. Workshop on Syntax Semantics and Structure in Statistical Translation, 2014.

19. Tieleman T., Hinton G. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning, 2012, vol. 4, p. 2.

20. Dauphin Y.N., de Vries H., Chung J., Bengio Y. Rmsprop and equilibrated adaptive learning rates for non-convex optimization. arXiv preprint, 2015, arXiv:1502.04390.

21. Srivastava N., Hinton G., Krizhevsky A., Sutskever I., Salakhutdinov R. Dropout: A simple way to prevent neural networks from overfittng. The Journal of Machine Learning Research, 2014, vol. 15, no. 1, pp. 1929–1958.

22. Go A., Bhayani R., Huang L. Twitter sentiment classification using distant supervision. Technical ReportCS224N. Stanford, 2009, vol. 1, p. 12.

23. Rubtsova Y.V., Zagorulko Y.A. An approach to construction and analysis of a corpus of short Russian texts intended to train a sentiment classifier. The Bulletin of NCC, 2014, vol. 37, pp. 107–116.

24. Google. Tool for computing continuous distributed representations of words. Available at: https://code.google.com/p/word2vec (accessed: 30.11.16).

25. Makhambetov O., Makazhanov A., Yessenbayev Z., Matkarimov B., Sabyrgaliyev I., Sharafudinov A. Assembling the kazakh language corpus. Proc. Conference on Empirical Methods in Natural Language Processing. Seattle, Washington, 2013, pp. 1022–1031.

26. Van der Maaten L., Hinton G. Visualizing data using t-sne. Journal of Machine Learning Research, 2008, vol. 9, pp. 2579–2605.

27. Chollet F. Keras: Theano-based deep learning library. 2015. Available at: https://github.com/fchollet (accessed: 30.11.16).

28. Bergstra J., Breuleux O., Bastien F., Lamblin P., Pascanu R., Desjardins G., Turian J., Warde-Farley D., Bengio Y. Theano: a cpu and gpu math expression compiler. Proc. Python for Scientific Computing Conference. Austin, 2010, vol. 4, p. 3.

29. Machine Learning in Python. Available: http://scikit-learn.org (accessed: 30.11.16).

Chollet F. Keras: Deep learning library for Theano and tensor ow. Available: http://keras.io (accessed: 30.11.16).

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License