ALGORITHM FOR SEMANTIC TEXT ANALYSIS BY MEANS OF BASIC SEMANTIC TEMPLATES WITH DELETION

A. V. Mochalova


Article in Russian


Abstract

Automatic text processing systems are becoming increasingly important as the volume of textual data keeps growing. One of the main problems in such systems is semantic analysis. The paper presents an algorithm for finding semantic dependencies by means of basic semantic templates with deletion. Using the Drools expert system (with the PHREAK algorithm for fast pattern matching), we have developed and implemented a semantic analyzer that constructs semantic dependencies between parts of a sentence. During semantic analysis, text segments are added to a priority queue according to the rules described in the semantic templates; then, at each iteration over the sentence being analyzed, the segment with the highest priority in the queue is deleted from the analyzed text. The priority in this queue is determined by two values: the priority of the semantic relationship group and the word position. The proposed algorithm is implemented in Java. We have prepared 2160 rules using the Drools expert system. The software implementation has shown that the proposed algorithm is applicable to automatic text processing systems. Testing showed that the proposed semantic analysis algorithm runs 6 to 8 times slower, on average, without the Drools expert system. The proposed semantic analyzer is used as a component module of an intelligent question-answering system.
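
The queue ordering described above can be illustrated with a minimal Java sketch. It is not the author's implementation: the class and field names (TemplateQueueSketch, Segment, groupPriority, wordPosition) are hypothetical, and the sketch assumes that a larger group priority value is served first and that ties are broken by the earlier word position, which the abstract does not specify. The Drools rules that would select the segments are omitted.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class TemplateQueueSketch {

    /** A text segment matched by a (hypothetical) semantic template. */
    static final class Segment {
        final String text;
        final int groupPriority; // assumption: larger value = served earlier
        final int wordPosition;  // index of the segment's first word in the sentence

        Segment(String text, int groupPriority, int wordPosition) {
            this.text = text;
            this.groupPriority = groupPriority;
            this.wordPosition = wordPosition;
        }
    }

    // Assumed ordering: higher group priority first, then earlier word position.
    static final Comparator<Segment> BY_PRIORITY =
            Comparator.comparingInt((Segment s) -> s.groupPriority)
                      .reversed()
                      .thenComparingInt(s -> s.wordPosition);

    /** Returns the segment texts in the order they would be deleted from the sentence. */
    static List<String> deletionOrder(List<Segment> matched) {
        PriorityQueue<Segment> queue = new PriorityQueue<>(BY_PRIORITY);
        queue.addAll(matched);
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            // One iteration: take the highest-priority segment and "delete" it.
            order.add(queue.poll().text);
        }
        return order;
    }

    public static void main(String[] args) {
        // Illustrative segments for the sentence "the old man reads a book".
        List<Segment> matched = new ArrayList<>();
        matched.add(new Segment("a book", 1, 4));
        matched.add(new Segment("old", 2, 1));
        matched.add(new Segment("the", 2, 0));
        System.out.println(deletionOrder(matched)); // prints [the, old, a book]
    }
}
```

In this sketch the two values from the abstract form a composite sort key, so segments from the same relationship group are processed in sentence order. A real analyzer would re-run the template rules after each deletion, since removing a segment can expose new matches.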


Keywords: semantic dependencies, semantic analyzer, semantic templates

Copyright 2001-2017 ©
Scientific and Technical Journal
of Information Technologies, Mechanics and Optics.
All rights reserved.
