doi: 10.17586/2226-1494-2017-17-3-490-497
TREE SIMILARITY ESTIMATION BY CALCULATION OF pq-GRAM DISTANCE
For citation: Andreeva A.G., Markina T.A. Tree similarity estimation by calculation of pq-gram distance. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2017, vol. 17, no. 3, pp. 490–497 (in Russian). doi: 10.17586/2226-1494-2017-17-3-490-497
Abstract
The paper presents an algorithm for estimating the similarity of hierarchical data based on calculation of the pq-gram distance. The dependence of the algorithm's sensitivity on the selected parameters p and q is analyzed. We show how much the result of the algorithm changes when comparing two trees that differ in one random node: a node of the source tree is deleted or renamed, or an extra node is added. It is demonstrated that such analysis makes it possible to select the parameters p and q according to the problem being solved. The need for a preliminary evaluation of the trees is substantiated: an approximate analysis of the initial level of node differences in the selected pq-grams of the compared trees. The basic terms and definitions relating to tree-based data structuring algorithms are described. Examples of the practical application of the algorithm and details of its implementation for a real problem are shown.
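As an illustration of the quantity the abstract refers to, the following is a minimal sketch of a pq-gram distance computation. It assumes the commonly used definition: each tree is turned into a bag of pq-grams (a stem of p ancestor labels followed by a base of q consecutive child labels, padded with a null label), and the distance is 1 - 2|P1 ∩ P2| / (|P1| + |P2|) over those bags. The names `Node`, `pq_profile`, and `pq_gram_distance` and the defaults p = 2, q = 3 are hypothetical choices for this sketch and are not taken from the article, whose implementation may differ.

```python
from collections import Counter

NULL = "*"  # padding label for missing ancestors and children (assumed convention)


class Node:
    """Hypothetical ordered, labeled tree node used only for this sketch."""

    def __init__(self, label, children=None):
        self.label = label
        self.children = list(children or [])


def pq_profile(root, p=2, q=3):
    """Return the bag (multiset) of pq-grams of the tree as a Counter of tuples."""
    profile = Counter()

    def visit(node, ancestors):
        # Stem: the last p labels on the path from the (padded) root to this node.
        stem = ancestors[1:] + (node.label,)
        base = (NULL,) * q
        if not node.children:
            # A leaf contributes a single pq-gram with an all-null base.
            profile[stem + base] += 1
            return
        for child in node.children:
            # Slide a window of width q over the children, left to right.
            base = base[1:] + (child.label,)
            profile[stem + base] += 1
            visit(child, stem)
        for _ in range(q - 1):
            # Pad on the right so every child appears in q windows.
            base = base[1:] + (NULL,)
            profile[stem + base] += 1

    visit(root, (NULL,) * p)
    return profile


def pq_gram_distance(t1, t2, p=2, q=3):
    """Distance in [0, 1]: 0 for identical profiles, 1 for disjoint ones."""
    a, b = pq_profile(t1, p, q), pq_profile(t2, p, q)
    shared = sum((a & b).values())          # bag intersection size
    total = sum(a.values()) + sum(b.values())  # bag union size
    return 1 - 2 * shared / total


# Usage: a source tree and a copy in which one leaf is renamed.
source = Node("a", [Node("b"), Node("c")])
renamed = Node("a", [Node("b"), Node("d")])
print(pq_gram_distance(source, source))   # identical trees
print(pq_gram_distance(source, renamed))  # one renamed node
```

Rerunning the last comparison with different values of p and q gives a rough sense of the sensitivity to a single node change that the abstract discusses.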