doi: 10.17586/2226-1494-2020-20-5-714-721
SEARCH OF CLONES IN PROGRAM CODE
Article in Russian
For citation:
Osadchaya A.O., Isaev I.V. Search of clones in program code. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2020, vol. 20, no. 5, pp. 714–721 (in Russian). doi: 10.17586/2226-1494-2020-20-5-714-721
Abstract
Subject of Research. The paper surveys existing approaches and methods for detecting clones in program code. Based on this study, a method is developed that implements a semantic approach to finding duplicated fragments and targets all clone types.
Method. The method is based on analysis of the program dependency graph built from the source code files. To detect duplicated fragments, a program dependency graph is generated for each source code file, and its nodes are hashed according to their content properties, so that nodes with equal hashes fall into the same equivalence class. For each pair of nodes taken from the same equivalence class, two isomorphic subgraphs that contain those nodes are identified. If a clone pair is contained in another pair, it is removed from the set of found pairs of duplicated fragments. Pairs of duplicated fragments that share the same isomorphic subgraphs are then combined into sets of clones; that is, the clone pairs are expanded.
Main Results. To evaluate the effectiveness of the method, files were compared to determine which clone types a system using the method detects, and the system was tested on components of a real system. The results produced by the developed system were compared with the actual ones.
Practical Relevance. The proposed algorithm makes it possible to automate the analysis of source files. Clone detection is a priority area of code analysis, since detecting duplicated fragments helps to combat unscrupulous copying of program code.
Keywords: clones in program code, code duplication, duplicated fragments, code clone types, refactoring, code analysis, code reuse
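The steps named in the Method paragraph can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy graph `PDG`, the function names, and the node content keys are all hypothetical, and the subgraph growth is a greedy lockstep traversal along successor edges that only approximates a full subgraph isomorphism check.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy dependency graph: node id -> (content key, successor ids).
# It holds three copies of the same "read -> add -> print" fragment.
PDG = {
    "a1": (("call", "read"), ["a2"]), "a2": (("binop", "+"), ["a3"]),
    "a3": (("call", "print"), []),
    "b1": (("call", "read"), ["b2"]), "b2": (("binop", "+"), ["b3"]),
    "b3": (("call", "print"), []),
    "c1": (("call", "read"), ["c2"]), "c2": (("binop", "+"), ["c3"]),
    "c3": (("call", "print"), []),
}

def equivalence_classes(pdg):
    """Group node ids by a hash of their content properties."""
    classes = defaultdict(list)
    for node, (content, _) in pdg.items():
        classes[hash(content)].append(node)
    return [nodes for nodes in classes.values() if len(nodes) > 1]

def grow_pair(pdg, a, b):
    """Grow two matching subgraphs in lockstep, starting from nodes a and b."""
    mapping, used, stack = {a: b}, {b}, [(a, b)]
    while stack:
        x, y = stack.pop()
        for sx in pdg[x][1]:
            if sx in mapping or sx in used:
                continue
            for sy in pdg[y][1]:
                free = sy not in used and sy not in mapping
                if free and pdg[sx][0] == pdg[sy][0]:
                    mapping[sx] = sy
                    used.add(sy)
                    stack.append((sx, sy))
                    break
    return frozenset(mapping), frozenset(mapping.values())

def clone_pairs(pdg):
    """Collect fragment pairs grown from every node pair of each class."""
    pairs = set()
    for nodes in equivalence_classes(pdg):
        for a, b in combinations(sorted(nodes), 2):
            left, right = grow_pair(pdg, a, b)
            if len(left) > 1:  # ignore single-node matches
                pairs.add(tuple(sorted((left, right), key=min)))
    return pairs

def subsumed(p, q):
    """True if clone pair p is contained in a different pair q."""
    (l1, r1), (l2, r2) = p, q
    return p != q and ((l1 <= l2 and r1 <= r2) or (l1 <= r2 and r1 <= l2))

def maximal_pairs(pairs):
    """Drop every pair that is included in another pair."""
    return {p for p in pairs if not any(subsumed(p, q) for q in pairs)}

def clone_sets(pairs):
    """Expand pairs into clone sets by merging pairs that share a fragment."""
    groups = []
    for left, right in pairs:
        merged, rest = {left, right}, []
        for group in groups:
            if group & merged:
                merged |= group
            else:
                rest.append(group)
        groups = rest + [merged]
    return groups

if __name__ == "__main__":
    for group in clone_sets(maximal_pairs(clone_pairs(PDG))):
        print(sorted(sorted(fragment) for fragment in group))
```

On this toy graph, the two-node pairs that start at the addition nodes are discarded because they are contained in the three-node pairs, and the remaining pairs are merged into a single clone set of three fragments. A real detector would also need to follow data and control dependencies in both directions and to verify edges between already matched nodes.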