doi: 10.17586/2226-1494-2021-21-4-562-570

Identification of user accounts by image comparison: the pHash-based approach

V. D. Oliseenko, M. V. Abramov, A. L. Tulupyev

Article in Russian

For citation:
Oliseenko V.D., Abramov M.V., Tulupyev A.L. Identification of user accounts by image comparison: the pHash-based approach. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2021, vol. 21, no. 4, pp. 562–570 (in Russian). doi: 10.17586/2226-1494-2021-21-4-562-570


The study presents a new approach to identifying users across online social networks by matching accounts that belong to the same person. The approach relies on images extracted from users’ digital footprints. It compares not only the main images of a user’s profile but all elements of the graphic content published in the account. To assess the probability that two accounts from different online social networks belong to the same user, the images published in the two accounts are compared pairwise on the “all-to-all” principle. The labeled graphical content elements are compared using the well-known perceptual hash method pHash. A computational experiment was conducted to evaluate the proposed approach; the F1-score reached 0.886 for three matched images. The results show that pHash image comparison can be used for account identification both as a standalone approach and as a complement to existing methods for comparative analysis of accounts. Automating the proposed approach provides a tool for aggregating data and makes it possible to obtain more information about users and to assess their personality traits in greater depth. The results can be applied to forming a digital twin of the user for further description of his or her traits in tasks of protection against social engineering attacks, targeted advertising, assessment of creditworthiness, and other studies related to online social networks and social sciences.
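The “all-to-all” comparison can be illustrated with a minimal sketch. It assumes that each image has already been reduced to a 64-bit perceptual hash by some pHash implementation and that two images count as matched when the Hamming distance between their hashes does not exceed a threshold. The function names, the threshold `max_distance=10`, and the rule for counting matched images are illustrative assumptions, not the authors' implementation; only the decision level of three matched images comes from the abstract.

```python
def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Number of differing bits between two integer-encoded hashes."""
    return bin(hash_a ^ hash_b).count("1")


def count_matched_images(hashes_a, hashes_b, max_distance=10):
    """Count images of account A that have at least one near-duplicate
    among the images of account B (all-to-all comparison).

    max_distance is an assumed threshold, not a value from the paper.
    """
    matched = 0
    for h_a in hashes_a:
        # Compare this image against every image of the other account.
        if any(hamming_distance(h_a, h_b) <= max_distance for h_b in hashes_b):
            matched += 1
    return matched


def likely_same_user(hashes_a, hashes_b, min_matches=3, max_distance=10):
    """Hypothetical decision rule: the accounts are treated as belonging
    to the same person when at least min_matches images match."""
    return count_matched_images(hashes_a, hashes_b, max_distance) >= min_matches
```

The comparison costs on the order of |A| × |B| Hamming-distance evaluations per account pair, which stays cheap because each evaluation is a single XOR and a bit count.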

Keywords: online social networks, user identification, image processing, pHash, data science, social engineering attacks

Acknowledgements. This work was carried out under the state assignment of SPC RAS SPIIRAS, project No. 0073-2019-0003 (formation of the approach); it was supported by Saint Petersburg State University, project No. 73555239 (implementation and evaluation of the approach), and by the RFBR, project No. 20-07-00839 (evaluation of the results in a prototype of the software package).


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Copyright 2001-2021 ©
Scientific and Technical Journal
of Information Technologies, Mechanics and Optics.
All rights reserved.