doi: 10.17586/2226-1494-2025-25-6-1117-1124
Compound quality model for recommender system evaluation
For citation:
Tsyplov A.M., Boukhanovsky A.V. Compound quality model for recommender system evaluation. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2025, vol. 25, no. 6, pp. 1117–1124 (in Russian). doi: 10.17586/2226-1494-2025-25-6-1117-1124
Abstract
The study examines approaches to quantifying effects such as position bias, popularity bias, and others in recommender systems. A new quality model for recommendation algorithms is proposed that reduces the selected metrics to a single unit of measurement and, for each effect, determines its impact on the system. The resulting scores enable a deeper comparative analysis of different algorithms as well as examination of an algorithm's behavior across user segments. Within the model, two conditional marginal distribution densities are built for each metric: one over relevant and one over irrelevant recommendations. By comparing these densities, the set of possible metric values is divided into a normal and a critical region. The model estimates the impact of each effect on the system from the frequency with which the values of the corresponding metric fall into its critical region. To demonstrate how the model works, four recommendation algorithms were analyzed on the MovieLens-100K academic dataset. The testing assessed popularity bias, the lack of novelty in recommendations, and the tendency of algorithms to recommend items solely on the basis of user demographic data. For each effect, an estimate of its impact on the system is constructed, and an example is given of predicting an upper bound on system quality if the corresponding effect is eliminated. The study showed that the metrics of effects such as popularity or position bias can have different distributions of absolute values depending on the system, so the proposed quality model offers a more reliable way to compare recommendation algorithms. The model is suitable for evaluating personalized recommendations regardless of the application domain and the algorithm used to build them.
Keywords: recommendation systems, ranking, evaluation of the quality of recommendations, popularity bias, position bias, machine learning
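
To illustrate the density-comparison idea described in the abstract, below is a minimal Python sketch. It is not the authors' implementation: it assumes the conditional densities are approximated with shared-bin histograms, that a metric value is "critical" when the density conditioned on irrelevant recommendations exceeds the density conditioned on relevant ones, and that an effect's impact is the share of observed values falling into critical bins. The function name effect_impact and the toy popularity data are illustrative only.

```python
# Minimal sketch of the density-comparison scheme described in the abstract.
# Assumptions (not taken from the paper): densities are approximated with
# shared-bin histograms, and a bin is "critical" when the density of the
# metric conditioned on irrelevant recommendations exceeds the density
# conditioned on relevant ones. The effect score is the share of all
# observed metric values that fall into critical bins.
import numpy as np

def effect_impact(metric_values, is_relevant, n_bins=30):
    """Estimate an effect's impact from per-recommendation metric values.

    metric_values : 1-D array with the effect's metric, one value per recommendation
    is_relevant   : boolean array of the same length (True = relevant recommendation)
    """
    metric_values = np.asarray(metric_values, dtype=float)
    is_relevant = np.asarray(is_relevant, dtype=bool)

    # Shared bin edges so the two conditional densities are directly comparable.
    edges = np.histogram_bin_edges(metric_values, bins=n_bins)
    dens_rel, _ = np.histogram(metric_values[is_relevant], bins=edges, density=True)
    dens_irr, _ = np.histogram(metric_values[~is_relevant], bins=edges, density=True)

    # Critical region: bins where the irrelevant-conditioned density dominates.
    critical_bins = dens_irr > dens_rel

    # Impact: frequency with which observed metric values land in the critical region.
    bin_idx = np.clip(np.digitize(metric_values, edges) - 1, 0, n_bins - 1)
    return float(np.mean(critical_bins[bin_idx]))

# Toy usage: a popularity-style metric where irrelevant recommendations
# tend to take higher values than relevant ones.
rng = np.random.default_rng(0)
popularity = np.concatenate([rng.beta(2, 5, 500), rng.beta(5, 2, 500)])
relevant = np.concatenate([np.ones(500, bool), np.zeros(500, bool)])
print(f"estimated popularity-bias impact: {effect_impact(popularity, relevant):.2f}")
```

Histograms are used here only to keep the sketch dependency-free; any density estimator that yields comparable conditional densities would fit the same scheme.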

