doi: 10.17586/2226-1494-2022-22-6-1187-1196
Joint learning of agents and graph embeddings in a conveyor belt control problem
Article in Russian
For citation:
Rybkin K.E., Filchenkov A.A., Azarov A.A., Zabashta A.S., Shalyto A.A. Joint learning of agents and graph embeddings in a conveyor belt control problem. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2022, vol. 22, no. 6, pp. 1187–1196 (in Russian). doi: 10.17586/2226-1494-2022-22-6-1187-1196
Abstract
We address the problem of routing in a conveyor belt system using a multi-agent approach. Most airport baggage conveyor systems use routing algorithms based on manual simulation of conveyor behavior. This approach does not scale well, and recent machine learning research proposes solving the routing problem with reinforcement learning. To this end, we propose an approach to the joint learning of agents and vector representations of a graph. Within this approach, we develop the QSDNE algorithm, which combines DQN agents with SDNE embeddings. A comparative analysis was carried out against multi-agent routing algorithms without joint learning. The results showed the effectiveness of QSDNE in optimizing delivery time and energy consumption in conveyor systems: it reduced mean delivery time by 6 %. The proposed approach can be applied to routing problems with complex path estimation functions and dynamically changing graph topologies, and the proposed algorithm can be used to control conveyor belts at airports and in manufacturing workshops.
Keywords: multi-agent learning, reinforcement learning, adaptive routing, conveyor belt, graph representation
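The paper itself provides no code, but the joint idea — agents that route over a graph whose nodes are described by learned embeddings — can be sketched in miniature. In the sketch below, everything is hypothetical: a linear autoencoder stands in for SDNE, a linear Q-head per candidate next hop stands in for a DQN, and the five-node conveyor graph is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy conveyor graph (hypothetical): short branch 0 -> 1 -> 3,
# long branch 0 -> 2 -> 4 -> 3; node 3 is the only sink.
A = np.array([[0, 1, 1, 0, 0],
              [0, 0, 0, 1, 0],
              [0, 0, 0, 0, 1],
              [0, 0, 0, 0, 0],
              [0, 0, 0, 1, 0]], dtype=float)
n, d = A.shape[0], 2          # number of nodes, embedding size

# Stand-in for SDNE: a linear autoencoder trained to reconstruct each
# node's adjacency row, so structurally similar nodes embed nearby.
W_enc = rng.normal(scale=0.1, size=(n, d))
W_dec = rng.normal(scale=0.1, size=(d, n))
for _ in range(500):
    Z = A @ W_enc             # current node embeddings
    G = Z @ W_dec - A         # reconstruction residual
    W_enc -= 0.05 * A.T @ (G @ W_dec.T)
    W_dec -= 0.05 * Z.T @ G

def embed(v):
    return A[v] @ W_enc

def features(v, dest):
    # Agent state = embeddings of current node and destination.
    return np.concatenate([embed(v), embed(dest)])

def neighbors(v):
    return np.flatnonzero(A[v])

# Stand-in for a DQN: one linear Q-head per candidate next hop,
# trained with one-step TD updates on the embedding features.
W_q = np.zeros((n, 2 * d))
alpha, gamma, eps, dest = 0.1, 0.9, 0.2, 3

for _ in range(400):
    v = 0
    while v != dest:
        nbrs = neighbors(v)
        phi = features(v, dest)
        if rng.random() < eps:                       # explore
            a = int(rng.choice(nbrs))
        else:                                        # exploit
            a = int(nbrs[np.argmax(W_q[nbrs] @ phi)])
        r = 10.0 if a == dest else -1.0              # penalise extra hops
        target = r if a == dest else r + gamma * np.max(
            W_q[neighbors(a)] @ features(a, dest))
        W_q[a] += alpha * (target - W_q[a] @ phi) * phi
        v = a

def route(start, dest, max_hops=10):
    """Greedy rollout of the learned policy."""
    path, v = [start], start
    for _ in range(max_hops):
        if v == dest:
            break
        nbrs = neighbors(v)
        v = int(nbrs[np.argmax(W_q[nbrs] @ features(v, dest))])
        path.append(v)
    return path
```

The hop penalty makes the TD targets of the short branch larger than those of the long one, so the greedy rollout favors the shorter route; the embeddings merely supply the state features, whereas in QSDNE they would be trained jointly with the agents and updated as the graph changes.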
Acknowledgements. The study was supported by a grant from the Russian Science Foundation (project no. 20-19-00700).
References
- The World of Air Transport in 2019. ICAO Annual Report, 2019. Available at: https://www.icao.int/annual-report-2019/Pages/the-world-of-air-transport-in-2019.aspx (accessed: 01.10.2022).
- COVID-19 Air Travel Recovery. OAG, 2022. Available at: https://www.oag.com/coronavirus-airline-schedules-data (accessed: 01.10.2022).
- 2019 SITA Baggage IT Insights Report. SITA, 2019. Available at: https://www.sita.aero/resources/surveys-reports/baggage-it-insights-2019 (accessed: 06.09.2022).
- Sørensen R.A., Nielsen M., Karstoft H. Routing in congested baggage handling systems using deep reinforcement learning. Integrated Computer-Aided Engineering, 2020, vol. 27, no. 2, pp. 139–152. https://doi.org/10.3233/ICA-190613
- Kang Y., Wang X., Lan Z. Q-adaptive: A multi-agent reinforcement learning based routing on dragonfly network. Proc. of the 30th International Symposium on High-Performance Parallel and Distributed Computing (HPDC), 2021, pp. 189–200. https://doi.org/10.1145/3431379.3460650
- Choi S., Yeung D.Y. Predictive Q-routing: A memory-based reinforcement learning approach to adaptive traffic control. Advances in Neural Information Processing Systems, 1995, vol. 8, pp. 945–951.
- Mukhutdinov D., Filchenkov A., Shalyto A., Vyatkin V. Multi-agent deep learning for simultaneous optimization for time and energy in distributed routing system. Future Generation Computer Systems, 2019, vol. 94, pp. 587–600. https://doi.org/10.1016/j.future.2018.12.037
- Mukhutdinov D. Decentralized conveyor system control algorithm using multi-agent reinforcement learning methods. MSc Dissertation. St. Petersburg, ITMO University, 2019, 92 p. Available at: http://is.ifmo.ru/diploma-theses/2019/2_5458464771026191430.pdf (accessed: 01.10.2022). (in Russian)
- Dantzig G.B. Origins of the simplex method. A History of Scientific Computing, 1990, pp. 141–151. https://doi.org/10.1145/87252.88081
- Kuznetcov A.V., Kholod N.I., Kostevich L.S. Guide to Solving Mathematical Programming Problems. Minsk, Vyshjejshaja shkola Publ., 1978, 256 p. (in Russian)
- Vutukury S., Garcia-Luna-Aceves J.J. MDVA: A distance-vector multipath routing protocol. Proc. of the 20th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), 2001, vol. 1, pp. 557–564. https://doi.org/10.1109/INFCOM.2001.916780
- Albrightson R., Garcia-Luna-Aceves J.J., Boyle J. EIGRP: A fast routing protocol based on distance vectors, 1994.
- Verma A., Bhardwaj N. A review on routing information protocol (RIP) and open shortest path first (OSPF) routing protocol. International Journal of Future Generation Communication and Networking, 2016, vol. 9, no. 4, pp. 161–170. https://doi.org/10.14257/ijfgcn.2016.9.4.13
- Bang-Jensen J., Gutin G.Z. Digraphs: Theory, Algorithms and Applications. Springer Science & Business Media, 2009, 795 p.
- Clausen T., Jacquet P. Optimized Link State Routing Protocol (OLSR). RFC 3626, 2003. https://doi.org/10.17487/RFC3626
- Jacquet P., Muhlethaler P., Clausen T., Laouiti A., Qayyum A., Viennot L. Optimized link state routing protocol for ad hoc networks. Proc. of the IEEE International Multi Topic Conference (INMIC 2001): Technology for the 21st Century, 2001, pp. 62–68. https://doi.org/10.1109/INMIC.2001.995315
- Noto M., Sato H. A method for the shortest path search by extended Dijkstra algorithm. Proc. of the 2000 IEEE International Conference on Systems, Man and Cybernetics (SMC 2000), 2000, vol. 3, pp. 2316–2320. https://doi.org/10.1109/icsmc.2000.886462
- Mammeri Z. Reinforcement learning based routing in networks: Review and classification of approaches. IEEE Access, 2019, vol. 7, pp. 55916–55950. https://doi.org/10.1109/ACCESS.2019.2913776
- Yao Z., Wang Y., Qiu X. DQN-based energy-efficient routing algorithm in software-defined data centers. International Journal of Distributed Sensor Networks, 2020, vol. 16, no. 6. https://doi.org/10.1177/1550147720935775
- Belkin M., Niyogi P. Laplacian eigenmaps and spectral techniques for embedding and clustering. Advances in Neural Information Processing Systems, 2002.
- Gao B., Pavel L. On the properties of the softmax function with application in game theory and reinforcement learning. arXiv, 2017, arXiv:1704.00805. https://doi.org/10.48550/arXiv.1704.00805
- Barros C.D., Mendonça M.R., Vieira A.B., Ziviani A. A survey on embedding dynamic graphs. ACM Computing Surveys, 2021, vol. 55, no. 1, pp. 1–37. https://doi.org/10.1145/3483595
- Ou M., Cui P., Pei J., Zhang Z., Zhu W. Asymmetric transitivity preserving graph embedding. Proc. of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1105–1114. https://doi.org/10.1145/2939672.2939751
- Grover A., Leskovec J. Node2vec: Scalable feature learning for networks. Proc. of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 855–864. https://doi.org/10.1145/2939672.2939754
- Wang D., Cui P., Zhu W. Structural deep network embedding. Proc. of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1225–1234. https://doi.org/10.1145/2939672.2939753
- Goyal P., Chhetri S.R., Canedo A. dyngraph2vec: Capturing network dynamics using dynamic graph representation learning. Knowledge-Based Systems, 2020, vol. 187, pp. 104816. https://doi.org/10.1016/j.knosys.2019.06.024
- Ma Y., Guo Z., Ren Z., Tang J., Yin D. Streaming graph neural networks. Proc. of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020, pp. 719–728. https://doi.org/10.1145/3397271.3401092
- Kool W., Van Hoof H., Welling M. Attention, learn to solve routing problems! Proc. of the 7th International Conference on Learning Representations (ICLR), 2019.
- Komer B., Bergstra J., Eliasmith C. Hyperopt-sklearn. Automated Machine Learning, 2019, pp. 97–111. https://doi.org/10.1007/978-3-030-05318-5_5