doi: 10.17586/2226-1494-2025-25-6-1134-1141


ReflectivePrompt: Reflective evolution in autoprompting algorithms

V. N. Zhuravlev, A. R. Khairullin, E. A. Dyagin, A. N. Sitkina, N. I. Kulin


Article in English

For citation:
Zhuravlev V.N., Khairullin A.R., Dyagin E.A., Sitkina A.N., Kulin N.I. ReflectivePrompt: Reflective evolution in autoprompting algorithms. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2025, vol. 25, no. 6, pp. 1134–1141. doi: 10.17586/2226-1494-2025-25-6-1134-1141


Abstract
Autoprompting is the process of automatically selecting optimized prompts for language models; it has been gaining popularity with the rapid advancement of prompt engineering driven by extensive research on Large Language Models. This paper presents ReflectivePrompt, a novel autoprompting method based on evolutionary algorithms that employs reflective evolution for a more precise and comprehensive search for optimal prompts. ReflectivePrompt applies short-term and long-term reflection operations before crossover and elitist mutation to improve the quality of the modifications they introduce. The method accumulates knowledge obtained throughout the evolution process and updates it at each epoch based on the current population. ReflectivePrompt was evaluated on 33 classification and text-generation datasets using the open-access large language models T-lite-instruct-0.1 and Gemma3-27b-it. On average, it demonstrates a significant improvement in metrics over current state-of-the-art approaches (e.g., 28 % on BBH compared to EvoPrompt), establishing it as one of the most effective evolutionary-algorithm-based autoprompting solutions.
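
The abstract describes the core loop of the method: reflection steps summarizing what distinguishes stronger prompts from weaker ones are produced before crossover and elitist mutation, and a long-term summary is updated every epoch from the current population. The following minimal Python sketch illustrates such a reflective-evolution loop; the helper names (llm, evaluate), the meta-prompt wording, and the selection scheme are illustrative assumptions, not the paper's exact operators.

import random
from typing import Callable, List

def reflective_evolution_sketch(
    llm: Callable[[str], str],          # black-box LLM call: text in, text out
    evaluate: Callable[[str], float],   # task metric for a candidate prompt
    population: List[str],              # initial prompt population (at least two prompts assumed)
    epochs: int = 10,
) -> str:
    long_term = ""                      # knowledge accumulated across all epochs
    for _ in range(epochs):
        scored = sorted(population, key=evaluate, reverse=True)
        parents = scored[: max(2, len(scored) // 2)]

        # Short-term reflection: compare the current candidates and note what
        # distinguishes the better prompts in this epoch.
        short_term = llm(
            "Compare these prompts, best first, and state briefly what the "
            "better ones do right:\n" + "\n".join(scored)
        )
        # Long-term reflection: merge the new observations into the knowledge
        # accumulated over previous epochs.
        long_term = llm(
            f"Prior insights:\n{long_term}\n\nNew observations:\n{short_term}\n\n"
            "Merge these into a concise set of prompt-writing insights."
        )

        children = []
        for _ in range(len(population)):
            p1, p2 = random.sample(parents, 2)
            # Crossover (and subsequent mutation) is guided by both reflections.
            child = llm(
                f"Insights:\n{long_term}\n{short_term}\n\n"
                "Combine the strengths of these two prompts into one new prompt:\n"
                f"1) {p1}\n2) {p2}"
            )
            children.append(child)

        # Elitism: keep the best current prompt alongside the generated children.
        population = [scored[0]] + sorted(children, key=evaluate, reverse=True)[: len(population) - 1]
    return max(population, key=evaluate)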

Keywords: LLM, autoprompting, evolutionary algorithms, reflective evolution, prompt engineering

References
1. Kadavath S., Conerly T., Askell A., Henighan T., Drain D., Perez E., et al. Language models (mostly) know what they know. arXiv, 2022, arXiv:2207.05221. https://doi.org/10.48550/arXiv.2207.05221
2. Wei J., Bosma M., Zhao V.Y., Guu K., Yu A.W., Lester B., et al. Finetuned language models are zero-shot learners. arXiv, 2021, arXiv:2109.01652. https://doi.org/10.48550/arXiv.2109.01652
3. Liu P., Yuan W., Fu J., Jiang Z., Hayashi H., Neubig G. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Computing Surveys, 2023, vol. 55, no. 9, pp. 1–35. https://doi.org/10.1145/3560815
4. Brown T.B., Mann B., Ryder N., Subbiah M., Kaplan J., Dhariwal P., et al. Language models are few-shot learners. arXiv, 2020, arXiv:2005.14165. https://doi.org/10.48550/arXiv.2005.14165
5. Wang N., Peng Z., Que H., Liu J., Zhou W., Wu Y., et al. RoleLLM: benchmarking, eliciting, and enhancing role-playing abilities of large language models. Proc. of the Annual Meeting of the Association for Computational Linguistics, 2024, pp. 14743–14777. https://doi.org/10.18653/v1/2024.findings-acl.878
6. Wei J., Wang X., Schuurmans D., Bosma M., Ichter B., Xia F., et al. Chain-of-thought prompting elicits reasoning in large language models. arXiv, 2022, arXiv:2201.11903. https://doi.org/10.48550/arXiv.2201.11903
7. Wang L., Xu W., Lan Y., Hu Z., Lan Y., Lee R.K.-W., Lim E.-P. Plan-and-solve prompting: improving zero-shot chain-of-thought reasoning by large language models. Proc. of the 61st Annual Meeting of the Association for Computational Linguistics, 2023, vol. 1, pp. 2609–2634. https://doi.org/10.18653/v1/2023.acl-long.147
8. Leidinger A., van Rooij R., Shutova E. The language of prompting: What linguistic properties make a prompt successful? Proc. of the Findings of the Association for Computational Linguistics: EMNLP, 2023, pp. 9210–9232. https://doi.org/10.18653/v1/2023.findings-emnlp.618
9. Shin T., Razeghi Y., Logan R.L., Wallace E., Singh S. AutoPrompt: Eliciting knowledge from language models with automatically generated prompts. Proc. of the Conference on Empirical Methods in Natural Language, 2020, pp. 4222–4235. https://doi.org/10.18653/v1/2020.emnlp-main.346
10. Kwon M., Kim G., Kim J., Lee H., Kim J. StablePrompt: automatic prompt tuning using reinforcement learning for large language models. arXiv, 2024, arXiv:2410.07652. https://doi.org/10.48550/arXiv.2410.07652
11. Guo Q., Wang R., Guo J., Li B., Song K., Tan X., et al. EvoPrompt: Connecting large language models with evolutionary algorithms yields powerful prompt optimizers. arXiv, 2023, arXiv:2309.08532. https://doi.org/10.48550/arXiv.2309.08532
12. Prasad A., Hase P., Zhou X., Bansal M. GrIPS: gradient-free, edit-based instruction search for prompting large language models. Proc. of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023, pp. 3845–3864. https://doi.org/10.18653/v1/2023.eacl-main.277
13. Schulhoff S., Ilie M., Balepur N., Kahadze K., Liu A., Si C., et al. The prompt report: a systematic survey of prompt engineering techniques. arXiv, 2024, arXiv:2406.06608. https://doi.org/10.48550/arXiv.2406.06608
14. Liu P., Yuan W., Fu J., Jiang Z., Hayashi H., Neubig G. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Computing Surveys, 2023, vol. 55, no. 9, pp. 1–35. https://doi.org/10.1145/3560815
15. Li Y.B., Wu K. Spell: semantic prompt evolution based on a LLM. arXiv, 2023, arXiv:2310.01260. https://doi.org/10.48550/arXiv.2310.01260
16. Pan R., Xing S., Diao S., Sun W., Liu X., Shum K., et al. Plum: prompt learning using metaheuristic. arXiv, 2023, arXiv:2311.08364. https://doi.org/10.48550/arXiv.2311.08364
17. Fernando C., Banarse D., Michalewski H., Osindero S., Rocktäschel T. Promptbreeder: self-referential self-improvement via prompt evolution. arXiv, 2023, arXiv:2309.16797. https://doi.org/10.48550/arXiv.2309.16797
18. Eiben A.E., Smith J.E. Introduction to Evolutionary Computing. Springer, 2015, 287 p.
19. Ye H., Wang J., Cao Z., Berto F., Hua C., Kim H., et al. ReEvo: large language models as hyper-heuristics with reflective evolution. arXiv, 2024, arXiv:2402.01145. https://doi.org/10.48550/arXiv.2402.01145
20. Holland J.H. Genetic algorithms. Scientific American, 1992, vol. 267, no. 1, pp. 66–72. https://doi.org/10.1038/scientificamerican0792-66
21. Lipowski A., Lipowska D. Roulette-wheel selection via stochastic acceptance. Physica A: Statistical Mechanics and its Applications, 2012, vol. 391, no. 6, pp. 2193–2196. https://doi.org/10.1016/j.physa.2011.12.004
22. Storn R., Price K. Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 1997, vol. 11, no. 4, pp. 341–359. https://doi.org/10.1023/a:1008202821328
23. Russell S., Norvig P. Artificial Intelligence: a Modern Approach. Pearson, 2009, 1152 p.
24. Kirkpatrick S., Gelatt C.D., Vecchi M.P. Optimization by simulated annealing. Science, 1983, vol. 220, no. 4598, pp. 671–680. https://doi.org/10.1126/science.220.4598.671
25. Glover F. Future paths for integer programming and links to artificial intelligence. Computers and Operations Research, 1986, vol. 13, no. 5, pp. 533–549. https://doi.org/10.1016/0305-0548(86)90048-1
26. Geem Z.W., Kim J.H., Loganathan G.V. A new heuristic optimization algorithm: harmony search. Simulation, 2001, vol. 76, no. 2, pp. 60–68. https://doi.org/10.1177/003754970107600201
27. Larranaga P. A review on estimation of distribution algorithms. Genetic Algorithms and Evolutionary Computation, 2002, vol. 2, pp. 57–100. https://doi.org/10.1007/978-1-4615-1539-5_3
28. Ross B.J. A Lamarckian evolution strategy for genetic algorithms. Practical Handbook of Genetic Algorithms, 2019, pp. 1–16. https://doi.org/10.1201/9780429128356-1
29. Voudouris C., Tsang E.P., Alsheddy A. Guided local search. International Series in Operations Research & Management Science, 2010, vol. 146, pp. 321–361. https://doi.org/10.1007/978-1-4419-1665-5_11
30. Dorigo M., Maniezzo V., Colorni A. Ant system: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 1996, vol. 26, no. 1, pp. 29–41. https://doi.org/10.1109/3477.484436
31. Shibasaka K., Kanazawa K., Yasunaga M. Decoupling-capacitor allocation problem solved by genetic algorithm. Proc. of the IEEE Electrical Design of Advanced Packaging Systems Symposium (EDAPS), 2013, pp. 225–228. https://doi.org/10.1109/edaps.2013.6724430
32. Kim H., Kim M., Berto F., Kim J., Park J. DevFormer: a symmetric transformer for context-aware device placement. arXiv, 2022, arXiv:2205.13225. https://doi.org/10.48550/arXiv.2205.13225
33. Gohil A., Tayal M., Sahu T., Sawalpurkar V. Travelling salesman problem: parallel implementations & analysis. arXiv, 2022, arXiv:2205.14352. https://doi.org/10.48550/arXiv.2205.14352
34. Liu M.X., Liu F., Fiannaca A.J., Koo T., Dixon L., Terry M., Cai C.J. "We Need Structured Output": towards user-centered constraints on large language model output. Proc. of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 2024, pp. 1–9. https://doi.org/10.1145/3613905.3650756
35. Kamath A., Ferret J., Pathak S., Vieillard N., Merhej R., Perrin S., et al. Gemma 3 technical report. arXiv, 2025, arXiv:2503.19786. https://doi.org/10.48550/arXiv.2503.19786
36. Agrawal L.A., Tan S., Soylu D., Ziems N., Khare R., Opsahl-Ong K., et al. GEPA: reflective prompt evolution can outperform reinforcement learning. arXiv, 2025, arXiv:2507.19457. https://doi.org/10.48550/arXiv.2507.19457



This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
