doi: 10.17586/2226-1494-2025-25-6-1134-1141


ReflectivePrompt: Reflective evolution in autoprompting algorithms

V. N. Zhuravlev, A. R. Khairullin, E. A. Dyagin, A. N. Sitkina, N. I. Kulin


Article in English

For citation:
Zhuravlev V.N., Khairullin A.R., Dyagin E.A., Sitkina A.N., Kulin N.I. ReflectivePrompt: Reflective evolution in autoprompting algorithms. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2025, vol. 25, no. 6, pp. 1134–1141. doi: 10.17586/2226-1494-2025-25-6-1134-1141


Abstract
Autoprompting is the process of automatically selecting optimized prompts for language models; it has been gaining popularity with the rapid advancement of prompt engineering driven by extensive research on Large Language Models. This paper presents ReflectivePrompt, a novel autoprompting method based on evolutionary algorithms that employs reflective evolution for a more precise and comprehensive search for optimal prompts. ReflectivePrompt applies short-term and long-term reflection operations before crossover and elitist mutation to improve the quality of the modifications they introduce. The method accumulates knowledge obtained throughout the evolution process and updates it at each epoch based on the current population. ReflectivePrompt was evaluated on 33 classification and text-generation datasets using the open-access large language models T-lite-instruct-0.1 and Gemma3-27b-it. On average, it demonstrates a significant improvement in metrics over current state-of-the-art approaches (e.g., 28 % on BBH compared to EvoPrompt), establishing it as one of the most effective evolutionary-algorithm-based autoprompting solutions.
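
The abstract describes the core loop of the method: reflection steps summarizing what distinguishes stronger prompts from weaker ones are produced before crossover and elitist mutation, and a long-term summary is updated every epoch from the current population. The following minimal Python sketch illustrates such a reflective-evolution loop; the helper names (llm, evaluate), the meta-prompt wording, and the selection scheme are illustrative assumptions, not the paper's exact operators.

import random
from typing import Callable, List

def reflective_evolution_sketch(
    llm: Callable[[str], str],          # black-box LLM call: text in, text out
    evaluate: Callable[[str], float],   # task metric for a candidate prompt
    population: List[str],              # initial prompt population (at least two prompts assumed)
    epochs: int = 10,
) -> str:
    long_term = ""                      # knowledge accumulated across all epochs
    for _ in range(epochs):
        scored = sorted(population, key=evaluate, reverse=True)
        parents = scored[: max(2, len(scored) // 2)]

        # Short-term reflection: compare the current candidates and note what
        # distinguishes the better prompts in this epoch.
        short_term = llm(
            "Compare these prompts, best first, and state briefly what the "
            "better ones do right:\n" + "\n".join(scored)
        )
        # Long-term reflection: merge the new observations into the knowledge
        # accumulated over previous epochs.
        long_term = llm(
            f"Prior insights:\n{long_term}\n\nNew observations:\n{short_term}\n\n"
            "Merge these into a concise set of prompt-writing insights."
        )

        children = []
        for _ in range(len(population)):
            p1, p2 = random.sample(parents, 2)
            # Crossover (and subsequent mutation) is guided by both reflections.
            child = llm(
                f"Insights:\n{long_term}\n{short_term}\n\n"
                "Combine the strengths of these two prompts into one new prompt:\n"
                f"1) {p1}\n2) {p2}"
            )
            children.append(child)

        # Elitism: keep the best current prompt alongside the generated children.
        population = [scored[0]] + sorted(children, key=evaluate, reverse=True)[: len(population) - 1]
    return max(population, key=evaluate)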

Keywords: LLM, autoprompting, evolutionary algorithms, reflective evolution, prompt engineering

References
1. Kadavath S., Conerly T., Askell A., Henighan T., Drain D., Perez E., et al. Language models (mostly) know what they know. arXiv, 2022, arXiv:2207.05221. https://doi.org/10.48550/arXiv.2207.05221
2. Wei J., Bosma M., Zhao V.Y., Guu K., Yu A.W., Lester B., et al. Finetuned language models are zero-shot learners. arXiv, 2021, arXiv:2109.01652. https://doi.org/10.48550/arXiv.2109.01652
3. Liu P., Yuan W., Fu J., Jiang Z., Hayashi H., Neubig G. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Computing Surveys, 2023, vol. 55, no. 9, pp. 1–35. https://doi.org/10.1145/3560815
4. Brown T.B., Mann B., Ryder N., Subbiah M., Kaplan J., Dhariwal P., et al. Language models are few-shot learners. arXiv, 2020, arXiv:2005.14165. https://doi.org/10.48550/arXiv.2005.14165
5. Wang N., Peng Z., Que H., Liu J., Zhou W., Wu Y., et al. RoleLLM: benchmarking, eliciting, and enhancing role-playing abilities of large language models. Proc. of the Annual Meeting of the Association for Computational Linguistics, 2024, pp. 14743–14777. https://doi.org/10.18653/v1/2024.findings-acl.878
6. Wei J., Wang X., Schuurmans D., Bosma M., Ichter B., Xia F., et al. Chain-of-thought prompting elicits reasoning in large language models. arXiv, 2022, arXiv:2201.11903. https://doi.org/10.48550/arXiv.2201.11903
7. Wang L., Xu W., Lan Y., Hu Z., Lan Y., Lee R.K.-W., Lim E.-P. Plan-and-solve prompting: improving zero-shot chain-of-thought reasoning by large language models. Proc. of the 61st Annual Meeting of the Association for Computational Linguistics, 2023, vol. 1, pp. 2609–2634. https://doi.org/10.18653/v1/2023.acl-long.147
8. Leidinger A., van Rooij R., Shutova E. The language of prompting: What linguistic properties make a prompt successful? Proc. of the Findings of the Association for Computational Linguistics: EMNLP, 2023, pp. 9210–9232. https://doi.org/10.18653/v1/2023.findings-emnlp.618
9. Shin T., Razeghi Y., Logan R.L., Wallace E., Singh S. AutoPrompt: Eliciting knowledge from language models with automatically generated prompts. Proc. of the Conference on Empirical Methods in Natural Language, 2020, pp. 4222–4235. https://doi.org/10.18653/v1/2020.emnlp-main.346
10. Kwon M., Kim G., Kim J., Lee H., Kim J. StablePrompt: automatic prompt tuning using reinforcement learning for large language models. arXiv, 2024, arXiv:2410.07652. https://doi.org/10.48550/arXiv.2410.07652
11. Guo Q., Wang R., Guo J., Li B., Song K., Tan X., et al. EvoPrompt: Connecting large language models with evolutionary algorithms yields powerful prompt optimizers. arXiv, 2023, arXiv:2309.08532. https://doi.org/10.48550/arXiv.2309.08532
12. Prasad A., Hase P., Zhou X., Bansal M. GrIPS: gradient-free, edit-based instruction search for prompting large language models. Proc. of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023, pp. 3845–3864. https://doi.org/10.18653/v1/2023.eacl-main.277
13. Schulhoff S., Ilie M., Balepur N., Kahadze K., Liu A., Si C., et al. The prompt report: a systematic survey of prompt engineering techniques. arXiv, 2024, arXiv:2406.06608. https://doi.org/10.48550/arXiv.2406.06608
14. Liu P., Yuan W., Fu J., Jiang Z., Hayashi H., Neubig G. Pre-train, prompt, and predict: a systematic survey of prompting methods in natural language processing. ACM Computing Surveys, 2023, vol. 55, no. 9, pp. 1–35. https://doi.org/10.1145/3560815
15. Li Y.B., Wu K. Spell: semantic prompt evolution based on a LLM. arXiv, 2023, arXiv:2310.01260. https://doi.org/10.48550/arXiv.2310.01260
16. Pan R., Xing S., Diao S., Sun W., Liu X., Shum K., et al. Plum: prompt learning using metaheuristic. arXiv, 2023, arXiv:2311.08364. https://doi.org/10.48550/arXiv.2311.08364
17. Fernando C., Banarse D., Michalewski H., Osindero S., Rocktäschel T. Promptbreeder: self-referential self-improvement via prompt evolution. arXiv, 2023, arXiv:2309.16797. https://doi.org/10.48550/arXiv.2309.16797
18. Eiben A.E., Smith J.E. Introduction to Evolutionary Computing. Springer, 2015, 287 p.
19. Ye H., Wang J., Cao Z., Berto F., Hua C., Kim H., et al. ReEvo: large language models as hyper-heuristics with reflective evolution. arXiv, 2024, arXiv:2402.01145. https://doi.org/10.48550/arXiv.2402.01145
20. Holland J.H. Genetic algorithms. Scientific American, 1992, vol. 267, no. 1, pp. 66–72. https://doi.org/10.1038/scientificamerican0792-66
21. Lipowski A., Lipowska D. Roulette-wheel selection via stochastic acceptance. Physica A: Statistical Mechanics and its Applications, 2012, vol. 391, no. 6, pp. 2193–2196. https://doi.org/10.1016/j.physa.2011.12.004
22. Storn R., Price K. Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 1997, vol. 11, no. 4, pp. 341–359. https://doi.org/10.1023/a:1008202821328
23. Russell S., Norvig P. Artificial Intelligence: a Modern Approach. Pearson, 2009, 1152 p.
24. Kirkpatrick S., Gelatt C.D., Vecchi M.P. Optimization by simulated annealing. Science, 1983, vol. 220, no. 4598, pp. 671–680. https://doi.org/10.1126/science.220.4598.671
25. Glover F. Future paths for integer programming and links to artificial intelligence. Computers and Operations Research, 1986, vol. 13, no. 5, pp. 533–549. https://doi.org/10.1016/0305-0548(86)90048-1
26. Geem Z.W., Kim J.H., Loganathan G.V. A new heuristic optimization algorithm: harmony search. Simulation, 2001, vol. 76, no. 2, pp. 60–68. https://doi.org/10.1177/003754970107600201
27. Larranaga P. A review on estimation of distribution algorithms. Genetic Algorithms and Evolutionary Computation, 2002, vol. 2, pp. 57–100. https://doi.org/10.1007/978-1-4615-1539-5_3
28. Ross B.J. A Lamarckian evolution strategy for genetic algorithms. Practical Handbook of Genetic Algorithms, 2019, pp. 1–16. https://doi.org/10.1201/9780429128356-1
29. Voudouris C., Tsang E.P., Alsheddy A. Guided local search. International Series in Operations Research & Management Science, 2010, vol. 146, pp. 321–361. https://doi.org/10.1007/978-1-4419-1665-5_11
30. Dorigo M., Maniezzo V., Colorni A. Ant system: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 1996, vol. 26, no. 1, pp. 29–41. https://doi.org/10.1109/3477.484436
31. Shibasaka K., Kanazawa K., Yasunaga M. Decoupling-capacitor allocation problem solved by genetic algorithm. Proc. of the IEEE Electrical Design of Advanced Packaging Systems Symposium (EDAPS), 2013, pp. 225–228. https://doi.org/10.1109/edaps.2013.6724430
32. Kim H., Kim M., Berto F., Kim J., Park J. DevFormer: a symmetric transformer for context-aware device placement. arXiv, 2022, arXiv:2205.13225. https://doi.org/10.48550/arXiv.2205.13225
33. Gohil A., Tayal M., Sahu T., Sawalpurkar V. Travelling salesman problem: parallel implementations & analysis. arXiv, 2022, arXiv:2205.14352. https://doi.org/10.48550/arXiv.2205.14352
34. Liu M.X., Liu F., Fiannaca A.J., Koo T., Dixon L., Terry M., Cai C.J. "We Need Structured Output": towards user-centered constraints on large language model output. Proc. of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 2024, pp. 1–9. https://doi.org/10.1145/3613905.3650756
35. Kamath A., Ferret J., Pathak S., Vieillard N., Merhej R., Perrin S., et al. Gemma 3 technical report. arXiv, 2025, arXiv:2503.19786. https://doi.org/10.48550/arXiv.2503.19786
36. Agrawal L.A., Tan S., Soylu D., Ziems N., Khare R., Opsahl-Ong K., et al. GEPA: reflective prompt evolution can outperform reinforcement learning. arXiv, 2025, arXiv:2507.19457. https://doi.org/10.48550/arXiv.2507.19457



This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
