doi: 10.17586/2226-1494-2025-25-4-676-683
Font generation based on style and character structure analysis using diffusion models
For citation:
Maslov M.I., Avdyushina A.E., Solodkaya M.A., Kugaevskikh A.V. Font generation based on style and character structure analysis using diffusion models. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2025, vol. 25, no. 4, pp. 676–683 (in Russian). doi: 10.17586/2226-1494-2025-25-4-676-683
Abstract
The article discusses the role of generative neural networks in the development and optimization of fonts, which play a key role in creating aesthetically attractive and functional designs. The focus is on licensing restrictions and the limited availability of fonts for many of the world's languages, which create difficulties for designers and typographers preparing text materials. The novelty of the approach lies in using a diffusion model as the generative neural network for automatic font creation, including missing glyphs for languages not supported by standard fonts. To this end, a diffusion model has been developed that generates fonts by analyzing patterns in the structure of characters and the logic of their construction. The model is integrated into an application that automates the creation of font layouts, allowing users to generate new glyphs and fonts tailored to specific language needs. The method comprises preliminary data preparation, network training, and subsequent generation of characters that mimic the style and composition of the original fonts. In the experiments, the diffusion model demonstrated a strong ability to generate high-quality font characters visually similar to the original samples. Fonts with a limited character set were used as source data, which made it possible to evaluate the model's ability to create missing glyphs for various languages. The results showed that the developed model successfully reproduces the stylistic features of the original font, which confirms its potential for developing font solutions for global use. The proposed font generation method is of interest to specialists in design, typography, and the creation of text materials for audiences in different languages.
The results obtained can be useful when creating fonts intended for use in multilingual projects that require the presence of missing characters.
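The abstract does not specify the diffusion formulation used. As a rough illustration only, the sketch below assumes the standard denoising diffusion probabilistic model (Ho et al., 2020) with a linear noise schedule and shows its forward (noising) step applied to a glyph bitmap, which is the kind of corrupted input a denoising network would be trained on. The 5x5 glyph, the function names, and the schedule parameters are hypothetical and are not taken from the article.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed schedule, not the article's)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(glyph, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0): scaled clean glyph plus Gaussian noise.

    Returns the noisy glyph and the noise that was added; the noise is the
    usual regression target for the denoising network.
    """
    signal = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noisy, noise = [], []
    for row in glyph:
        eps_row = [rng.gauss(0.0, 1.0) for _ in row]
        noise.append(eps_row)
        noisy.append([signal * x + noise_scale * e for x, e in zip(row, eps_row)])
    return noisy, noise


# Hypothetical 5x5 bitmap of a crude "T", with pixel values scaled to [-1, 1].
GLYPH_T = [
    [1, 1, 1, 1, 1],
    [-1, -1, 1, -1, -1],
    [-1, -1, 1, -1, -1],
    [-1, -1, 1, -1, -1],
    [-1, -1, 1, -1, -1],
]

betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)  # fixed seed so the sketch is reproducible

x_early, _ = q_sample(GLYPH_T, 10, alpha_bars, rng)    # glyph still recognizable
x_late, eps = q_sample(GLYPH_T, 999, alpha_bars, rng)  # close to pure noise
```

At small t the glyph dominates the sample, while at the final step almost all of the signal has been replaced by noise. Generation runs this process in reverse, with a trained network removing the noise step by step; in a font setting that network would additionally be conditioned on style and character structure, which this sketch leaves out.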
Acknowledgements. The work was carried out within the framework of the State Assignment (project No. FSER-2025-0004).

