DOI: 10.17586/2226-1494-2018-18-4-646-653


V. A. Udaltsov, N. S. Karmanovskiy

Article in Russian

For citation: Udaltsov V.A., Karmanovskiy N.S. Study of high-speed realization technics for elements of symmetric encryption algorithms during calculations on graphics processor. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2018, vol. 18, no. 4, pp. 646–653 (in Russian). doi: 10.17586/2226-1494-2018-18-4-646-653


Subject of Research. The paper studies the transformations used in modern symmetric algorithms in order to identify the fastest ways to implement them on a graphics processor using CUDA and OpenCL.

Method. The LSX and ARX structures of block algorithms were examined using the following ciphers as examples: AES, «Kuznyechik», LEA, Rectangle, Simon and Speck. The main types of transformations were identified: multiplication in Galois fields, the use of lookup tables, bitwise operations, long-number addition, and data exchange with global memory, which is an integral part of calculations on graphics devices. Implementation variants of these calculations were considered, and synthetic tests were carried out to measure their execution time.

Main Results. The best ways to implement these transformations were determined. For multiplication in Galois fields where one of the multipliers is constant, the method using a precomputed table showed the best time. Storing substitution tables in shared memory proved to be the fastest option, and bitwise operations were fastest when the input data were divided into 8-bit elements, as was long-number addition. The results were validated with an implementation of the CLEFIA algorithm. Encrypting 1 GB of data took 1542 ms, which is 16 times less than the encryption time on a general-purpose processor. Even the implementation variants that showed the worst times in the synthetic tests on graphics processors give a fourfold speedup over the central processor.

Practical Relevance. The results are applicable to the fast and efficient use of graphics processors in implementing existing encryption algorithms. They can also serve as a basis for developing new encryption algorithms that use graphics processors.
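
The precomputed-table approach to Galois-field multiplication by a constant, mentioned in the abstract, can be illustrated with a minimal host-side sketch. This is not the authors' CUDA or OpenCL code. It assumes the field GF(2^8) with the AES reduction polynomial 0x11B, and the names `gf256_mul`, `build_const_table`, `MUL2` and `MUL3` are hypothetical.

```python
# Illustrative sketch only: host-side Python, not the authors' GPU code.
# Assumption: GF(2^8) with the AES polynomial x^8 + x^4 + x^3 + x + 1 (0x11B).

AES_POLY = 0x11B  # assumed reduction polynomial; other ciphers use other polynomials


def gf256_mul(a: int, b: int, poly: int = AES_POLY) -> int:
    """Multiply two bytes in GF(2^8) by shift-and-add, without any table."""
    result = 0
    while b:
        if b & 1:          # lowest bit of b set: add (XOR) the current multiple of a
            result ^= a
        a <<= 1            # multiply a by x
        if a & 0x100:      # degree reached 8: reduce modulo the field polynomial
            a ^= poly
        b >>= 1
    return result


def build_const_table(constant: int, poly: int = AES_POLY) -> bytes:
    """Precompute constant * x for every byte x, so one lookup replaces the loop."""
    return bytes(gf256_mul(constant, x, poly) for x in range(256))


# Hypothetical tables for two constant multipliers.
MUL2 = build_const_table(0x02)
MUL3 = build_const_table(0x03)

if __name__ == "__main__":
    x = 0x57
    print(hex(MUL2[x]), hex(MUL3[x]))
```

Each table occupies 256 bytes. On a GPU such a table would typically be placed in a fast memory space; which space performs best for a given cipher is a measurement question of the kind the paper's synthetic tests address.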

Keywords: CUDA, OpenCL, cryptographic transformations, symmetric algorithms, encryption acceleration



This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License