TY - JOUR
T1 - Fast implementation of block ciphers and PRNGs in Maxwell GPU architecture
AU - Lee, Wai Kong
AU - Cheong, Hon Sang
AU - Phan, Raphael C.W.
AU - Goi, Bok Min
N1 - Funding Information:
This work was partially supported by the Universiti Tunku Abdul Rahman Research Fund (UTARRF) under Grant IPSR/RMC/UTARRF/2012-C2/L04. We would also like to thank all members of the Accelerative Technology Lab, MIMOS, Malaysia for their great support. This research work is also partially supported by the Ministry of Science, Technology and Innovation (MOSTI), Malaysia under Grant 01-02-11-SF0202.
Publisher Copyright:
© 2016, Springer Science+Business Media New York.
Copyright:
Copyright 2018 Elsevier B.V., All rights reserved.
PY - 2016/3
Y1 - 2016/3
N2 - GPUs are widely used in applications that require huge computational power. In this paper, we contribute to the cryptography and high-performance computing research communities by presenting techniques to accelerate symmetric block ciphers (AES-128, CAST-128, Camellia, SEED, IDEA, Blowfish and Threefish) on an NVIDIA GTX 980 with the Maxwell architecture. The proposed techniques consider various aspects of block cipher implementation on GPUs, including the placement of encryption keys and T-boxes in memory, thread block size, cipher operating mode, parallel granularity and data copy between CPU and GPU. We propose a new method that stores the encryption keys in registers, which offer high access speed, and exchanges them with other threads by using the GPU's warp shuffle operation. The block ciphers implemented in this paper operate in counter (CTR) mode and achieve high encryption speeds of 149 Gbps (AES-128), 143 Gbps (CAST-128), 124 Gbps (Camellia), 112 Gbps (SEED), 149 Gbps (IDEA), 111 Gbps (Blowfish) and 197 Gbps (Threefish). To the best of our knowledge, this is the first implementation of block ciphers that exploits warp shuffle, an advanced feature of NVIDIA GPUs. Block ciphers can also be used as pseudorandom number generators (PRNGs) when operating in CTR mode, but they are usually slower than other PRNGs that use lighter operations. Hence, we modify IDEA and Blowfish to achieve faster pseudorandom number generation. The modified IDEA and Blowfish pass the NIST Statistical Test Suite and TestU01 SmallCrush, but not the more stringent TestU01 batteries (Crush and BigCrush).
AB - GPUs are widely used in applications that require huge computational power. In this paper, we contribute to the cryptography and high-performance computing research communities by presenting techniques to accelerate symmetric block ciphers (AES-128, CAST-128, Camellia, SEED, IDEA, Blowfish and Threefish) on an NVIDIA GTX 980 with the Maxwell architecture. The proposed techniques consider various aspects of block cipher implementation on GPUs, including the placement of encryption keys and T-boxes in memory, thread block size, cipher operating mode, parallel granularity and data copy between CPU and GPU. We propose a new method that stores the encryption keys in registers, which offer high access speed, and exchanges them with other threads by using the GPU's warp shuffle operation. The block ciphers implemented in this paper operate in counter (CTR) mode and achieve high encryption speeds of 149 Gbps (AES-128), 143 Gbps (CAST-128), 124 Gbps (Camellia), 112 Gbps (SEED), 149 Gbps (IDEA), 111 Gbps (Blowfish) and 197 Gbps (Threefish). To the best of our knowledge, this is the first implementation of block ciphers that exploits warp shuffle, an advanced feature of NVIDIA GPUs. Block ciphers can also be used as pseudorandom number generators (PRNGs) when operating in CTR mode, but they are usually slower than other PRNGs that use lighter operations. Hence, we modify IDEA and Blowfish to achieve faster pseudorandom number generation. The modified IDEA and Blowfish pass the NIST Statistical Test Suite and TestU01 SmallCrush, but not the more stringent TestU01 batteries (Crush and BigCrush).
KW - Block cipher
KW - Counter mode
KW - CUDA
KW - GPU
KW - Network security
KW - PRNG
UR - http://www.scopus.com/inward/record.url?scp=84955246835&partnerID=8YFLogxK
U2 - 10.1007/s10586-016-0536-2
DO - 10.1007/s10586-016-0536-2
M3 - Article
AN - SCOPUS:84955246835
SN - 1386-7857
VL - 19
SP - 335
EP - 347
JO - Cluster Computing
JF - Cluster Computing
IS - 1
ER -