Abstract
Multiple precision multiplication is widely used in scientific computing and cryptography. When the size of integer grows beyond computer precision (32-bit or 64-bit), the computational cost of multiplication becomes significant. In this paper, we proposed a novel solution to implement multiple precision multiplication in massively parallel GPU with Kepler architecture. Our implementation is designed based on Chinese Remainder Theorem and Number Theoretic Transform with 64-bit prime. We implemented three versions of multiple precision multiplication which utilized global memory, shared memory and registers to store the precomputed twiddle factors. The register version use warp shuffle instruction (available in GPU with Kepler architecture) to exchange data among threads within the same warp. Thist echnique is able to avoid bank conflict issue in shared memory and allow faster computation in GPU. To the best of our knowledge, this is the first implementation reported in the literature that utilized warp shuffle instruction to accelerate NTT computation. Our best implementation is able to perform 1024-bit, 2048-bit, 4096-bit and 8192-bit multiplication in 0.095ms, 0.169ms, 0.444ms and 1.113ms respectively.
Original language | English |
---|---|
Title of host publication | Proceedings - 18th IEEE International Conference on High Performance Computing and Communications, 14th IEEE International Conference on Smart City and 2nd IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2016 |
Editors | Laurence T. Yang, Jinjun Chen |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 844-851 |
Number of pages | 8 |
ISBN (Electronic) | 9781509042968 |
DOIs | |
Publication status | Published - 2016 |
Externally published | Yes |
Event | IEEE International Conference on High Performance Computing and Communications 2016 - Sydney, Australia Duration: 12 Dec 2016 → 14 Dec 2016 Conference number: 18th https://ieeexplore.ieee.org/xpl/conhome/7819725/proceeding (Proceedings) |
Conference
Conference | IEEE International Conference on High Performance Computing and Communications 2016 |
---|---|
Abbreviated title | HPCC 2016 |
Country/Territory | Australia |
City | Sydney |
Period | 12/12/16 → 14/12/16 |
Internet address |
Keywords
- Fast Fourier Transform
- GPU
- Multiple Precision Multiplication
- Number Theoretic Transform