Accelerating multiple precision multiplication in GPU with Kepler architecture

Boon-Chiao Chang, Bok-Min Goi, Raphael C.W. Phan, Wai-Kong Lee

Research output: Chapter in Book/Report/Conference proceedingConference PaperResearchpeer-review

4 Citations (Scopus)

Abstract

Multiple precision multiplication is widely used in scientific computing and cryptography. When the size of integer grows beyond computer precision (32-bit or 64-bit), the computational cost of multiplication becomes significant. In this paper, we proposed a novel solution to implement multiple precision multiplication in massively parallel GPU with Kepler architecture. Our implementation is designed based on Chinese Remainder Theorem and Number Theoretic Transform with 64-bit prime. We implemented three versions of multiple precision multiplication which utilized global memory, shared memory and registers to store the precomputed twiddle factors. The register version use warp shuffle instruction (available in GPU with Kepler architecture) to exchange data among threads within the same warp. Thist echnique is able to avoid bank conflict issue in shared memory and allow faster computation in GPU. To the best of our knowledge, this is the first implementation reported in the literature that utilized warp shuffle instruction to accelerate NTT computation. Our best implementation is able to perform 1024-bit, 2048-bit, 4096-bit and 8192-bit multiplication in 0.095ms, 0.169ms, 0.444ms and 1.113ms respectively.

Original languageEnglish
Title of host publicationProceedings - 18th IEEE International Conference on High Performance Computing and Communications, 14th IEEE International Conference on Smart City and 2nd IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2016
EditorsLaurence T. Yang, Jinjun Chen
PublisherIEEE, Institute of Electrical and Electronics Engineers
Pages844-851
Number of pages8
ISBN (Electronic)9781509042968
DOIs
Publication statusPublished - 2016
Externally publishedYes
EventIEEE International Conference on High Performance Computing and Communications 2016 - Sydney, Australia
Duration: 12 Dec 201614 Dec 2016
Conference number: 18th
https://ieeexplore.ieee.org/xpl/conhome/7819725/proceeding (Proceedings)

Conference

ConferenceIEEE International Conference on High Performance Computing and Communications 2016
Abbreviated titleHPCC 2016
Country/TerritoryAustralia
CitySydney
Period12/12/1614/12/16
Internet address

Keywords

  • Fast Fourier Transform
  • GPU
  • Multiple Precision Multiplication
  • Number Theoretic Transform

Cite this