🚀 HEonGPU - A GPU Based Homomorphic Encryption Library

🚨 New Application: Private Information Retrieval on GPU (PIRonGPU)

PIRonGPU is a high-performance library that enhances secure data retrieval through Private Information Retrieval (PIR) on GPUs. By modifying the SealPIR protocol with HEonGPU, it achieves rapid, confidential querying, offering an efficient and scalable solution for privacy-sensitive applications.

🚨 New Feature: Logic Operation and 3 More CKKS Bootstrapping Types

HEonGPU now provides comprehensive support for logic operations across both the BFV and CKKS encryption schemes. In addition, the latest update introduces three new CKKS Bootstrapping types; two of which leverage Bit Bootstrapping and Gate Bootstrapping techniques, while the third employs Slim Bootstrapping, a method that is significantly more efficient than Regular Bootstrapping. These enhancements not only broaden HEonGPU’s functionality but also significantly improve its performance in managing noise and enabling efficient, secure computations on GPU platforms.

The Logic Operations supported:

NOT, AND, NAND, OR, NOR, XOR, XNOR

3 More CKKS Bootstrapping Types:

Slim Bootstrapping (supports only Real Numbers)
Bit Bootstrapping (supports only Binary Numbers)
Gate Bootstrapping (supports only Binary Numbers)
- AND, NAND, OR, NOR, XOR, XNOR

Execution times of the HEonGPU Bootstrapping Operations (on RTX 4090)

Bootstrapping Type	N	Slot Count	LKM	Remaining Level	Total Time	Amortized Time
Slim Bootstrapping	2^16	2^15	ON	0 Level	99.12 ms	3.02 µs
	2^16	2^15	ON	2 Level	114.13 ms	3.48 µs
	2^16	2^15	ON	4 Level	164.20 ms	5.01 µs
Bit Bootstrapping	2^15	2^14	OFF	0 Level	33.74 ms	2.06 µs
	2^15	2^14	OFF	2 Level	39.36 ms	2.40 µs
	2^15	2^14	OFF	4 Level	46.54 ms	2.84 µs
	2^15	2^14	OFF	6 Level	55.66 ms	3.40 µs
	2^16	2^15	OFF	0 Level	86.69 ms	2.73 µs
	2^16	2^15	OFF	2 Level	100.72 ms	3.07 µs
	2^16	2^15	OFF	4 Level	115.88 ms	3.53 µs
Gate Bootstrapping*	2^15	2^14	OFF	0 Level	27.03 ms	1.64 µs
	2^16	2^15	OFF	0 Level	70.73 ms	2.16 µs

LKM: Less Key Mode is a bootstrapping optimization in HEonGPU. Its purpose is to reduce the required amount of Galois keys by 30% while sacrificing 15–20% performance. This is useful in cases where GPU memory is insufficient.

*: For all gates

🚨 New Feature: CKKS Regular Bootstrapping

HEonGPU now includes support for CKKS Regular Bootstrapping, enabling efficient evaluation of deep computational circuits with high precision and security. On an NVIDIA RTX 4090, it performs CKKS Regular Bootstrapping for N=65536 in under 170 ms.

🚨 New Feature: Multiparty Computation (MPC) Support

HEonGPU now includes support for Multiparty Computation (MPC) protocols, providing a secure and collaborative framework for encrypted computations. By incorporating Multiparty Homomorphic Encryption (MHE) capabilities, the library enables distributed computations with threshold encryption models such as N-out-of-N. The implementation is fully optimized for GPU environments, delivering minimal latency and maximum performance in collaborative settings.

HEonGPU is a high-performance library designed to optimize Fully Homomorphic Encryption (FHE) operations on GPUs. By leveraging the parallel processing power of GPUs, it significantly reduces the computational load of FHE through concurrent execution of complex operations. Its multi-stream architecture enables efficient parallel processing and minimizes the overhead of data transfers between the CPU and GPU. These features make HEonGPU ideal for large-scale encrypted computations, offering reduced latency and improved performance.

The goal of HEonGPU is to provide:

A high-performance framework for executing FHE schemes, specifically BFV and CKKS, by leveraging the parallel processing capabilities of CUDA.
A user-friendly C++ interface that requires no prior knowledge of GPU programming, with all CUDA kernels encapsulated in easy-to-use classes.
An optimized multi-stream architecture that ensures efficient memory management and concurrent execution of encrypted computations on the GPU.

For more information about HEonGPU: https://eprint.iacr.org/2024/1543

Current HEonGPU Capabilities and Schemes

Capability / Scheme	HEonGPU
BFV	✓
CKKS	✓
BGV	Soon
TFHE	Very Soon
CKKS Regular Bootstrapping	✓
CKKS Slim Bootstrapping	✓
CKKS Bit Bootstrapping	✓
CKKS Gate Bootstrapping	✓
Multiparty Computation (MPC)	✓

Installation

Requirements

CMake >=3.26.4
GCC
GMP
CUDA Toolkit >=11.4

Third-Party Dependencies

HEonGPU automatically handle third-party dependencies like GPU-NTT, GPU-FFT, RMM, Thrust, GoogleTest.

Build & Install

To build and install HEonGPU, follow the steps below. This includes configuring the project using CMake, compiling the source code, and installing the library on your system.

GPU Architecture	Compute Capability (CMAKE_CUDA_ARCHITECTURES Value)
Volta	70, 72
Turing	75
Ampere	80, 86
Ada	89, 90

$ cmake -S . -D CMAKE_CUDA_ARCHITECTURES=89 -B build
$ cmake --build ./build/
$ sudo cmake --install build

Testing & Benchmarking

To run tests:

$ cmake -S . -D HEonGPU_BUILD_TESTS=ON -D CMAKE_CUDA_ARCHITECTURES=89 -B build
$ cmake --build ./build/

$ ./build/bin/test/<...>
$ Example: ./build/bin/test/bfv_addition_testcases

To run benchmarks:

$ cmake -S . -D HEonGPU_BUILD_BENCHMARKS=ON -D CMAKE_CUDA_ARCHITECTURES=89 -B build
$ cmake --build ./build/

$ ./build/bin/benchmark/<...>
$ Example: ./build/bin/benchmark/bfv_benchmark

Examples

To run examples:

$ cmake -S . -D HEonGPU_BUILD_EXAMPLES=ON -D CMAKE_CUDA_ARCHITECTURES=89 -B build
$ cmake --build ./build/

$ ./build/bin/examples/<...>
$ Example: ./build/bin/examples/1_basic_bfv

Toy Example

#include "heongpu.cuh"

int main() {
    heongpu::Parameters context(heongpu::scheme_type::bfv,
            heongpu::keyswitching_type::KEYSWITCHING_METHOD_I);

    size_t poly_modulus_degree = 8192;
    context.set_poly_modulus_degree(poly_modulus_degree);
    context.set_default_coeff_modulus(1);
    int plain_modulus = 1032193;
    context.set_plain_modulus(plain_modulus);
    context.generate();

    heongpu::HEKeyGenerator keygen(context);
    heongpu::Secretkey secret_key(context);
    keygen.generate_secret_key(secret_key);

    heongpu::Publickey public_key(context);
    keygen.generate_public_key(public_key, secret_key);

    heongpu::HEEncoder encoder(context);
    heongpu::HEEncryptor encryptor(context, public_key);
    heongpu::HEDecryptor decryptor(context, secret_key);
    heongpu::HEOperator operators(context);

    std::vector<uint64_t> message(poly_modulus_degree, 8ULL);
    heongpu::Plaintext P1(context);
    encoder.encode(P1, message);

    heongpu::Ciphertext C1(context);
    encryptor.encrypt(C1, P1);

    operators.add_inplace(C1, C1);

    heongpu::Plaintext P2(context);
    decryptor.decrypt(P2, C1);

    std::vector<uint64_t> result;
    encoder.decode(result, P2);

    return 0;
}

Configuration Header (define.h)

The define.h file is an essential configuration file for HEonGPU, containing key settings that define the library's limits and capabilities, including polynomial degrees, modulus bit-lengths, and memory pool sizes.

Features in define.h:

Polynomial Degree: MAX_POLY_DEGREE (65536) and MIN_POLY_DEGREE (4096) define the range for polynomial degrees used in FHE.
Modulus Bit-Length: MAX_USER_DEFINED_MOD_BIT_COUNT (60), MIN_USER_DEFINED_MOD_BIT_COUNT (30), MAX_MOD_BIT_COUNT (61), MIN_MOD_BIT_COUNT (30) specify valid bit-lengths for user-defined and general modulus values.
Maximum BFV Auxiliary Base Count: MAX_BSK_SIZE (64) sets the maximum size for the BFV auxiliary base.
Galois Key Capability: MAX_SHIFT (8) controls the maximum rotation capability for Galois keys. Don't forget to change it if you need more rotation steps(for default galois key generation).
Device:
- Initial size (0.5): 50% of GPU memory.
- Max size (0.8): 80% of GPU memory.
Host:
- Initial size (0.1): 10% of CPU memory.
- Max size (0.2): 20% of CPU memory.

If your system allows, you can redefine the memory pool sizes to better suit your use case.

Storage Management

In HEonGPU, Fully Homomorphic Encryption (FHE) operations rely on an efficient storage management system that ensures data can seamlessly transition between host (CPU) and device (GPU) memory. The Storage Manager module enables this flexibility, allowing data to be stored and accessed as needed on either the host or device.

How Storage Management Works

The ExecutionOptions struct in HEonGPU determines how input data is handled during and after GPU computations. Specifically:

set_storage_type: Defines where the output data should reside after the operation (DEVICE or HOST).
set_initial_location: Determines whether the input data should return to its original location after the computation.

// ...
ExecutionOptions options;
options.set_storage_type(storage_type::DEVICE)
       .set_initial_location(true);

operators.add(input1, input2, output, options);
// ...

Simple case:
- If input1 and input2 are initially on the HOST, they will be copied to the DEVICE for the computation.
- After the operation, since set_initial_location(true) is specified, both input1 and input2 will return to the HOST.
- The output will remain on the DEVICE, as specified by set_storage_type(storage_type::DEVICE).
All case:

`set_storage_type`	`set_initial_location`	Input Locations (Before)	Input Locations (After)	Output Location After Computation
`DEVICE`	`true`	`input1: HOST, input2: HOST`	`input1: HOST, input2: HOST`	`DEVICE`
`DEVICE`	`false`	`input1: HOST, input2: HOST`	`input1: DEVICE, input2: DEVICE`	`DEVICE`
`DEVICE`	`true`	`input1: HOST, input2: DEVICE`	`input1: HOST, input2: DEVICE`	`DEVICE`
`DEVICE`	`false`	`input1: HOST, input2: DEVICE`	`input1: DEVICE, input2: DEVICE`	`DEVICE`
`DEVICE`	`true`	`input1: DEVICE, input2: DEVICE`	`input1: DEVICE, input2: DEVICE`	`DEVICE`
`DEVICE`	`false`	`input1: DEVICE, input2: DEVICE`	`input1: DEVICE, input2: DEVICE`	`DEVICE`
`HOST`	`true`	`input1: HOST, input2: HOST`	`input1: HOST, input2: HOST`	`HOST`
`HOST`	`false`	`input1: HOST, input2: HOST`	`input1: DEVICE, input2: DEVICE`	`HOST`
`HOST`	`true`	`input1: HOST, input2: DEVICE`	`input1: HOST, input2: HOST`	`HOST`
`HOST`	`false`	`input1: HOST, input2: DEVICE`	`input1: DEVICE, input2: DEVICE`	`HOST`
`HOST`	`true`	`input1: DEVICE, input2: DEVICE`	`input1: DEVICE, input2: DEVICE`	`HOST`
`HOST`	`false`	`input1: DEVICE, input2: DEVICE`	`input1: DEVICE, input2: DEVICE`	`HOST`

Using HEonGPU in a downstream CMake project

Make sure HEonGPU is installed before integrating it into your project. The installed HEonGPU library provides a set of config files that make it easy to integrate HEonGPU into your own CMake project. In your CMakeLists.txt, simply add:

project(<your-project> LANGUAGES CXX CUDA)
find_package(CUDAToolkit REQUIRED)
# ...
find_package(HEonGPU)
# ...
target_link_libraries(<your-target> (PRIVATE|PUBLIC|INTERFACE) HEonGPU::heongpu CUDA::cudart)
# ...
set_target_properties(<your-target> PROPERTIES CUDA_SEPARABLE_COMPILATION ON)
# ...

How to Cite HEonGPU

Please use the below BibTeX, to cite HEonGPU in academic papers.

@misc{cryptoeprint:2024/1543,
      author = {Ali Şah Özcan and Erkay Savaş},
      title = {{HEonGPU}: a {GPU}-based Fully Homomorphic Encryption Library 1.0},
      howpublished = {Cryptology {ePrint} Archive, Paper 2024/1543},
      year = {2024},
      url = {https://eprint.iacr.org/2024/1543}
}

Please use the below BibTeX, to cite key-switching optimizations in academic papers.

@misc{cryptoeprint:2025/124,
      author = {Ali Şah Özcan and Erkay Savaş},
      title = {{GPU} Implementations of Three Different Key-Switching Methods for Homomorphic Encryption Schemes},
      howpublished = {Cryptology {ePrint} Archive, Paper 2025/124},
      year = {2025},
      url = {https://eprint.iacr.org/2025/124}
}

License

This project is licensed under the Apache License. For more details, please refer to the License file.

Contact

If you have any questions or feedback, feel free to contact me:

Email: [email protected]
LinkedIn: Profile

Name		Name	Last commit message	Last commit date
Latest commit History 43 Commits
benchmark		benchmark
cmake		cmake
dev-tools		dev-tools
example		example
src		src
test		test
thirdparty		thirdparty
.clang-format		.clang-format
.gitignore		.gitignore
.gitmodules		.gitmodules
CMakeLists.txt		CMakeLists.txt
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
multiparty_computation.svg		multiparty_computation.svg

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🚀 HEonGPU - A GPU Based Homomorphic Encryption Library

🚨 New Application: Private Information Retrieval on GPU (PIRonGPU)

🚨 New Feature: Logic Operation and 3 More CKKS Bootstrapping Types

Execution times of the HEonGPU Bootstrapping Operations (on RTX 4090)

🚨 New Feature: CKKS Regular Bootstrapping

🚨 New Feature: Multiparty Computation (MPC) Support

Current HEonGPU Capabilities and Schemes

Installation

Requirements

Third-Party Dependencies

Build & Install

Testing & Benchmarking

Examples

Toy Example

Configuration Header (define.h)

Storage Management

How Storage Management Works

Using HEonGPU in a downstream CMake project

How to Cite HEonGPU

License

Contact

About

Releases

Packages

Languages

License

AdrianHeath988/adrianHEonGPU

Folders and files

Latest commit

History

Repository files navigation

🚀 HEonGPU - A GPU Based Homomorphic Encryption Library

🚨 New Application: Private Information Retrieval on GPU (PIRonGPU)

🚨 New Feature: Logic Operation and 3 More CKKS Bootstrapping Types

Execution times of the HEonGPU Bootstrapping Operations (on RTX 4090)

🚨 New Feature: CKKS Regular Bootstrapping

🚨 New Feature: Multiparty Computation (MPC) Support

Current HEonGPU Capabilities and Schemes

Installation

Requirements

Third-Party Dependencies

Build & Install

Testing & Benchmarking

Examples

Toy Example

Configuration Header (define.h)

Storage Management

How Storage Management Works

Using HEonGPU in a downstream CMake project

How to Cite HEonGPU

License

Contact

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages