TRILLI is a novel Versal-based accelerator for 3D rigid image registration. TRILLI is designed to address the computational challenges in both key components of the registration process, geometric transformation with interpolation and similarity metric computation, by optimally mapping computational steps to heterogeneous hardware components on the Versal VCK5000.
System Architecture Diagram: TRILLI integration with a CPU-based Powell optimizer for multi-modal 3D rigid image registration. Input images are used for an initial transformation and accelerated registration via TRILLI. The resulting MI is used by the Powell optimizer
to iteratively refine transformation parameters based on user-defined settings. The final output is a registered floating volume.
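The mutual information (MI) that drives the optimizer can be illustrated with a minimal pure-Python sketch. It uses the textbook definition over a joint histogram and is not the TRILLI hardware kernel; the function name mutual_information and the flat integer-intensity inputs are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(ref, flt):
    """MI (in nats) between two equally sized sequences of integer intensities."""
    if len(ref) != len(flt) or not ref:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(ref)
    joint = Counter(zip(ref, flt))  # joint histogram of intensity pairs
    hist_ref = Counter(ref)         # marginal histogram of the reference
    hist_flt = Counter(flt)         # marginal histogram of the floating image
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        ratio = count * n / (hist_ref[a] * hist_flt[b])
        mi += p_ab * math.log(ratio)
    return mi
```

A higher value means the two images share more information, which is why an optimizer such as Powell's method can use it as the score to maximize while varying the transformation parameters.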
- Hardware Device: Versal VCK5000 XDMA2022.1
- Vitis 2022.1
- XRT 2022.1
- OpenCV 3.0.0 - static library
- Python 3.8
- GCC 7.3.1
- 3DIRG_application/: complete registration framework
- aie/: AI Engines source code
- common/: constants and configuration generator
- data_movers/: PL kernels source code
- hw/: system integration and output bitstream
- mutual_info/: PL mutual information kernel source code from Hephaestus
- soa/: GPU 3D Image Registration from athena, with scripts for simplifying testing
- sw/: host source code
- default.cfg: architecture configuration parameters
The following describes three testing flows:
- Case 1: using the given bitstream to test image transformation and/or image registration step
- Case 2: building from scratch the desired bitstream for image transformation and/or image registration step
- Case 3: target the whole image registration application. This can be done either using the given bitstream for image registration step, or building it from scratch
Available bitstreams:
- bitstreams/OnlyTX_32IPE.xclbin: for the image transformation only (TX)
- bitstreams/RegStep_32IPE.xclbin: for the image registration step and complete application (STEP)

The file bitstreams/config_DIM512_IPE32.cfg contains the configuration used to build the bitstreams.
- Source Vitis & XRT
source <YOUR_PATH_TO_XRT>/setup.sh
source <YOUR_PATH_TO_VITIS>/2022.1/settings64.sh
- Move into the root folder of this repository
cd <your-path>/trilli
- Load the configuration used to build the bitstreams
make config CFG=bitstreams/config_DIM512_IPE32.cfg
- Compile the host code
make build_sw TASK=[TX|STEP]
- Pack the build into a single folder, ready for testing
make pack XCLBIN=<xclbin-path> [NAME=<name>]
This command generates the folder build/<name> (default name is hw_build), which will contain the bitstream <xclbin-path> (e.g. bitstreams/OnlyTX_32IPE.xclbin), the host executable and the dataset.
- Move the generated folder build/<name> (e.g. build/hw_build) to the deploy machine
- On the deploy machine, generate the dataset with
./generate_dataset.sh [dim] [depth]
With the parameters used for the paper, the command is:
./generate_dataset.sh 512 256
- Source XRT on the deploy machine
source <YOUR_PATH_TO_XRT>/setup.sh
- Run on the deploy machine with:
./host_overlay.exe [depth] [x] [y] [ang_degrees] [num_runs]
With the parameters used for the paper, the command is (TODO: fill in parameters):
./host_overlay.exe 512 18.54458648 -12.30391042 20.0 1
- Source Vitis & XRT
source <YOUR_PATH_TO_XRT>/setup.sh
source <YOUR_PATH_TO_VITIS>/2022.1/settings64.sh
- Move into the root folder of this repository & build the transformation standalone bitstream
cd <your-path>/trilli
- Edit the default.cfg file to detail the desired configuration.
For Transformation, relevant parameters are:
  - DIMENSION := XYZ - represents the image resolution DIMENSION x DIMENSION. Choices = [1,2,4,8,16]
  - INT_PE := XY - Number of Interpolation Processing Elements. Choices: [1,2,4,8,16,32]
  - PIXELS_PER_READ := XYZ - represents the port width. Choices: [32,64,128]

For the rigid step, instead:
  - DIMENSION := XY - represents the image resolution DIMENSION x DIMENSION, 512 for the paper
  - HIST_PE := XY - Histogram Processing elements for Mutual Information. Choices = [1,2,4,8,16]
  - EPE_PE := XY - Histogram Processing elements for Mutual Information. Choices = [1,2,4,8,16]
  - INT_PE := XY - Number of Interpolation Processing Elements. Choices: [1,2,4,8,16,32]
  - PIXELS_PER_READ := XYZ - represents the port width. Choices: [32,64,128]
- Prepare the folder to be moved to the deploy machine (default name is hw_build)
make build_and_pack TARGET=hw TASK=[TX|STEP] [NAME=<name>]
- Move the generated folder build/<name> (e.g. build/hw_build) to the deploy machine
- On the deploy machine, follow the dataset generation, XRT sourcing, and run steps from Case 1.
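Since a bitstream build is long, it can help to sanity-check default.cfg before launching it. The sketch below is a hypothetical helper, not part of the repository: it assumes the file uses one NAME := VALUE assignment per line and that '#' starts a comment, and it checks only the parameters whose choices are listed in this README. DIMENSION is parsed but not range-checked, because the choices listed for it do not include the 512 used for the paper.

```python
import re

# Allowed values as listed in this README (DIMENSION deliberately not checked).
CHOICES = {
    "INT_PE": {1, 2, 4, 8, 16, 32},
    "PIXELS_PER_READ": {32, 64, 128},
    "HIST_PE": {1, 2, 4, 8, 16},
    "EPE_PE": {1, 2, 4, 8, 16},
}

_ASSIGN = re.compile(r"^\s*(\w+)\s*:=\s*(\S+)\s*$")


def parse_cfg(text):
    """Return {name: value-string} for every `NAME := VALUE` line in text."""
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # assumption: '#' starts a comment
        match = _ASSIGN.match(line)
        if match:
            params[match.group(1)] = match.group(2)
    return params


def check_cfg(params):
    """Return a list of human-readable problems; empty means none were found."""
    problems = []
    for name, allowed in CHOICES.items():
        if name not in params:
            continue  # parameter not used by this task
        try:
            value = int(params[name])
        except ValueError:
            problems.append(f"{name}: '{params[name]}' is not an integer")
            continue
        if value not in allowed:
            problems.append(f"{name}: {value} not in {sorted(allowed)}")
    return problems
```

A possible use is to read default.cfg, pass its text to parse_cfg, and abort before running make if check_cfg returns a non-empty list.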
- Compile the software application. Note the hard requirement: OpenCV 3.0.0 must be installed and statically compiled.
make build_app
- Move the CT volume into 3DIRG_application/PET_small/png and the PET volume into 3DIRG_application/CT_small/png
- Prepare the folder (default name is hw_build)
make pack_app [NAME=<name>]
Note 1: This command prepares a folder by copying the dataset volumes from 3DIRG_application/CT_small/png and 3DIRG_application/PET_small/png. Therefore, the images must be there before using this command.
Note 2: This command copies a newly generated bitstream. To use the premade one, you need to manually copy it into the generated folder.
- Move the folder to the deploy machine
- Execute the application:
./exec.sh
- Follow Case 1, selecting STEP as TASK
- Compile the software application. Note the hard requirement: OpenCV 3.0.0 must be installed and statically compiled.
make build_app
- Move the CT volume into 3DIRG_application/PET_small/png and the PET volume into 3DIRG_application/CT_small/png
- Prepare the folder (default name is hw_build)
make pack_app [NAME=<name>]
Note 1: This command prepares a folder by copying the dataset volumes from 3DIRG_application/CT_small/png and 3DIRG_application/PET_small/png. Therefore, the images must be there before using this command.
- Execute the application:
./exec.sh
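Because make pack_app copies the volumes from the two png folders named in Note 1, a quick pre-flight check can catch an empty or missing folder before packing. The following is a hypothetical helper, not part of the repository. The requirement that both volumes hold the same number of slices is an assumption of this sketch, not something the README states.

```python
from pathlib import Path


def count_png_slices(root, volumes=("CT_small", "PET_small")):
    """Return {volume: number of .png files found in root/volume/png}."""
    counts = {}
    for volume in volumes:
        folder = Path(root) / volume / "png"
        counts[volume] = len(list(folder.glob("*.png"))) if folder.is_dir() else 0
    return counts


def ready_to_pack(root):
    """True when both volumes are non-empty and (assumption) equally deep."""
    counts = count_png_slices(root)
    return all(counts.values()) and len(set(counts.values())) == 1
```

For example, calling ready_to_pack("3DIRG_application") from the repository root before make pack_app reports whether both folders are populated.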
To plot each result figure in the paper, please refer to the corresponding folder under paper_fig/. Each folder contains a subfolder with the figure name and a dedicated readme with running instructions. For each figure, we provide dedicated .csv files containing the data needed to replicate the paper result.
- Hephaestus - Mutual Information & CPU-FPGA 3D image registration
- Athena - GPU-based 3D image registration
- Vitis Libraries - WarpAffine3D kernel for image transformation
- ITK powell-based 3D image registration
- SimpleITK powell-based 3D image registration
If you find this repository useful, please use the following citation:
@inproceedings{sorrentino2025trilli,
title = {Soaring with TRILLI: an HW/SW Heterogeneous Accelerator for Multi-Modal Image Registration},
author = {Sorrentino, Giuseppe and Galfano, Paolo S. and D'Arnese, Eleonora and Conficconi, Davide},
year = 2025,
booktitle={2025 IEEE 33rd Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)},
organization={IEEE}
}