GPU transform for FCMAE pre-training #196
Merged
Conversation
edyoshikun approved these changes on Nov 8, 2024:
merging this so we can merge to base
Building on top of #196:
Non-trivial augmentations on large imaging volumes have become a bottleneck when training image translation models. Keeping the initial cropping and normalization in the CPU dataloader workers, while executing the heavier transforms (especially the resampling ones) on the GPU, makes training significantly faster.
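As a rough illustration of this split, the sketch below applies batch-wise augmentations in Lightning's `on_after_batch_transfer` hook, which runs after the batch has been moved to the device. The batch keys, the transform choices, and the use of kornia are assumptions for illustration, not this PR's actual implementation:

```python
# Minimal sketch (not this PR's code) of the CPU/GPU transform split:
# cheap cropping/normalization stays in the CPU dataloader workers, while
# heavier resampling-style augmentations run batch-wise on the GPU.
import torch
from lightning.pytorch import LightningModule
import kornia.augmentation as K  # stands in for the project's transform library


class GpuTransformModule(LightningModule):
    """Applies heavy augmentations on the device after batch transfer."""

    def __init__(self) -> None:
        super().__init__()
        # Batch-native augmentations; resampling (affine) is the expensive
        # part that benefits most from running on the GPU.
        self.gpu_transforms = torch.nn.Sequential(
            K.RandomAffine(degrees=15.0, scale=(0.9, 1.1), p=0.5),
            K.RandomGaussianNoise(std=0.05, p=0.5),
        )

    def on_after_batch_transfer(self, batch: dict, dataloader_idx: int) -> dict:
        # Lightning calls this hook after the batch is on the device,
        # so everything here executes on the GPU.
        # The "source" key is a hypothetical name for the input channel.
        if self.trainer.training:
            batch["source"] = self.gpu_transforms(batch["source"])
        return batch
```

Because the hook sees whole batches already on the device, the resampling work is vectorized across the batch instead of being repeated per sample in the CPU workers.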
Current state of this PR
For 2D FCMAE, the same validation loss is reached at ~7x the speed of v0.2 (compare the respective training logs).
For 3D FCMAE, the total voxel throughput is ~3x that of v0.2, potentially limited by CPU-GPU transfer.
Caveat: since the transforms are now defined in the LightningModule instead of the DataModule, they are the same for every dataset. (Fixed.)