Brax + rocm 6.2.4 #335

Open
Delaunay opened this issue Feb 25, 2025 · 1 comment

Comments

@Delaunay
Collaborator

brax.0 [start] /home/testroot/rocm/results/venv/torch/bin/voir --config /home/testroot/rocm/results/extra/brax/voirconf-brax.0-0efae956f1553a76c1e03985181900f5.json /home/testroot/milabench/benchmarks/brax/main.py --episode-length 20 --batch-size 1024 --num-minibatches 32 --num-envs 8192 [at 2025-02-25 17:16:40.478362]
brax.0 [stderr] :0:rocdevice.cpp            :2984: 159730272363 us: [pid:403868 tid:0x7f37b81ff640] Callback: Queue 0x7ef728200000 aborting with error : HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION: The agent attempted to access memory beyond the largest legal address. code: 0x29
brax.0 [end (-6)] /home/testroot/rocm/results/venv/torch/bin/voir --config /home/testroot/rocm/results/extra/brax/voirconf-brax.0-0efae956f1553a76c1e03985181900f5.json /home/testroot/milabench/benchmarks/brax/main.py --episode-length 20 --batch-size 1024 --num-minibatches 32 --num-envs 8192 [at 2025-02-25 17:16:56.320569]
@Delaunay Delaunay added the rocm label Feb 25, 2025
@Delaunay
Collaborator Author

brax.0 [start] /home/testroot/rocm/results/venv/torch/bin/voir --config /home/testroot/rocm/results/extra/brax/voirconf-brax.0-0efae956f1553a76c1e03985181900f5.json /home/testroot/milabench/benchmarks/brax/main.py --episode-length 20 --batch-size 512 --num-minibatches 32 --num-envs 1024 [at 2025-02-25 17:19:51.006861]
brax.0 [stderr] :0:rocdevice.cpp            :2984: 159920615215 us: [pid:406310 tid:0x7f540d9ff640] Callback: Queue 0x7f137c300000 aborting with error : HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION: The agent attempted to access memory beyond the largest legal address. code: 0x29
brax.0 [end (-6)] /home/testroot/rocm/results/venv/torch/bin/voir --config /home/testroot/rocm/results/extra/brax/voirconf-brax.0-0efae956f1553a76c1e03985181900f5.json /home/testroot/milabench/benchmarks/brax/main.py --episode-length 20 --batch-size 512 --num-minibatches 32 --num-envs 1024 [at 2025-02-25 17:20:06.660588]

Reducing --batch-size to 512 and --num-envs to 1024 gives the same error. It does not even seem that the memory gets allocated in the first place.
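
One way to compare the two runs above is to diff the benchmark flags and the reported HSA status. The sketch below is only illustrative: `parse_flags`, `hsa_status`, and `changed_flags` are hypothetical helpers, not part of milabench or voir, and the command lines are shortened copies of the logs above (paths, pids, and timestamps dropped).

```python
import re
import shlex

# Shortened copies of the two runs logged above (assumption: only the
# benchmark flags and the HSA message matter for this comparison).
FIRST_CMD = "main.py --episode-length 20 --batch-size 1024 --num-minibatches 32 --num-envs 8192"
SECOND_CMD = "main.py --episode-length 20 --batch-size 512 --num-minibatches 32 --num-envs 1024"
FIRST_ERR = (
    "Callback: Queue 0x7ef728200000 aborting with error : "
    "HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION: The agent attempted to "
    "access memory beyond the largest legal address. code: 0x29"
)
SECOND_ERR = (
    "Callback: Queue 0x7f137c300000 aborting with error : "
    "HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION: The agent attempted to "
    "access memory beyond the largest legal address. code: 0x29"
)


def parse_flags(command):
    """Return {flag: value} for each `--flag value` pair in a command line."""
    tokens = shlex.split(command)
    flags = {}
    for i, tok in enumerate(tokens[:-1]):
        nxt = tokens[i + 1]
        # Skip boolean flags, i.e. a flag directly followed by another flag.
        if tok.startswith("--") and not nxt.startswith("--"):
            flags[tok] = nxt
    return flags


def hsa_status(stderr_line):
    """Return (status_name, code) from an HSA abort message, or None."""
    m = re.search(r"(HSA_STATUS_\w+).*?code: (0x[0-9a-fA-F]+)", stderr_line)
    return m.groups() if m else None


def changed_flags(a, b):
    """Return {flag: (value_in_a, value_in_b)} for flags that differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    print(changed_flags(parse_flags(FIRST_CMD), parse_flags(SECOND_CMD)))
    print(hsa_status(FIRST_ERR) == hsa_status(SECOND_ERR))
```

With these inputs, only `--batch-size` and `--num-envs` differ, and both runs report the same status and code. That matches the observation that shrinking the workload does not change the failure.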

@Delaunay Delaunay added the jax label Feb 25, 2025