Aiter readme #400

Merged 5 commits on Feb 3, 2025
Changes from 3 commits
26 changes: 26 additions & 0 deletions docs/dev-docker/README.md
@@ -434,6 +434,21 @@ python /app/vllm/benchmarks/benchmark_latency.py --model amd/Llama-3.1-405B-Inst

You should see some improvement in end-to-end (e2e) latency.

### AITER

To enable [AITER](https://github.com/ROCm/aiter) kernel support, follow the [Docker build steps](#Docker-manifest) using the [aiter_intergration_final](https://github.com/ROCm/vllm/tree/aiter_intergration_final) branch.
A release candidate image is published at `rocm/vllm-dev:nightly_aiter_intergration_final_20250130`.

The feature is controlled by the following environment variables:

```bash
VLLM_USE_AITER # Toggle the feature as a whole. When off, none of the following are enabled. Off by default, unless the docker image above is used.
VLLM_USE_AITER_MOE # Enable MoE AITER kernel. On by default.
# Review comment (Collaborator): remove
VLLM_USE_AITER_PAGED_ATTN # Enable AITER Paged Attention kernel. Off by default.
# Review comment (Collaborator): remove
VLLM_USE_AITER_LINEAR # Enable AITER GEMM kernels. On by default.
# Review comment (Collaborator): remove
VLLM_USE_AITER_NORM # Enable AITER RMS Norm kernel. On by default.
# Review comment (Collaborator): remove
```
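
As a rough illustration of how these toggles might be combined, the sketch below sets the master switch, turns on the paged attention kernel (off by default per the list above), and opts out of the MoE kernel. It assumes the variables accept `1` for on and `0` for off, which this diff does not state, so check `vllm/envs.py` on the branch before relying on it. The sub-flags come from this revision of the diff and were marked "remove" in review, so they may not exist in the merged README.

```bash
# Assumption: 1 = on, 0 = off (accepted values are not documented in this diff).
export VLLM_USE_AITER=1             # master switch; sub-flags are ignored when this is off
export VLLM_USE_AITER_PAGED_ATTN=1  # paged attention kernel is off by default
export VLLM_USE_AITER_MOE=0         # opt out of the MoE kernel only

# Show the AITER-related settings now present in the environment.
env | grep '^VLLM_USE_AITER' | sort
```

Exporting the variables in the shell (or passing them with `docker run -e`) before launching vLLM keeps the toggles visible in one place, which makes it easier to compare runs with individual kernels switched on or off.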

## MMLU_PRO_Biology Accuracy Evaluation

### FP16
@@ -470,3 +485,14 @@ To reproduce the release docker:
git checkout 8e87b08c2a284c1a20eb3d8e0fbdc84918bf27dc
docker build -f Dockerfile.rocm -t <your_tag> --build-arg BUILD_HIPBLASLT=1 --build-arg USE_CYTHON=1 .
```

### AITER

Use the AITER release candidate branch instead:

```bash
git clone https://github.com/ROCm/vllm.git
cd vllm
git checkout aiter_intergration_final
docker build -f Dockerfile.rocm -t <your_tag> --build-arg BUILD_HIPBLASLT=1 --build-arg USE_CYTHON=1 .
```