Automatic Prefix Caching - acc fix (#900)
Fix for accuracy while using APC

---------

Co-authored-by: root <root@adobrzyniewicz-li4f-g2-mpijob-worker-0.adobrzyniewicz-li4f-g2-mpijob-worker.framework.svc.cluster.local>
adobrzyn and root authored Mar 10, 2025
1 parent 88dc1de commit 489a526
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion vllm/worker/hpu_model_runner.py
@@ -279,7 +279,8 @@ def _compile_region(self, model, name, module):
     def _set_attn_bias(self, attn_metadata, batch_size, seq_len, device,
                        dtype):
         if (attn_metadata is None
-                or (self.prefill_use_fusedsdpa and self.is_causal)
+                or (self.prefill_use_fusedsdpa and self.is_causal
+                    and attn_metadata.block_list is None)
                 or not attn_metadata.is_prompt):
             return attn_metadata
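
To make the effect of the changed condition easier to see, here is a minimal standalone sketch of the early-return predicate before and after this commit. The helper names `skips_bias_old` and `skips_bias_new` and the `SimpleNamespace` stand-ins for `attn_metadata` are hypothetical and are not part of vLLM. The reading that a non-`None` `block_list` during prompt processing corresponds to a prefill with cached prefix blocks is an assumption based on the commit title; the commit itself does not state it.

```python
from types import SimpleNamespace


def skips_bias_old(attn_metadata, prefill_use_fusedsdpa, is_causal):
    """Early-return condition of _set_attn_bias before the commit (sketch)."""
    return bool(attn_metadata is None
                or (prefill_use_fusedsdpa and is_causal)
                or not attn_metadata.is_prompt)


def skips_bias_new(attn_metadata, prefill_use_fusedsdpa, is_causal):
    """Early-return condition after the commit: the fused-SDPA causal
    shortcut applies only when block_list is None (sketch)."""
    return bool(attn_metadata is None
                or (prefill_use_fusedsdpa and is_causal
                    and attn_metadata.block_list is None)
                or not attn_metadata.is_prompt)


# Hypothetical metadata objects; only the two fields the condition reads.
plain_prefill = SimpleNamespace(is_prompt=True, block_list=None)
# Assumption: a prompt with a block_list models a prefix-cache hit.
cached_prefill = SimpleNamespace(is_prompt=True, block_list=[0, 1])
decode = SimpleNamespace(is_prompt=False, block_list=[0, 1])

for name, meta in [("plain_prefill", plain_prefill),
                   ("cached_prefill", cached_prefill),
                   ("decode", decode)]:
    print(name,
          "old:", skips_bias_old(meta, True, True),
          "new:", skips_bias_new(meta, True, True))
```

Under these assumptions, only the prompt case with a `block_list` changes behavior: the old condition returned early without setting an attention bias, while the new one falls through to the rest of `_set_attn_bias`.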
