
[V1][Frontend] Improve Shutdown And Logs #11737

Open
wants to merge 85 commits into base: main

Conversation

@robertgshaw2-redhat (Collaborator) commented Jan 4, 2025

SUMMARY:

  • Prior to this PR, if we encountered an error in a background process, we killed the process tree immediately, which meant we could not clean up resources or return useful status codes to clients. This PR overhauls the error handling to instead shut down the background processes and raise errors that allow us to return proper HTTP status codes to users.
  • Prior to this PR, we did not shut down properly when errors occurred during startup, especially in the TP case.
  • Prior to this PR, we used signals to catch errors from background processes. Because Python only allows signal handlers in the main thread, this prevented us from running outside the main thread, which is a problem for deployments in TritonServer.

DESIGN:

  • For errors during startup, we wrap the __init__ code in try...except and push a FAILED message over the ready pipe. This works well since the parent process is waiting for confirmation.
  • For errors during runtime, we wrap the busy loops in try...except and push failure messages over the existing IPC mechanisms (a minimal sketch follows below).
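A minimal sketch of the runtime pattern described in the second bullet (the helper names and message format here are illustrative, not the actual vLLM internals):

```python
# Sketch only: wrap a worker busy loop so that a runtime failure is pushed
# over the existing IPC channel instead of killing the process tree.
# `worker` and `ipc_socket` are stand-ins for the real objects.
def worker_busy_loop(worker, ipc_socket):
    try:
        while True:
            method, args, kwargs = ipc_socket.recv_pyobj()
            result = getattr(worker, method)(*args, **kwargs)
            ipc_socket.send_pyobj(("SUCCESS", result))
    except Exception as e:
        # Report the failure to the parent so it can raise a proper error
        # (and ultimately return a real HTTP status code to clients).
        ipc_socket.send_pyobj(("FAILURE", e))
        raise
```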

One weakness is that issues with the IPC mechanisms themselves are not handled explicitly.

  • Curious if anyone has ideas on this.
  • This can be a follow-on task.

TEST MATRIX:

  • AsyncLLM, TP=1 and TP>1 --- runtime and startup
  • LLM (MP), TP=1 and TP>1 --- runtime and startup
  • LLM (no-MP), TP=1 and TP>1 --- runtime and startup

Fixes: #12690


github-actions bot commented Jan 4, 2025

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

  • Add ready label to the PR
  • Enable auto-merge.

🚀

@mergify mergify bot added the frontend label Jan 4, 2025

mergify bot commented Jan 4, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @robertgshaw2-neuralmagic.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Jan 4, 2025
@mergify mergify bot removed the needs-rebase label Jan 4, 2025
@robertgshaw2-redhat robertgshaw2-redhat marked this pull request as ready for review January 4, 2025 16:29
@robertgshaw2-redhat robertgshaw2-redhat changed the title [Frontend] Improve API Server Error Messages [Frontend] Improve API Server Error Logs Jan 4, 2025
@robertgshaw2-redhat robertgshaw2-redhat changed the title [Frontend] Improve API Server Error Logs [V1][Frontend] Improve Error Handling Shutdown And Logs Jan 4, 2025
@robertgshaw2-redhat (Collaborator Author)

Here is what the server logs look like for:

  • TP=2, 1000 concurrent streaming requests
  • A simulated illegal memory access on RANK 1 after 200 engine steps (a sketch of how such a failure can be injected follows the logs)
...
INFO:     127.0.0.1:45354 - "POST /v1/completions HTTP/1.1" 200 OK
INFO:     127.0.0.1:45360 - "POST /v1/completions HTTP/1.1" 200 OK
INFO:     127.0.0.1:45368 - "POST /v1/completions HTTP/1.1" 200 OK
INFO:     127.0.0.1:45372 - "POST /v1/completions HTTP/1.1" 200 OK
INFO:     127.0.0.1:45388 - "POST /v1/completions HTTP/1.1" 200 OK
INFO:     127.0.0.1:45394 - "POST /v1/completions HTTP/1.1" 200 OK
INFO 01-04 17:21:02 core.py:247] RUNNING: 306 | WAITING: 628
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401] WorkerProc hit an exception: %s
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401] Traceback (most recent call last):
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]   File "/home/rshaw/vllm/vllm/v1/executor/multiproc_executor.py", line 397, in worker_busy_loop
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]     output = getattr(self.worker, method)(*args, **kwargs)
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]   File "/home/rshaw/vllm/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]     return func(*args, **kwargs)
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]            ^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]   File "/home/rshaw/vllm/vllm/v1/worker/gpu_worker.py", line 204, in execute_model
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]     output = self.model_runner.execute_model(scheduler_output)
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]   File "/home/rshaw/vllm/venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]     return func(*args, **kwargs)
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]            ^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]   File "/home/rshaw/vllm/vllm/v1/worker/gpu_model_runner.py", line 615, in execute_model
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]     hidden_states = self.model(
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]                     ^^^^^^^^^^^
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]   File "/home/rshaw/vllm/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]     return self._call_impl(*args, **kwargs)
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]   File "/home/rshaw/vllm/venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]     return forward_call(*args, **kwargs)
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]   File "/home/rshaw/vllm/vllm/model_executor/models/llama.py", line 571, in forward
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401]     raise RuntimeError("ERROR IN LLAMA!")
(VllmWorker rank=0 pid=1068781) ERROR 01-04 17:21:04 multiproc_executor.py:401] RuntimeError: ERROR IN LLAMA!
ERROR 01-04 17:21:04 core.py:200] EngineCore hit an exception: Traceback (most recent call last):
ERROR 01-04 17:21:04 core.py:200]   File "/home/rshaw/vllm/vllm/v1/engine/core.py", line 193, in run_engine_core
ERROR 01-04 17:21:04 core.py:200]     engine_core.run_busy_loop()
ERROR 01-04 17:21:04 core.py:200]   File "/home/rshaw/vllm/vllm/v1/engine/core.py", line 231, in run_busy_loop
ERROR 01-04 17:21:04 core.py:200]     outputs = self.step()
ERROR 01-04 17:21:04 core.py:200]               ^^^^^^^^^^^
ERROR 01-04 17:21:04 core.py:200]   File "/home/rshaw/vllm/vllm/v1/engine/core.py", line 124, in step
ERROR 01-04 17:21:04 core.py:200]     output = self.model_executor.execute_model(scheduler_output)
ERROR 01-04 17:21:04 core.py:200]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-04 17:21:04 core.py:200]   File "/home/rshaw/vllm/vllm/v1/executor/multiproc_executor.py", line 167, in execute_model
ERROR 01-04 17:21:04 core.py:200]     model_output = self.collective_rpc("execute_model",
ERROR 01-04 17:21:04 core.py:200]                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 01-04 17:21:04 core.py:200]   File "/home/rshaw/vllm/vllm/v1/executor/multiproc_executor.py", line 161, in collective_rpc
ERROR 01-04 17:21:04 core.py:200]     raise e
ERROR 01-04 17:21:04 core.py:200]   File "/home/rshaw/vllm/vllm/v1/executor/multiproc_executor.py", line 150, in collective_rpc
ERROR 01-04 17:21:04 core.py:200]     raise result
ERROR 01-04 17:21:04 core.py:200] RuntimeError: ERROR IN LLAMA!
ERROR 01-04 17:21:04 core.py:200] 
CRITICAL 01-04 17:21:04 async_llm.py:65] AsyncLLM got fatal signal from worker process, shutting down. See stack trace for root cause.
CRITICAL 01-04 17:21:05 launcher.py:91] Engine failed, terminating server.
INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [1067793]
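For reference, a failure like the one in these logs can be injected with a monkeypatch along these lines (a sketch only; the call-count threshold and the `evil_forward` wrapper are illustrative of the test setup, not a copy of it):

```python
# Sketch: make the Llama forward pass raise after ~200 engine steps, which
# produces the "ERROR IN LLAMA!" traceback shown above.
from vllm.model_executor.models.llama import LlamaForCausalLM

_original_forward = LlamaForCausalLM.forward
_calls = 0

def evil_forward(self, *args, **kwargs):
    global _calls
    _calls += 1
    if _calls > 200:
        raise RuntimeError("ERROR IN LLAMA!")
    return _original_forward(self, *args, **kwargs)

LlamaForCausalLM.forward = evil_forward
```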

@@ -195,20 +209,12 @@ def signal_handler(signum, frame):
signal.signal(signal.SIGTERM, signal_handler)
signal.signal(signal.SIGINT, signal_handler)

parent_process = psutil.Process().parent()
engine_core = None
engine_core = EngineCoreProc(*args, **kwargs)
Collaborator Author

Note for reviewer:

  • EngineCoreProc sends a FAILED message over the ready_pipe if there is an issue during startup.
  • This is why it is outside of the try ... except block (sketched below).
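A minimal sketch of why the construction can live outside the outer try block (the message values and helper names are illustrative, not the exact vLLM implementation):

```python
# Sketch: the constructor reports its own startup outcome over the ready
# pipe, so run_engine_core does not need to wrap construction in the
# runtime try/except; the waiting parent sees READY or FAILED either way.
class EngineCoreProcSketch:
    def __init__(self, ready_pipe, *args, **kwargs):
        try:
            self._initialize(*args, **kwargs)   # hypothetical heavy startup work
            ready_pipe.send("READY")
        except Exception:
            ready_pipe.send("FAILED")
            raise

    def _initialize(self, *args, **kwargs):
        pass  # load model, set up IPC, spawn workers, etc.
```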

Member

It would be good to leave this as a comment in the code.

@DarkLight1337 (Member)

Thanks for the improvement and writing these tests to verify this! I don't have issues with this PR as long as those tests pass.

@DarkLight1337 (Member) commented Feb 9, 2025

I'll double check the format of the error logs after the weekend.


mergify bot commented Feb 10, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @robertgshaw2-redhat.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Feb 10, 2025
@mergify mergify bot removed the needs-rebase label Feb 10, 2025
@russellb (Member) left a comment

just some test comments for now

@@ -187,6 +187,7 @@ steps:
commands:
# split the test to avoid interference
- VLLM_USE_V1=1 pytest -v -s v1/core
- VLLM_USE_V1=1 pytest -v -s v1/shutdown
Member

just sort of a side note, but it seems like updating these commands would be really easy to miss when adding new tests in a new directory.

Comment on lines +43 to +44
# Monkeypatch an error in the model.
monkeypatch.setattr(LlamaForCausalLM, "forward", evil_forward)
Member

Since you created a monkeypatch.context(), did you mean to use m.setattr(...) here?

From reading the docs, it sounds like both will result in the same behavior, though I would find m.setattr(...) a bit more clear that the change is limited to this context.

Suggested change
- # Monkeypatch an error in the model.
- monkeypatch.setattr(LlamaForCausalLM, "forward", evil_forward)
+ # Monkeypatch an error in the model.
+ m.setattr(LlamaForCausalLM, "forward", evil_forward)
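For reference, a minimal sketch of the context-manager form being discussed (pytest's standard monkeypatch fixture; LlamaForCausalLM and evil_forward are the names from the test under review):

```python
# Sketch: monkeypatch.context() scopes the patch to the with-block, so the
# original forward method is restored as soon as the block exits.
def test_engine_failure(monkeypatch):
    with monkeypatch.context() as m:
        m.setattr(LlamaForCausalLM, "forward", evil_forward)
        ...  # start the engine and assert it fails and cleans up properly
    # Outside the block, LlamaForCausalLM.forward is back to the original.
```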

# Confirm all the processes are cleaned up.
wait_for_gpu_memory_to_clear(
devices=list(range(tensor_parallel_size)),
threshold_bytes=2 * 2**30,
Member

This number kind of looks like magic. It would be great to put it in a constant somewhere, with a comment explaining it.

Member

We do have GiB_bytes defined in vllm.utils; you can consider using that.
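For example, the threshold above could then read (a sketch, assuming GiB_bytes from vllm.utils as the reviewer describes, with wait_for_gpu_memory_to_clear being the test helper quoted above):

```python
from vllm.utils import GiB_bytes  # per the reviewer: a GiB-in-bytes constant

# Confirm all the processes are cleaned up, allowing up to 2 GiB of
# residual GPU memory per device.
wait_for_gpu_memory_to_clear(
    devices=list(range(tensor_parallel_size)),
    threshold_bytes=2 * GiB_bytes,
)
```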


MODELS = [
"meta-llama/Llama-3.2-1B", # Raises on first fwd pass.
"mistralai/Mixtral-8x22B-Instruct-v0.1" # Causes OOM.
Member

Is this going to download the model and get a real OOM? From a quick look, it doesn't look like this is used elsewhere, so that'd be a net-new model to download during tests? If so, that doesn't seem worth the cost, especially given how unreliable HF has been in CI lately. Maybe I'm misunderstanding, though!

Collaborator Author

Good idea.

I do think it is important to flex both cases here, since there is a subtle difference:

  • "meta-llama/Llama-3.2-1B" (raises on first fwd pass): the error happens during profiling, after the IPC mechanisms are created
  • "mistralai/Mixtral-8x22B-Instruct-v0.1" (causes OOM): the error happens during weight loading, before the IPC mechanisms are set up

I will instead do a monkeypatch to raise an error in load_weights for that case.
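A hypothetical sketch of that load_weights monkeypatch (the exact hook and error message in the PR's test may differ; LlamaForCausalLM is the model class from the tests above):

```python
# Sketch: force a failure during weight loading, i.e. before the IPC
# mechanisms are set up, without downloading a huge model.
def evil_load_weights(self, *args, **kwargs):
    raise RuntimeError("ERROR DURING WEIGHT LOADING!")

def test_startup_failure_during_weight_loading(monkeypatch):
    with monkeypatch.context() as m:
        m.setattr(LlamaForCausalLM, "load_weights", evil_load_weights)
        ...  # starting the engine should fail cleanly and release resources
```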

Member

monkeypatch sounds good if the error encountered is clear enough!

terminate_if_errored(
server=server,
engine=request.app.state.engine_client,
)

return Response(status_code=HTTPStatus.INTERNAL_SERVER_ERROR)
Member

I just want to check an assumption here -- by responding with a 500 / Internal Server Error, we're saying that every case that reaches this code is an unexpected error from something being broken. It should never be something like the input from the user request was invalid, as that should get a different response.

Hopefully all of those cases are caught sooner, but I just wanted to check.

@robertgshaw2-redhat (Collaborator Author) commented Feb 10, 2025

Correct.

  • EngineDeadError is an unrecoverable error in EngineCore (the best example is an illegal memory access)
  • EngineGenerateError is a recoverable, request-specific error (e.g. an invalid request)

Looking at this again, we should consider removing the RuntimeError from this for V1.
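As an illustration of that split, a hypothetical handler shape (not the actual vLLM frontend code; EngineDeadError, EngineGenerateError, and terminate_if_errored are the names discussed above, with their imports omitted):

```python
from http import HTTPStatus

from fastapi import Response

async def completions_with_error_handling(engine, server, request):
    try:
        return await engine.generate(request)
    except EngineGenerateError:
        # Recoverable, request-specific failure: the engine stays up.
        return Response(status_code=HTTPStatus.BAD_REQUEST)
    except EngineDeadError:
        # Unrecoverable (e.g. illegal memory access): terminate the server
        # so the deployment can restart it, and report a 500 to the client.
        terminate_if_errored(server=server, engine=engine)
        return Response(status_code=HTTPStatus.INTERNAL_SERVER_ERROR)
```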

Member

Thanks.

Do you know where RuntimeError would come from? It does seem rather broad and generic compared to the other stuff we're handling here.

Collaborator

@russellb that was me: https://github.com/vllm-project/vllm/pull/6594/files

Back in the good old days of the V0 AsyncLLMEngine, the error handling was much less comprehensive and we could end up with RuntimeErrors raised to the frontend. Often the engine was dead in this case, but there was no handler in the frontend to catch that and kill the server immediately.

I think the V0 engine is robust enough these days that we could consider removing the RuntimeError check entirely. I do like the safety of having a broader catch and log so we know what's going on when some error case slips through the cracks though.

Comment on lines -203 to -204
except SystemExit:
logger.debug("EngineCore interrupted.")
Member

Is it intentional that you don't want to catch SystemExit anymore?

Collaborator Author

Originally when I wrote this, I thought that SystemExit would be caught by Exception, so I wanted different behavior.

But in the intervening time I learned that the difference between Exception and BaseException is that except Exception does not catch SystemExit or asyncio.CancelledError, so the separate handler is not needed here.
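A quick illustration of that distinction (plain Python semantics, nothing vLLM-specific):

```python
import asyncio

# Both derive from BaseException rather than Exception (CancelledError
# since Python 3.8), so `except Exception` does not swallow them.
assert not issubclass(SystemExit, Exception)
assert not issubclass(asyncio.CancelledError, Exception)

try:
    raise SystemExit(0)
except Exception:
    print("never reached: SystemExit is not an Exception")
except BaseException as e:
    print(f"caught only at the BaseException level: {type(e).__name__}")
```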

Collaborator Author

I'm happy to leave this in, though, if it makes it clearer that we have two cases.

Member

I don't think I'm understanding clearly, sorry!

You've pointed out that SystemExit does NOT derive from Exception, but the change seems to reflect the opposite, dropping it assuming Exception will cover it.

The proposed change makes sense if you intentionally do NOT want to catch and handle SystemExit here at all. Is that the intention?

Comment on lines +214 to +215
except Exception as e:
raise self._format_exception(e) from None
Member

You called this out in the PR description - I think it'd be worth filing an issue to help us remember to come back and look at zmq error handling. Depending on the error, our handling should differ. For example, a ZMQError exception with an errno of EAGAIN means we should try again, as opposed to reporting an error.
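A sketch of the kind of differentiated handling being suggested (zmq.Again is pyzmq's ZMQError subclass for EAGAIN; the retry policy shown is illustrative):

```python
import errno

import zmq

def recv_or_retry(socket: zmq.Socket):
    try:
        return socket.recv(flags=zmq.NOBLOCK)
    except zmq.Again:
        # EAGAIN: nothing available yet, so retry rather than treating
        # this as an engine failure.
        return None
    except zmq.ZMQError as e:
        if e.errno == errno.EINTR:
            # Interrupted system call: also safe to retry.
            return None
        # Anything else is a genuine IPC failure worth surfacing.
        raise
```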

Comment on lines -319 to -320
except SystemExit:
logger.debug("Worker interrupted.")
Member

are you intentionally not catching this anymore?

Collaborator Author

See the other comment about SystemExit.


mergify bot commented Feb 11, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @robertgshaw2-redhat.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Feb 11, 2025
@@ -129,7 +128,8 @@ def step(self) -> EngineCoreOutputs:
return engine_core_outputs

def shutdown(self):
self.model_executor.shutdown()
if self.model_executor:
model_executor.shutdown()
Member

Suggested change
- model_executor.shutdown()
+ self.model_executor.shutdown()

"""Raised when the EngineCore dies. Unrecoverable."""

def __init__(self, *args, suppress_context: bool = False, **kwargs):
ENGINE_DEAD_MESSAGE = "EngineCore encountered an issue. See stack trace for the root cause." # noqa: E501
Member

Suggested change
- ENGINE_DEAD_MESSAGE = "EngineCore encountered an issue. See stack trace for the root cause."  # noqa: E501
+ ENGINE_DEAD_MESSAGE = "EngineCore encountered an issue. See stack trace (above) for the root cause."  # noqa: E501

I'm happy with the current error format. The only change I'd suggest is to explicitly tell users to look above the final error message for the full stack trace.

Successfully merging this pull request may close these issues.

[Bug]: V1 cannot be run in Triton Inference Server Backend
6 participants