vLLM Denial-of-Service Vulnerability via Unbounded 'n' Parameter in OpenAI API Server

Vulnerability

A denial-of-service vulnerability has been identified in the vLLM OpenAI-compatible API server, affecting versions from 0.1.0 up to, but not including, 0.19.0. The issue arises from the absence of upper-bound validation on the 'n' parameter in the ChatCompletionRequest and CompletionRequest Pydantic models. An unauthenticated attacker can send a single HTTP request with an excessively large 'n' value, which stalls the Python asyncio event loop and triggers an immediate out-of-memory crash. The root cause is that the engine allocates millions of copies of the request object on the heap before the request even reaches the scheduling queue, which blocks the server from processing other requests.
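The missing check described above can be illustrated with a minimal Pydantic sketch. This is not the actual vLLM patch: the model name BoundedCompletionRequest and the MAX_N constant are hypothetical, and the only point is to show how an upper bound on 'n' rejects an oversized value at parse time, before any per-sequence objects are allocated.

```python
from pydantic import BaseModel, Field, ValidationError

# Hypothetical cap chosen for illustration; not a value taken from vLLM.
MAX_N = 128


class BoundedCompletionRequest(BaseModel):
    """Simplified stand-in for a completion request model (not vLLM's class)."""

    prompt: str
    # ge/le make Pydantic reject values outside [1, MAX_N] during validation.
    n: int = Field(default=1, ge=1, le=MAX_N)


def is_accepted(payload: dict) -> bool:
    """Return True if the payload passes validation, False otherwise."""
    try:
        BoundedCompletionRequest(**payload)
    except ValidationError:
        return False
    return True
```

With this sketch, is_accepted({"prompt": "hi", "n": 4}) passes, while a payload with 'n' set to 100000000 fails validation instead of reaching the engine.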

Impact

Exploitation of this vulnerability leads to resource exhaustion: the host runs out of memory, the server stops processing other requests, and the operating system's OOM-killer terminates the vLLM process. The attack is particularly impactful because it requires only a single HTTP request, so it does not depend on the sustained traffic volume that conventional bandwidth-based denial-of-service attacks need.

Reproduction

The vulnerability can be reproduced by sending an HTTP request to the vLLM OpenAI-compatible API server with an 'n' parameter value that exceeds the default limit of 16384. This can be done with a tool such as Postman or curl by specifying a large 'n' value in the request body. Once the request is sent, the server becomes unresponsive and eventually crashes due to the out-of-memory condition.

Remediation

Users should update to vLLM version 0.19.0 or later, where this vulnerability has been fixed. For public-facing deployments, it is recommended to set the 'VLLM_MAX_N_SEQUENCES' environment variable to a value appropriate for the workload, such as 64 or 128, to limit the impact of a single request. Additionally, consider validating request bodies and applying rate limiting at the reverse proxy layer to further mitigate the risk.
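As a rough illustration of request body validation in front of the server, the following standard-library sketch checks the 'n' field of a raw JSON body before it would be forwarded. The function name check_completion_body and the MAX_N and MAX_BODY_BYTES constants are hypothetical and not part of vLLM or any particular proxy; a real deployment would wire an equivalent check into its own gateway or middleware.

```python
import json

# Hypothetical limits for illustration; tune them to the workload.
MAX_N = 64
MAX_BODY_BYTES = 1_000_000


def check_completion_body(raw: bytes, max_n: int = MAX_N) -> tuple[bool, str]:
    """Return (allowed, reason) for a raw JSON request body."""
    if len(raw) > MAX_BODY_BYTES:
        return False, "request body too large"
    try:
        body = json.loads(raw)
    except ValueError:
        # Covers malformed JSON, bad encodings, and oversized integer literals.
        return False, "body is not valid JSON"
    if not isinstance(body, dict):
        return False, "body must be a JSON object"
    n = body.get("n", 1)  # treat a missing 'n' as 1
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(n, bool) or not isinstance(n, int):
        return False, "'n' must be an integer"
    if n < 1 or n > max_n:
        return False, f"'n' must be between 1 and {max_n}"
    return True, "ok"
```

A gateway using this sketch would answer rejected requests with an HTTP 4xx response and forward only bodies for which the first element of the result is True. Note that this version also rejects an explicit null for 'n', which is a design choice made here for simplicity.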

Added: Apr 6, 2026, 4:27 PM
Updated: Apr 6, 2026, 4:27 PM

Vulnerability Rating

Custom Algorithm
spread: 2.6
impact: 2.5
exploitability: 8.8
remediation: 8.3
relevance: 5.4
threat: 4.8
urgency: 2.9
incentive: 8.3

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.