vLLM Denial-of-Service Vulnerability via Large HTTP Headers

Vulnerability

A denial-of-service vulnerability has been identified in vLLM, an inference and serving engine for large language models. It affects versions from 0.1.0 up to, but not including, 0.10.1.1. The vulnerability can be triggered by sending a single HTTP GET request with an extremely large header to an HTTP endpoint. The server exhausts its memory while handling the request, which can crash it or leave it unresponsive. The attack requires no authentication, so any remote user who can reach the endpoint can exploit it.
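As a quick way to apply the version range above, the following sketch checks whether a given vLLM version string falls inside it. The helper names (parse_version, is_affected, installed_vllm_version) are illustrative and not part of vLLM. The parser only reads the leading dotted-integer part of the string, so it treats pre-release or development builds of a release the same as the release itself.

```python
import re
from importlib import metadata

# Range taken from the advisory text: 0.1.0 <= version < 0.10.1.1
FIRST_AFFECTED = (0, 1, 0)
FIRST_FIXED = (0, 10, 1, 1)


def parse_version(text):
    """Return the leading dotted-integer part of a version string as a tuple.

    Suffixes such as 'rc1' or '+cu128' are ignored (a simplification).
    """
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_affected(version_text):
    """True if the version lies in the affected range stated in the advisory."""
    return FIRST_AFFECTED <= parse_version(version_text) < FIRST_FIXED


def installed_vllm_version():
    """Return the installed vLLM version string, or None if vLLM is not installed."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_vllm_version()
    if found is None:
        print("vLLM is not installed in this environment")
    else:
        print(found, "affected" if is_affected(found) else "not affected")
```

Tuple comparison handles the four-part fixed version naturally: (0, 10, 1) sorts before (0, 10, 1, 1), so 0.10.1 is reported as affected while 0.10.1.1 is not.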

Impact

Successful exploitation exhausts the server's memory, which can crash the vLLM server or leave it unresponsive to other requests.

Reproduction

To reproduce this vulnerability, send an HTTP GET request with a very large header to any endpoint handled by vLLM. The 'X-Forwarded-For' header can be used for this purpose by setting it to a value of around 5.8 billion bytes. The request can be sent with tools such as curl or Postman, or with a script that builds the oversized header.

Remediation

Upgrade to vLLM version 0.10.1.1 or later, which includes default protections against this type of header abuse. Alternatively, place a reverse proxy that limits request header size in front of vLLM to provide similar protection.
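The idea behind both mitigations is a bounded read: stop buffering a request once its header block exceeds a fixed limit, instead of accumulating it in memory. The sketch below illustrates that technique on a generic byte stream. It is not vLLM's actual fix or any particular proxy's implementation; the names read_header_block and HeaderTooLarge and the 64 KiB limit are assumptions chosen for illustration.

```python
import io

# Hypothetical limit for illustration; pick a value suited to your deployment.
MAX_HEADER_BYTES = 64 * 1024


class HeaderTooLarge(Exception):
    """Raised when the header block does not end within the allowed size."""


def read_header_block(stream, max_bytes=MAX_HEADER_BYTES, chunk_size=4096):
    """Read an HTTP header block from a binary stream with a hard size cap.

    Returns (head, rest): the header block including the terminating blank
    line, and any bytes already read past it. Never buffers more than
    max_bytes before giving up.
    """
    buffer = bytearray()
    while True:
        end = buffer.find(b"\r\n\r\n")
        if end != -1:
            return bytes(buffer[: end + 4]), bytes(buffer[end + 4:])
        if len(buffer) >= max_bytes:
            raise HeaderTooLarge(f"header block exceeds {max_bytes} bytes")
        # Never request more than the remaining budget.
        chunk = stream.read(min(chunk_size, max_bytes - len(buffer)))
        if not chunk:
            raise ValueError("stream ended before the header block was complete")
        buffer += chunk


if __name__ == "__main__":
    oversized = b"GET / HTTP/1.1\r\nX-Forwarded-For: " + b"A" * 200_000 + b"\r\n\r\n"
    try:
        read_header_block(io.BytesIO(oversized))
    except HeaderTooLarge as exc:
        print("rejected:", exc)
```

A front end that rejects at this point would typically close the connection or answer with an error status before the request ever reaches the inference server, so memory use per connection stays capped at the chosen limit.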

Added: Aug 21, 2025, 3:28 PM
Updated: Aug 21, 2025, 3:28 PM

Vulnerability Rating

Custom Algorithm
spread: 2.6
impact: 2.5
exploitability: 8.8
remediation: 7.7
relevance: 0.4
threat: 4.8
urgency: 2.9
incentive: 10.0

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.