vLLM Outlines Library Denial-of-Service Vulnerability via Unbounded Cache

Vulnerability

A denial-of-service vulnerability affects vLLM versions prior to 0.8.0. vLLM uses the outlines library to support structured output (guided decoding). By default, outlines caches its compiled grammars on the local filesystem, and vLLM left this cache enabled. Because the cache is applied unconditionally and is unbounded, a malicious user can send a series of short decoding requests, each with a unique schema. Every such request adds a new cache entry, so the cache can grow until the filesystem runs out of space. The issue is reachable when vLLM is accessed through the OpenAI-compatible API server and applies only to the V0 engine; the V1 engine is not affected.
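
The growth pattern described above can be illustrated with a toy model. The sketch below is not the outlines or vLLM implementation: `cache_compiled_grammar`, the `.bin` file naming, and the 1 KiB placeholder payload are all assumptions made for illustration. It only shows that a disk cache keyed by schema content gains one entry per distinct schema, while repeated schemas add nothing.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cache_compiled_grammar(cache_dir: Path, schema: dict) -> Path:
    """Toy stand-in for a disk cache keyed by schema content (hypothetical helper)."""
    # Canonical JSON so that equal schemas always map to the same key.
    key = hashlib.sha256(json.dumps(schema, sort_keys=True).encode()).hexdigest()
    entry = cache_dir / f"{key}.bin"
    if not entry.exists():
        # Placeholder bytes standing in for a compiled grammar.
        entry.write_bytes(b"\0" * 1024)
    return entry


def count_entries(cache_dir: Path) -> int:
    """Number of cache files currently on disk."""
    return sum(1 for _ in cache_dir.glob("*.bin"))


def simulate(num_unique: int, repeats: int = 3) -> tuple[int, int]:
    """Return (entries after repeating one schema, entries after adding unique schemas)."""
    with tempfile.TemporaryDirectory() as tmp:
        cache_dir = Path(tmp)
        # The same schema sent several times produces a single entry.
        for _ in range(repeats):
            cache_compiled_grammar(cache_dir, {"type": "object", "title": "same"})
        after_repeats = count_entries(cache_dir)
        # Each distinct schema adds one more entry; nothing is ever evicted.
        for i in range(num_unique):
            cache_compiled_grammar(cache_dir, {"type": "object", "title": f"s{i}"})
        return after_repeats, count_entries(cache_dir)


if __name__ == "__main__":
    print(simulate(50))
```

In this model the entry count is linear in the number of distinct schemas seen, which is why an unbounded cache fed by untrusted input is a disk-exhaustion risk.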

Impact

Exploitation of this vulnerability can exhaust the free space on the filesystem that holds the cache, resulting in a denial-of-service condition.
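
Operators who want early warning of this condition could watch the size of the cache directory and the free space on its filesystem. The following is a minimal sketch using only the Python standard library. The function names and thresholds are hypothetical, and the path to monitor is left to the caller because the advisory does not state where the cache is stored.

```python
import os
import shutil
from pathlib import Path


def directory_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += (Path(dirpath) / name).stat().st_size
            except OSError:
                # The file may have been removed while walking; skip it.
                continue
    return total


def free_fraction(path: Path) -> float:
    """Fraction of the filesystem containing path that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def should_alert(path: Path, max_cache_bytes: int, min_free_fraction: float) -> bool:
    """True if the directory is larger than allowed or the disk is nearly full."""
    return (
        directory_size_bytes(path) > max_cache_bytes
        or free_fraction(path) < min_free_fraction
    )
```

A periodic job could call `should_alert(cache_path, max_cache_bytes=..., min_free_fraction=...)` with limits suited to the deployment and raise an alarm when it returns true.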

Remediation

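Users should upgrade to vLLM version 0.8.0 or later, where this vulnerability has been addressed. If the outlines cache is still needed after upgrading, it can be enabled explicitly by setting the 'VLLM_V0_USE_OUTLINES_CACHE' environment variable to '1'. Doing so restores the caching behavior described above, so it is best reserved for deployments where schemas come from trusted clients.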
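
A deployment check for the two conditions above might look like the sketch below. The helper names are hypothetical. The version parser only reads the leading `X.Y.Z` digits, so it ignores pre-release and local suffixes. The comparison against the exact string '1' follows the wording of this advisory and is an assumption about how vLLM interprets the variable.

```python
import re

# First release in which the issue is addressed, per the advisory.
FIXED_VERSION = (0, 8, 0)


def parse_release(version: str) -> tuple:
    """Extract the leading X.Y.Z numbers from a version string."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version: str) -> bool:
    """True if the given vLLM version is 0.8.0 or later."""
    return parse_release(version) >= FIXED_VERSION


def outlines_cache_opted_in(environ) -> bool:
    """True if the environment mapping explicitly turns the outlines cache on."""
    # Assumption: only the exact value "1" enables the cache.
    return environ.get("VLLM_V0_USE_OUTLINES_CACHE") == "1"
```

In practice the installed version can be read with `importlib.metadata.version("vllm")` and the environment passed as `os.environ`. A host that is patched but has the variable set to '1' has opted back into the cache.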

Added: Jun 9, 2025, 7:46 PM
Updated: Jun 9, 2025, 7:46 PM

Vulnerability Rating

Custom Algorithm
spread: 2.6
impact: 2.5
exploitability: 5.5
remediation: 8.3
relevance: 0.0
threat: 3.2
urgency: 2.9
incentive: 1.7

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.