vLLM Regular Expression Denial-of-Service Vulnerability in Python Tool Parser

Vulnerability

A Regular Expression Denial-of-Service (ReDoS) vulnerability has been identified in vLLM, an inference and serving engine for large language models. It affects versions from 0.6.4 up to, but not including, 0.9.0, specifically the Python tool parser used for OpenAI integration. The parser detects tool calls with a complex regular expression that combines multiple nested quantifiers with optional groups. On crafted input this pattern is susceptible to catastrophic backtracking, so matching time grows exponentially with input length. An attacker can therefore submit input that saturates the server's CPU and leaves the vLLM service unresponsive or causes it to crash. While a request is stalled in regex parsing, the memory it holds, including GPU memory, is released late, which can exhaust GPU memory and further destabilize the service.
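To illustrate what a nested quantifier looks like, the following is a minimal sketch using a toy pattern. It is not the regular expression from vLLM; the names NESTED, FLAT, and accepts are hypothetical and chosen only for this example. Both patterns accept exactly the same strings, but the nested form has many more ways to split a run of characters, and a regex engine that backtracks may try all of them before reporting a failed match.

```python
import re

# Toy patterns, NOT the vLLM regex. Both accept one or more "a" characters.
NESTED = re.compile(r"^(a+)+$")  # a quantifier inside a quantified group
FLAT = re.compile(r"^a+$")       # same language, no nesting


def accepts(pattern, text):
    """Return True if the whole text matches the pattern."""
    return pattern.match(text) is not None


# Kept short on purpose: with the nested pattern, each extra "a" roughly
# doubles the work done before the engine can report a failed match.
almost_matching = "a" * 16 + "b"

for name, pattern in (("nested", NESTED), ("flat", FLAT)):
    print(name, accepts(pattern, "aaaa"), accepts(pattern, almost_matching))
```

Both patterns give the same answers here; the difference is only in how much work the failing case costs as the input grows.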

Impact

Exploitation of this vulnerability leads to a denial-of-service condition: the vLLM service becomes unavailable because processing the malicious input consumes excessive CPU. The server may freeze or crash, disrupting any ongoing tasks or processes. In addition, a request stalled in regex parsing can hold on to significant memory, particularly GPU memory, which is crucial for model inference, for longer than normal. This delayed release of memory can result in GPU memory exhaustion, reduced performance, and overall service instability.

Reproduction

The vulnerability can be reproduced by sending a request to an API endpoint that triggers tool call parsing. The payload should exercise the nested quantifiers in the regular expression, for example a string that repeats the tool-call pattern with varying argument formats. Measuring the server's response time as the payload grows demonstrates the exponential slowdown caused by the crafted input.
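The measurement step can be sketched locally, without contacting any server. The example below is an assumption-laden illustration: TOY_PATTERN is a stand-in with nested quantifiers, not the pattern from vLLM, and time_failed_match and measure are hypothetical helper names. It times a failed match for payloads of increasing length so the growth in matching time can be observed directly.

```python
import re
import time

# Stand-in pattern with nested quantifiers; the real vLLM regex is different.
TOY_PATTERN = re.compile(r"^(a+)+$")


def time_failed_match(n):
    """Time one failed match against n repeated characters plus a mismatch."""
    payload = "a" * n + "b"
    start = time.perf_counter()
    result = TOY_PATTERN.match(payload)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing "b" guarantees there is no match
    return elapsed


def measure(sizes):
    """Return (size, seconds) pairs, one per payload size."""
    return [(n, time_failed_match(n)) for n in sizes]


if __name__ == "__main__":
    # Sizes are deliberately small; larger values can take a very long time.
    for n, seconds in measure([10, 14, 18, 20]):
        print(f"n={n:2d}  {seconds:.4f}s")
```

Exact timings depend on the machine and Python version, so no expected output is shown. Against a real deployment, the same idea applies to end-to-end response time rather than to a single local match.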

Remediation

Users should update to vLLM version 0.9.0 or later, where this vulnerability has been patched. After updating, verify that the application functions correctly and monitor for any issues related to tool call parsing.
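As a rough aid for checking whether an installation falls in the affected range, the sketch below compares a version string against the bounds stated in this advisory (0.6.4 inclusive, 0.9.0 exclusive). The helper names release_tuple, is_affected, and installed_vllm_affected are hypothetical. It assumes the package is installed under the distribution name "vllm" and only looks at the leading numeric part of the version, so pre-release and local builds should be checked by hand.

```python
import re
from importlib import metadata

FIRST_AFFECTED = (0, 6, 4)  # lower bound stated in the advisory (inclusive)
FIRST_FIXED = (0, 9, 0)     # first patched release per the advisory


def release_tuple(version):
    """Parse the leading numeric release part, e.g. '0.8.5.post1' -> (0, 8, 5)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))  # pad '0.9' to (0, 9, 0)
    return tuple(parts[:3])


def is_affected(version):
    """True if the version lies in [0.6.4, 0.9.0).

    Simplification: suffixes such as 'rc1' or '+cu121' are ignored, so a
    pre-release of 0.9.0 is treated the same as 0.9.0 itself.
    """
    return FIRST_AFFECTED <= release_tuple(version) < FIRST_FIXED


def installed_vllm_affected():
    """Check the installed 'vllm' distribution; None if it is not installed."""
    try:
        return is_affected(metadata.version("vllm"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("installed vllm affected:", installed_vllm_affected())
```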

Added: Jun 9, 2025, 7:46 PM
Updated: Jun 9, 2025, 7:46 PM

Vulnerability Rating

Custom Algorithm
spread: 2.6
impact: 2.5
exploitability: 6.2
remediation: 7.7
relevance: 0.0
threat: 6.4
urgency: 2.9
incentive: 1.7

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.