vLLM Remote Code Execution Vulnerability in PyNcclPipe Integration

Vulnerability

A remote code execution vulnerability has been identified in vLLM, an inference and serving engine for large language models. It affects versions from 0.6.5 up to, but not including, 0.8.5, and arises in environments that use the PyNcclPipe KV cache transfer integration with the V0 engine. The PyNcclPipe service establishes a peer-to-peer communication domain for data transmission between distributed nodes. An attacker can send this service malicious serialized data, which is then unsafely deserialized, leading to remote code execution on the server.
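To illustrate the class of bug described above, the following is a minimal sketch, not vLLM's actual code. It assumes a pickle-style deserializer, which the advisory does not name explicitly. The names Marker, NoGlobalsUnpickler, and safe_loads are hypothetical and exist only for this example. The payload is deliberately harmless: it makes the loader call the built-in sorted function, showing that the sender, not the receiver, chooses which callable runs during deserialization.

```python
import io
import pickle


class Marker:
    """Benign stand-in for a sender-controlled object (hypothetical)."""

    def __reduce__(self):
        # Tells pickle: "to rebuild me, call sorted([3, 1, 2])".
        # A real attacker would name a dangerous callable instead.
        return (sorted, ([3, 1, 2],))


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global (class or function)."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


def safe_loads(data: bytes):
    """Deserialize plain containers and primitives only."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


payload = pickle.dumps(Marker())

# Plain pickle.loads runs the callable embedded in the payload.
unsafe_result = pickle.loads(payload)

# The restricted loader rejects the same payload.
try:
    safe_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

# Ordinary data still round-trips through the restricted loader.
plain = safe_loads(pickle.dumps({"a": [1, 2]}))
```

Restricting globals is only one possible mitigation and is shown here for contrast. Not deserializing untrusted network input with pickle at all, and not exposing the service to untrusted networks, are the more robust options.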

Impact

Exploitation of this vulnerability allows remote code execution on the vLLM host. The attacker's code runs with the privileges of the user account running the vLLM process.

Reproduction

To reproduce this vulnerability, deploy a vLLM service that uses the PyNcclPipe KV cache transfer integration with the V0 engine, and leave the service listening on all interfaces, which is the default behavior. Once the service is running, an attacker who can reach it over the network can send crafted packets whose payloads exploit the unsafe deserialization process, leading to remote code execution.

Remediation

Users can upgrade to vLLM version 0.8.5 or later, where this vulnerability has been addressed. Instructions for updating vLLM can be found in the vLLM documentation.
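As a rough aid for triage, the sketch below checks whether a vLLM version string falls inside the affected range stated in this advisory (0.6.5 up to, but not including, 0.8.5). The helper names parse_release, is_affected, and installed_vllm_affected are hypothetical. The parser is a simplification that reads only the leading numeric components, so pre-release and local version suffixes are ignored. A version inside the range is exposed only if the PyNcclPipe KV cache transfer integration with the V0 engine is actually in use.

```python
import re
from importlib.metadata import PackageNotFoundError, version

AFFECTED_FROM = (0, 6, 5)  # first affected release, per this advisory
FIXED_IN = (0, 8, 5)       # first fixed release, per this advisory


def parse_release(version_string: str) -> tuple:
    """Extract (major, minor, patch) from the start of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError(f"unrecognized version: {version_string!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_affected(version_string: str) -> bool:
    """True if the version lies in [0.6.5, 0.8.5)."""
    return AFFECTED_FROM <= parse_release(version_string) < FIXED_IN


def installed_vllm_affected():
    """Check the installed vllm package; None if it is not installed."""
    try:
        return is_affected(version("vllm"))
    except PackageNotFoundError:
        return None
```

For example, is_affected("0.8.4") is true and is_affected("0.8.5") is false under these assumptions.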

Added: Jun 9, 2025, 7:46 PM
Updated: Jun 9, 2025, 7:46 PM

Vulnerability Rating

Custom Algorithm

Category          Score
spread             2.6
impact            10.0
exploitability     9.1
remediation        7.7
relevance          0.0
threat             6.4
urgency            2.9
incentive         10.0

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.