vLLM
cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*:*
- >= 0.6.5, < 0.8.5
A remote code execution vulnerability has been identified in vLLM, an inference and serving engine for large language models. It affects versions from 0.6.5 up to, but not including, 0.8.5, in environments that use the PyNcclPipe KV cache transfer integration with the V0 engine. The PyNcclPipe service establishes a peer-to-peer communication domain for data transmission between distributed nodes. An attacker who can reach this service can send malicious serialized data, which the service deserializes unsafely, resulting in remote code execution on the server.
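The following minimal sketch illustrates why deserializing untrusted data can lead to code execution. The advisory text does not name the serialization format, so the sketch assumes Python's pickle purely for illustration. The `Payload` class and the JSON message layout are hypothetical, and the callable used here (`int`) is deliberately harmless.

```python
import json
import pickle


class Payload:
    """Hypothetical object whose pickled form names a callable to run on load."""

    def __reduce__(self):
        # pickle stores (callable, args) and invokes callable(*args) during
        # loads(). A harmless built-in is used here; an attacker would choose
        # a dangerous callable instead.
        return (int, ("42",))


blob = pickle.dumps(Payload())

# The receiver asked for "an object" but actually ran a sender-chosen callable:
# the result is the return value of int("42"), not a Payload instance.
result = pickle.loads(blob)

# A data-only format such as JSON cannot name callables, so parsing it does
# not execute sender-chosen code (assumed message layout, for illustration).
safe_message = json.loads(json.dumps({"op": "ping", "seq": 1}))
```

The point of the sketch is that the sender, not the receiver, decides what runs during unpickling, which is why a network-reachable endpoint that unpickles peer data is exposed to remote code execution.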
Successful exploitation gives the attacker remote code execution on the vLLM host. The injected code runs with the privileges of the user account that runs the vLLM process.
To reproduce this vulnerability, deploy a vLLM service that uses the PyNcclPipe KV cache transfer integration with the V0 engine. The service must listen on all network interfaces, which is the default behavior. Once the service is running, an attacker with network access to it can send crafted packets whose malicious payloads trigger the unsafe deserialization, leading to remote code execution.
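Because exposure depends on the service listening on all interfaces, a deployment audit may want to flag wildcard bind addresses. The helper below is a minimal sketch of such a check using Python's standard `ipaddress` module. The function name `binds_all_interfaces` is hypothetical and is not part of vLLM.

```python
import ipaddress


def binds_all_interfaces(host: str) -> bool:
    """Return True if `host` is a wildcard bind address (0.0.0.0, :: or empty)."""
    if host == "":
        # For Python sockets, an empty host string means INADDR_ANY.
        return True
    try:
        # is_unspecified is True for 0.0.0.0 (IPv4) and :: (IPv6).
        return ipaddress.ip_address(host).is_unspecified
    except ValueError:
        # Hostnames such as "localhost" are not wildcard addresses.
        return False
```

For example, `binds_all_interfaces("0.0.0.0")` is true, while `binds_all_interfaces("127.0.0.1")` is false. This check only inspects a configured address string; it does not confirm which interfaces a running process is actually listening on.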
Users can upgrade to vLLM version 0.8.5 or later, where this vulnerability has been addressed. Instructions for updating vLLM can be found in the vLLM documentation.
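To decide whether an upgrade is needed, the installed version can be compared against the affected range given above (>= 0.6.5, < 0.8.5). The sketch below is one possible check. The helper names are hypothetical, and the parser is an assumption that reads only the numeric release part, ignoring pre-release and post-release suffixes.

```python
import re
from importlib import metadata

AFFECTED_MIN = (0, 6, 5)  # first affected release, inclusive
FIXED = (0, 8, 5)         # first fixed release, exclusive upper bound


def parse_release(version: str) -> tuple:
    """Extract (major, minor, patch) from a version string such as '0.8.4'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_affected(version: str) -> bool:
    """Return True if `version` falls in the affected range."""
    return AFFECTED_MIN <= parse_release(version) < FIXED


def installed_vllm_affected():
    """Return True/False for the installed vllm package, or None if absent."""
    try:
        return is_affected(metadata.version("vllm"))
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_vllm_affected()` in the environment that runs the service reports whether that installation falls in the affected range. Versions outside the range, such as 0.8.5, are reported as not affected.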