vLLM
cpe:2.3:a:vllm:vllm:*:*:*:*:*:*:*
- >= 0.6.5, < 0.8.0
A remote code execution vulnerability has been identified in vLLM, a high-throughput inference engine for large language models. The issue arises when vLLM is configured to use Mooncake: the integration exposes unsafe deserialization over ZMQ/TCP on all network interfaces. An attacker who can reach these sockets can execute arbitrary code on the distributed hosts. The root cause is in the Mooncake Pipe, which by design listens on sockets reachable over the network and deserializes the data it receives without validating it. The vulnerability affects vLLM versions from 0.6.5 up to, but not including, 0.8.0.
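The exposure described above depends on which address a listener binds to. The following is a minimal sketch using Python's standard `socket` module as a stand-in for a ZMQ/TCP endpoint; it is not vLLM or Mooncake code, and the helper name `bind_listener` is hypothetical. Binding to the loopback address keeps the socket unreachable from other hosts, whereas binding to `0.0.0.0` would accept connections on every interface.

```python
import socket


def bind_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Hypothetical helper: open a TCP listener on an explicit interface.

    host="127.0.0.1" restricts the listener to the local machine.
    host="0.0.0.0" would listen on all interfaces, which is the kind of
    exposure the advisory describes.
    port=0 asks the operating system to pick a free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, port))
    sock.listen()
    return sock


listener = bind_listener()
bound_host, bound_port = listener.getsockname()
listener.close()
```

Restricting the bind address reduces exposure but does not make unsafe deserialization safe; any process that can still reach the socket can send data to it.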
Exploitation of this vulnerability allows for remote code execution on affected hosts.
To reproduce this vulnerability, deploy an affected vLLM version (0.6.5 or later, earlier than 0.8.0) with Mooncake integration enabled. The Mooncake Pipe automatically exposes a socket over ZMQ/TCP that any host able to connect to it can reach. When an attacker sends a payload to this socket, the vLLM process deserializes the data with Python's pickle module, which executes any code embedded in the payload on the host.
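Why deserializing untrusted data with pickle amounts to code execution can be sketched with a harmless payload. The example below is illustrative only and is not taken from vLLM: the class `Marker` and the names `NoGlobalsUnpickler` and `restricted_loads` are hypothetical, and the restricted unpickler follows the pattern described in the Python documentation rather than the fix shipped in vLLM 0.8.0.

```python
import io
import pickle


class Marker:
    """Benign stand-in for attacker-controlled data."""

    def __reduce__(self):
        # pickle records "call sorted([3, 1, 2])" in the byte stream.
        # An attacker could name any importable callable here instead.
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Marker())

# Loading the bytes calls the recorded function; the result is the
# function's return value, not a Marker instance.
result = pickle.loads(payload)


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global (class or function)."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Deserialize plain containers and scalars only."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


try:
    restricted_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

# Plain data without globals still round-trips.
plain = restricted_loads(pickle.dumps({"tokens": [1, 2, 3]}))
```

A restricted unpickler narrows what a payload can do, but for data arriving from the network a format that cannot encode callables at all (for example JSON) is generally the safer design choice.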
This vulnerability has been fixed in vLLM version 0.8.0.