vLLM Remote Code Execution Vulnerability in Multi-Node Deployments Using the V0 Engine

Vulnerability

A remote code execution vulnerability exists in vLLM, an inference and serving engine for large language models, in multi-node deployments that use the V0 engine. In this configuration, vLLM uses ZeroMQ for inter-node communication. Secondary vLLM hosts connect to the primary host through a ZeroMQ 'SUB' socket and deserialize the data they receive with 'pickle'. Unpickling untrusted data is unsafe: a crafted payload can execute arbitrary code on the receiving host. The vulnerability therefore acts as an escalation point: if the primary vLLM host is compromised, an attacker can use it to compromise the other hosts in the deployment. An attacker who has not compromised the primary host could also exploit the issue with techniques such as ARP cache poisoning, redirecting a secondary host's traffic to a malicious endpoint that delivers a code-execution payload. The issue affects vLLM versions 0.5.2 and later; the V1 engine is not impacted.
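The reason unpickling is unsafe can be shown with a minimal, benign sketch. This is not vLLM's code: the class `Marker` and the helper `restricted_loads` are hypothetical names introduced for illustration, and the harmless builtin `len` stands in for whatever callable an attacker would choose. The second half follows the "restricting globals" pattern from the Python `pickle` documentation, under the assumption that messages only need plain data types.

```python
import io
import pickle


class Marker:
    """Hypothetical, benign stand-in for a hostile payload."""

    def __reduce__(self):
        # The pickle stream names a callable and its arguments.
        # The receiver calls it during load; here it is the harmless len().
        return (len, ("benign",))


class NoGlobalsUnpickler(pickle.Unpickler):
    """Rejects every global lookup, so only plain data types can load."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


def restricted_loads(data: bytes):
    """Hypothetical helper: unpickle without resolving any callable."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


payload = pickle.dumps(Marker())

# Plain pickle.loads runs the sender-chosen callable and returns its result,
# not a Marker instance.
result = pickle.loads(payload)

# The restricted unpickler refuses the same payload.
try:
    restricted_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

# Plain data (dicts, lists, ints, strings) still round-trips.
plain = restricted_loads(pickle.dumps({"step": 1, "tokens": [1, 2, 3]}))
```

Restricting globals narrows what a payload can reach but is not a complete defense on its own; the point of the sketch is only that the sender, not the receiver, decides which callable runs during `pickle.loads`.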

Impact

Successful exploitation allows remote code execution on any secondary vLLM host that deserializes the malicious payload.

Added: Jun 9, 2025, 7:46 PM
Updated: Jun 9, 2025, 7:46 PM

Vulnerability Rating

Custom Algorithm

spread: 2.6
impact: 7.5
exploitability: 3.8
remediation: 8.3
relevance: 0.0
threat: 3.2
urgency: 2.9
incentive: 1.7

Our algorithm analyzes dozens of metrics to score these 8 key vulnerability categories, which are then combined to calculate the overall risk score.