llama.cpp RPC Backend Remote Code Execution Vulnerability

Vulnerability

A critical remote code execution vulnerability has been identified in the llama.cpp RPC backend, affecting versions prior to b8492. The 'deserialize_tensor()' function fails to perform proper bounds validation when a tensor's buffer field is zero. An unauthenticated attacker can use this gap to read and write arbitrary process memory by sending crafted GRAPH_COMPUTE messages. The attack defeats Address Space Layout Randomization (ASLR) and requires only TCP access to the RPC server port.
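The class of check at issue can be illustrated with a small model. This is a sketch under stated assumptions, not the llama.cpp code or its actual patch: `WireTensor`, `is_safe`, and the `buffers` registry are hypothetical names, and rejecting a zero-buffer tensor that still carries a data pointer is only one possible hardening.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WireTensor:
    """Hypothetical model of a tensor received over the wire."""
    buffer: int  # remote buffer handle; 0 means "no buffer"
    data: int    # address the tensor claims its data lives at
    nbytes: int  # size of the tensor data in bytes


def is_safe(t: WireTensor, buffers: dict[int, tuple[int, int]]) -> bool:
    """Return True only if the tensor's data range lies inside a known buffer.

    `buffers` maps a buffer handle to (base_address, size_in_bytes).
    """
    if t.nbytes < 0:
        return False
    if t.buffer == 0:
        # Assumption: a tensor without a backing buffer must not carry
        # a data pointer. Skipping validation here is the risky pattern.
        return t.data == 0
    region = buffers.get(t.buffer)
    if region is None:
        return False  # unknown handle
    base, size = region
    return base <= t.data and t.data + t.nbytes <= base + size
```

With this model, a tensor that names a known buffer and stays inside it is accepted, while a tensor with `buffer == 0` and an attacker-chosen `data` address is rejected instead of being passed through unchecked.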

Impact

Successful exploitation gives the attacker arbitrary remote code execution on the server where the RPC backend is active. Commands run with the privileges of the server process, which is often root in Docker environments.

Reproduction

The vulnerability can be reproduced by enabling the llama.cpp RPC backend and exposing it over the network. Once the server is running, a client sends GRAPH_COMPUTE messages that include tensors with a buffer value of zero. When 'deserialize_tensor()' processes these tensors, the necessary bounds checks are bypassed, allowing arbitrary memory read and write operations. After the injected command runs, the server may crash during cleanup, but the command has already executed by that point.
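Since the attack needs nothing beyond TCP access to the RPC port, one way to gauge exposure is to test whether that port accepts connections from an untrusted network position. The following is a minimal sketch using Python's standard `socket` module; `is_port_reachable` is a hypothetical helper, and the host and port are values the reader must supply for their own deployment.

```python
import socket


def is_port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection performs the TCP handshake and raises OSError
        # (refused, unreachable, timed out) if it cannot complete.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A `True` result from outside the trusted network suggests the RPC server is reachable by parties who should not have access. The check only opens and closes a connection; it sends no RPC messages.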

Remediation

Users should update to llama.cpp version b8492 or later, where this vulnerability has been patched.
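To check a deployment against the fixed version, the build tag can be compared numerically. This sketch assumes tags take the form `b` followed by digits, as in `b8492`; `build_number` and `is_patched` are hypothetical helpers, and how the tag is obtained from a running installation is left to the reader.

```python
import re

PATCHED_BUILD = 8492  # from the advisory: fixed in b8492


def build_number(tag: str) -> int:
    """Extract the numeric part of a build tag such as 'b8492'."""
    match = re.fullmatch(r"b(\d+)", tag.strip())
    if match is None:
        raise ValueError(f"unrecognized build tag: {tag!r}")
    return int(match.group(1))


def is_patched(tag: str) -> bool:
    """Return True if the build is at or after the patched build."""
    return build_number(tag) >= PATCHED_BUILD
```

For example, `is_patched("b8491")` evaluates to `False` and `is_patched("b8492")` to `True`. Tags in any other format raise `ValueError` rather than being guessed at.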

Added: Apr 1, 2026, 6:48 PM
Updated: Apr 1, 2026, 6:48 PM

Vulnerability Rating

Custom Algorithm
spread: 0.0
impact: 10.0
exploitability: 7.7
remediation: 0.0
relevance: 5.1
threat: 6.4
urgency: 2.9
incentive: 4.2

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.