llama.cpp Heap Buffer Overflow Vulnerability in GGUF File Parsing Bypasses Previous Fix

Vulnerability

A heap buffer overflow has been identified in llama.cpp, in the gguf_init_from_file_impl() function in gguf.cpp. It affects versions through b8145. An integer overflow in a size calculation produces an undersized heap allocation, and the subsequent fread() call then writes more than 528 bytes of attacker-controlled data past the end of that buffer, which may allow arbitrary code execution. The issue bypasses the fix for a similar vulnerability, CVE-2025-53630, because that fix did not cover all relevant areas.
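The bug class described above can be illustrated with a minimal sketch. This is not llama.cpp code: naive_total_size() is a hypothetical helper and the sizes in the usage note are chosen only to show how unsigned arithmetic wraps.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from llama.cpp.
 * Sums per-tensor byte sizes without any overflow check. size_t arithmetic
 * is performed modulo SIZE_MAX + 1, so a large enough set of sizes wraps
 * around to a small total without any error being reported. */
static size_t naive_total_size(const size_t *sizes, size_t n) {
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += sizes[i]; /* may wrap silently */
    }
    return total;
}
```

For example, summing the two sizes SIZE_MAX - 99 and 164 yields 64 rather than a value larger than SIZE_MAX. A buffer allocated from that total would be far smaller than the data a later read expects to place in it.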

Impact

Exploitation causes a heap buffer overflow that writes attacker-controlled data into memory adjacent to the undersized buffer. An attacker may be able to use this to execute arbitrary code with the privileges of the user running the application. If the application runs as root, that could mean a root shell on the system.

Reproduction

The vulnerability can be reproduced by creating a GGUF file that contains two I8 tensors whose sizes are crafted so that the context size calculation overflows, resulting in a memory allocation that is significantly smaller than needed. When this file is loaded with the llama-gguf tool, the program reads more data than the allocated buffer can hold, overwriting adjacent memory and corrupting the heap.
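A loader can reject crafted sizes like these before allocating by checking every addition for overflow. The following is a minimal sketch of that general technique, not the upstream fix; checked_add_size() and total_tensor_bytes() are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: adds b to *acc only if the sum fits in size_t.
 * Returns false and leaves *acc unchanged when the addition would wrap. */
static bool checked_add_size(size_t *acc, size_t b) {
    if (b > SIZE_MAX - *acc) {
        return false;
    }
    *acc += b;
    return true;
}

/* Hypothetical validation step: computes the total number of bytes needed
 * for n tensors. Returns false when the total cannot be represented, so the
 * caller can reject the file instead of allocating a too-small buffer. */
static bool total_tensor_bytes(const size_t *sizes, size_t n, size_t *out) {
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (!checked_add_size(&total, sizes[i])) {
            return false;
        }
    }
    *out = total;
    return true;
}
```

With two sizes such as SIZE_MAX - 99 and 164, total_tensor_bytes() returns false instead of reporting a wrapped total, so no allocation takes place.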

Remediation

Users should update to a llama.cpp build later than b8145, the last version listed as affected.

Added: Mar 12, 2026, 5:23 PM
Updated: Mar 12, 2026, 5:23 PM

Vulnerability Rating

Custom Algorithm

spread: 0.0
impact: 10.0
exploitability: 7.7
remediation: 0.0
relevance: 3.8
threat: 6.4
urgency: 2.9
incentive: 0.0

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.