llama.cpp Integer Overflow Vulnerability Leading to Heap Buffer Overflow and Remote Code Execution
Vulnerability
An integer overflow vulnerability has been identified in the `ggml_nbytes` function of llama.cpp, a C/C++ inference library for large language models. In versions prior to b7824, an attacker can craft a GGUF file whose tensor dimensions bypass memory validation. The manipulated dimensions cause `ggml_nbytes` to return a size far smaller than the tensor actually requires, which leads to a heap-based buffer overflow when the application processes the tensor. The resulting memory corruption can potentially be leveraged for remote code execution.
Impact
When a crafted GGUF file is loaded, exploitation of this vulnerability causes a heap-based buffer overflow. The resulting memory corruption can allow remote code execution.
Reproduction
To reproduce this vulnerability, create a GGUF file containing a tensor of type `GGML_TYPE_F32` whose dimensions include `4398046511105` (which is `2^42 + 1`). When this file is loaded, the size calculation overflows and the application underestimates the memory the tensor requires. The undersized buffer is then overflowed when the tensor is processed.
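The wraparound itself can be illustrated with a simplified model. The sketch below is not the actual `ggml_nbytes` implementation: `naive_tensor_bytes` is a hypothetical helper that multiplies the element size by each dimension in unsigned 64-bit arithmetic, and the second dimension used in the note that follows (`4194304`, which is `2^22`) is an assumption chosen only to make the wrap easy to see.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model, NOT the real ggml_nbytes: multiply the element size by
 * every dimension using unsigned 64-bit arithmetic. Unsigned overflow is
 * well defined in C and silently wraps modulo 2^64. */
static uint64_t naive_tensor_bytes(const int64_t ne[4], uint64_t elem_size) {
    assert(elem_size > 0);
    uint64_t nbytes = elem_size;
    for (int i = 0; i < 4; i++) {
        nbytes *= (uint64_t) ne[i]; /* may wrap without any error */
    }
    return nbytes;
}
```

Under these assumptions, calling `naive_tensor_bytes` with dimensions `{4398046511105, 4194304, 1, 1}` and a 4-byte element size (the size of a 32-bit float) returns `16777216` (16 MiB). The true product is `2^66 + 2^24`, which does not fit in 64 bits, so only the `2^24` remainder survives the wrap.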
Remediation
Users should update to llama.cpp version b7824 or later, in which this vulnerability has been fixed.
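For code that computes tensor sizes from untrusted dimensions, the general defense is to check every multiplication before performing it. The following is a minimal sketch of that technique, not the upstream patch: `checked_tensor_bytes` is a hypothetical helper, and it models the size as a plain product of the element size and four dimensions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes elem_size * ne[0] * ... * ne[3] and returns
 * false instead of wrapping when the result would not fit in 64 bits. */
static bool checked_tensor_bytes(const int64_t ne[4], uint64_t elem_size,
                                 uint64_t *out) {
    assert(out != NULL);
    uint64_t nbytes = elem_size;
    for (int i = 0; i < 4; i++) {
        if (ne[i] < 0) {
            return false; /* negative dimensions are never valid */
        }
        uint64_t dim = (uint64_t) ne[i];
        /* Test against the limit before multiplying, so no wrap can occur. */
        if (dim != 0 && nbytes > UINT64_MAX / dim) {
            return false;
        }
        nbytes *= dim;
    }
    *out = nbytes;
    return true;
}
```

A loader built this way would reject a tensor whose dimensions include `4398046511105` alongside any other dimension large enough to push the byte count past 64 bits, rather than allocating an undersized buffer.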
