Ollama Out-of-Bounds Read Vulnerability Allows Remote Process Memory Leak
Briefly

"Ollama before 0.17.1 contains a heap out-of-bounds read vulnerability in the GGUF model loader. The /api/create endpoint accepts an attacker-supplied GGUF file in which the declared tensor offset and size exceed the file's actual length; during quantization in fs/ggml/gguf.go and server/quantization.go (WriteTo()), the server reads past the allocated heap buffer."
"The problem, at its core, stems from Ollama's use of the unsafe package when creating a model from a GGUF file, specifically in a function named 'WriteTo()', thereby making it possible to execute operations that bypass the memory safety guarantees of the programming language."
"In a hypothetical attack scenario, a bad actor can send a specially crafted GGUF file to an exposed Ollama server with the tensor's shape set to a very large number to trigger the out-of-bounds heap read during model creation using the /api/create endpoint. Successful exploitation of the vulnerability could leak sensitive data from the Ollama process memory."
"This may include environment variables, API keys, system prompts, and concurrent users' conversation data. This data can be exfiltrated by uploading the res"
Ollama versions before 0.17.1 contain a heap out-of-bounds read vulnerability in the GGUF model loader that can be exploited remotely without authentication. The flaw is tracked as CVE-2026-7482, carries a CVSS score of 9.1, and is codenamed Bleeding Llama. The /api/create endpoint accepts attacker-supplied GGUF files whose declared tensor offsets and sizes exceed the actual file length. During quantization, the server reads past an allocated heap buffer, which discloses process memory. The issue stems from the use of Go's unsafe package when creating a model from a GGUF file, specifically in WriteTo(). A crafted GGUF file can trigger the out-of-bounds read and potentially expose environment variables, API keys, system prompts, and other users’ conversation data.
Read at The Hacker News