Docugami Reader MD5 Hash Collision Vulnerability in Llama Index

Vulnerability

A hash collision vulnerability has been identified in the DocugamiReader class of the run-llama/llama_index repository, affecting versions prior to 0.12.28. DocugamiReader uses MD5 hashing to generate IDs for document chunks, so structurally distinct chunks that contain identical text receive the same ID. One chunk can then overwrite another, which causes the loss of semantically or legally important content, disrupts parent-child chunk hierarchies, and leads to inaccurate or hallucinated responses in AI outputs.
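
The collision described above can be illustrated with a minimal sketch. It is not the DocugamiReader source: the `chunk_id` helper, the path strings, and the dictionary store are hypothetical stand-ins, and the sketch assumes only that the ID is an MD5 digest of the chunk text and that chunks are stored by ID.

```python
import hashlib


def chunk_id(text: str) -> str:
    # Hypothetical ID scheme: MD5 over the chunk text only,
    # with no structural information mixed in.
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Two structurally distinct chunks (hypothetical locations) with identical text.
chunks = [
    {"path": "/contract/section[1]/clause[2]", "text": "The term is 12 months."},
    {"path": "/contract/section[4]/clause[1]", "text": "The term is 12 months."},
]

# Storing by ID: the second chunk silently replaces the first.
store = {}
for chunk in chunks:
    store[chunk_id(chunk["text"])] = chunk

surviving_paths = [c["path"] for c in store.values()]
```

In this sketch `store` ends up with a single entry, and only the later chunk's location survives, which is the overwrite behavior the advisory describes.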

Impact

When chunk IDs collide, document chunks with identical text overwrite each other, so important content is lost and parent-child chunk hierarchies are disrupted. In AI outputs built on the affected chunks, this can result in inaccurate or fabricated responses.
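
One way to gauge whether a given document is exposed to this impact is to look for IDs shared by more than one structural location before indexing. The following is a rough sketch under stated assumptions: `text_only_id` and `find_id_collisions` are hypothetical helpers, chunks are modeled as `(path, text)` pairs, and the ID is assumed to be an MD5 digest of the text alone.

```python
import hashlib
from collections import defaultdict


def text_only_id(text: str) -> str:
    # Assumed ID scheme for illustration: MD5 of the chunk text.
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def find_id_collisions(path_text_pairs):
    """Return {id: [paths]} for every ID shared by more than one location."""
    paths_by_id = defaultdict(set)
    for path, text in path_text_pairs:
        paths_by_id[text_only_id(text)].add(path)
    return {
        cid: sorted(paths)
        for cid, paths in paths_by_id.items()
        if len(paths) > 1
    }


# Hypothetical sample: two clauses repeat the same sentence.
sample = [
    ("/doc/sec[1]/p[1]", "Payment is due within 30 days."),
    ("/doc/sec[2]/p[3]", "This agreement is governed by state law."),
    ("/doc/sec[5]/p[2]", "Payment is due within 30 days."),
]
collisions = find_id_collisions(sample)
```

Any non-empty result means at least one chunk would be overwritten under a text-only ID scheme.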

Remediation

Users can upgrade to version 0.3.1 to address this vulnerability.

Added: Jul 10, 2025, 1:42 PM

Updated: Jul 10, 2025, 1:42 PM

Vulnerability Rating

Custom Algorithm

spread: 0.0
impact: 2.5
exploitability: 8.1
remediation: 7.7
relevance: 0.2
threat: 3.2
urgency: 2.9
incentive: 5.8

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.

