run-llama/llama_index
cpe:2.3:a:llamaindex:llamaindex:*:*:*:*:*:*:*
- <= 0.12.22.post1
A vulnerability exists in the ArxivReader class of the run-llama/llama_index repository, in versions up to and including 0.12.22.post1. The filename for each downloaded paper is generated from an MD5 hash that depends on the paper's title rather than its content, so papers with the same title but different content receive the same filename. The later download overwrites the earlier one, and the overwritten paper is never processed for AI model training.
Exploitation of this vulnerability can result in data loss, as papers may be overwritten and not processed for AI model training.
To reproduce this vulnerability, download papers using the ArxivReader class in an affected version of llama_index (0.12.22.post1 or earlier). When two papers with identical titles but different contents are downloaded, both map to the same MD5-based filename, and one paper overwrites the other.
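The overwrite can be illustrated without network access. The following is a minimal sketch, not the ArxivReader source: it assumes the filename is the MD5 hex digest of the title plus a `.pdf` suffix, and the helper names `title_based_filename` and `save_papers`, the sample title, and the byte contents are all hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def title_based_filename(title: str) -> str:
    """Hypothetical stand-in: derive a filename from the MD5 of the title only."""
    return hashlib.md5(title.encode("utf-8")).hexdigest() + ".pdf"


def save_papers(papers, directory: Path) -> list[str]:
    """Write each (title, content) pair to disk and return the filenames present."""
    for title, content in papers:
        # Same title -> same filename, so a later write replaces an earlier one.
        (directory / title_based_filename(title)).write_bytes(content)
    return sorted(p.name for p in directory.iterdir())


# Two distinct papers that happen to share a title (illustrative data).
papers = [
    ("A Survey of Methods", b"contents of the first paper"),
    ("A Survey of Methods", b"contents of the second paper"),
]

with tempfile.TemporaryDirectory() as tmp:
    files_on_disk = save_papers(papers, Path(tmp))
    surviving = (Path(tmp) / files_on_disk[0]).read_bytes()

print(len(papers), "papers saved,", len(files_on_disk), "file(s) on disk")
```

Under these assumptions only one file remains, and it holds the second paper's bytes. Deriving the filename from a value that is unique per paper, such as an identifier or a hash of the content, would avoid the overwrite; whether the upstream fix takes that approach is not stated here.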
Users can upgrade to llama_index version 0.12.28 or later, where this vulnerability has been fixed.