Docarray Prototype Pollution Vulnerability in Web API Component

Vulnerability

A critical prototype pollution vulnerability (class pollution, in Python terms) has been identified in Docarray versions up to and including 0.40.1. The flaw is in the Web API component, in the `__getitem__` method of `torch_dataset.py`. That method resolves dotted attribute paths without sanitizing them, so a remote attacker who controls a path can reach internal class objects and modify their attributes. The direct result is a denial-of-service condition. Depending on the backend code it is combined with, the same flaw could also facilitate other attacks, such as remote code execution or cross-site scripting.
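The mechanism can be illustrated with a minimal sketch. It does not reproduce Docarray's code: `apply_preprocessing`, `Thesis`, and `default_language` are hypothetical names, and the helper assumes only what the description states, namely that a dotted path is walked with attribute lookups and the final attribute is reassigned.

```python
from functools import reduce


def apply_preprocessing(root, dotted_path, func):
    """Hypothetical stand-in for an unsanitized dotted-path update.

    Walks every segment except the last with getattr, then replaces the
    final attribute with func(current_value).
    """
    *parents, leaf = dotted_path.split(".")
    target = reduce(getattr, parents, root)
    setattr(target, leaf, func(getattr(target, leaf)))


class Thesis:
    # Class-level attribute shared by every instance (hypothetical model).
    default_language = "en"

    def __init__(self, title):
        self.title = title


first = Thesis("Graph kernels")
second = Thesis("Sparse attention")

# Intended use: the path names an ordinary field on one object.
apply_preprocessing(first, "title", str.upper)

# Abuse: the path climbs through __class__ and rewrites class state.
apply_preprocessing(first, "__class__.default_language", lambda _: "polluted")

print(first.title)              # only this instance's field changed
print(second.default_language)  # an unrelated instance sees the polluted value
```

Nothing in the helper distinguishes a data field from `__class__`, which is why a path supplied by a client has to be validated before it is resolved.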

Impact

Successful exploitation pollutes class-level state that is shared across the application, which can disrupt normal functionality. According to the vulnerability disclosure, in Docarray this can be used to cause a denial of service by corrupting how a FastAPI application handles its data models.

Reproduction

To reproduce the vulnerability, send a POST request to the `/process_thesis/` endpoint of a FastAPI application that uses Docarray. The request must include a `preprocessing_paths` parameter that names an internal class attribute, such as `thesis.__class__.__class__.__subclasscheck__`, together with a preprocessing function for the application to apply to it. When the request is processed, `__subclasscheck__` is overwritten with a non-callable value, so later subclass checks on the server fail and the service is denied.
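The effect of that overwrite can be shown without FastAPI or Docarray. The sketch below is a stand-in built on assumptions: `DocMeta` and `Thesis` are hypothetical classes playing the role of a model class and its metaclass, and the string stands for whatever non-callable value ends up being written.

```python
class DocMeta(type):
    """Hypothetical metaclass, standing in for the model's real metaclass."""


class Thesis(metaclass=DocMeta):
    """Hypothetical document model."""


thesis = Thesis()

# Before pollution, subclass checks against the model work normally.
before = issubclass(int, Thesis)

# thesis.__class__ is Thesis; thesis.__class__.__class__ is DocMeta.
# Writing a non-callable there mirrors the path used in the reproduction.
setattr(thesis.__class__.__class__, "__subclasscheck__", "not callable")

try:
    issubclass(int, Thesis)
    error = None
except TypeError as exc:
    # Python looks up __subclasscheck__ on the metaclass and tries to call it.
    error = exc

print(before)
print(type(error).__name__)
```

Because the class object lives for the lifetime of the process, a single request of this kind would keep affecting later requests until the server is restarted.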

Remediation

Update Docarray to a version that fixes this vulnerability. In addition, add checks to the `__getitem__` method of the `MultiModalDataset` class so that preprocessing paths cannot reach internal attributes.
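One possible shape for such a check is sketched below. It is not the upstream fix: `validate_field_path`, `is_safe`, and `ALLOWED_FIELDS` are hypothetical, and the rule used here (the first segment must be an expected field, and every segment must be a plain identifier that does not start with an underscore) is one reasonable policy among several.

```python
ALLOWED_FIELDS = frozenset({"thesis"})  # hypothetical top-level fields


def validate_field_path(path, allowed_roots=ALLOWED_FIELDS):
    """Return the path segments, or raise ValueError if the path is unsafe."""
    segments = path.split(".")
    if segments[0] not in allowed_roots:
        raise ValueError(f"unexpected root field: {segments[0]!r}")
    for segment in segments:
        # Rejects empty segments, non-identifiers, and private or dunder names.
        if not segment.isidentifier() or segment.startswith("_"):
            raise ValueError(f"forbidden path segment: {segment!r}")
    return segments


def is_safe(path):
    """Convenience wrapper that reports validity as a boolean."""
    try:
        validate_field_path(path)
        return True
    except ValueError:
        return False


print(is_safe("thesis.title"))
print(is_safe("thesis.__class__.__class__.__subclasscheck__"))
```

Validating against an allowlist of declared field names, rather than only blocking known-dangerous names, tends to be the more robust choice, since it fails closed when new internal attributes appear.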

Added: Sep 1, 2025, 7:22 PM
Updated: Sep 1, 2025, 7:22 PM

Vulnerability Rating

Custom Algorithm

spread: 0.0
impact: 5.0
exploitability: 6.6
remediation: 0.0
relevance: 0.0
threat: 6.4
urgency: 2.9
incentive: 1.7

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.