ray
cpe:2.3:a:ray_project:ray:*:*:*:*:*:*:*:*
- >= 2.49.0, < 2.55.0
A remote code execution vulnerability exists in Ray versions from 2.49.0 up to, but not including, 2.55.0. The issue arises in the Ray Data component, which registers custom Arrow extension types globally in PyArrow. When PyArrow processes a Parquet file whose schema contains one of these extension types, it invokes the type's `__arrow_ext_deserialize__` method on the field's metadata bytes. Ray's implementation of that method passes the bytes directly to `cloudpickle.loads()`, so a crafted file causes arbitrary code execution during schema parsing, before any row data is accessed. Any process that uses Ray Data, and therefore has these extension types registered, is affected when it reads a Parquet file from an untrusted source.
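The root cause described above is that metadata bytes taken from the file are handed to a pickle-based loader. As a contrast, the following is a minimal sketch of a data-only metadata format, assuming the metadata only needs to carry a tensor shape. The names `serialize_shape`, `deserialize_shape`, and `MAX_DIMS` are hypothetical and chosen for illustration; this is not Ray's actual patch.

```python
import json

# Hypothetical upper bound on the number of dimensions (an assumption for this sketch).
MAX_DIMS = 32


def serialize_shape(shape) -> bytes:
    """Encode a tensor shape as JSON bytes, a format that cannot carry code."""
    return json.dumps(list(shape)).encode("utf-8")


def deserialize_shape(serialized: bytes) -> tuple:
    """Decode and validate shape metadata without unpickling anything."""
    try:
        value = json.loads(serialized.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("extension metadata is not valid JSON") from exc

    if not isinstance(value, list) or len(value) > MAX_DIMS:
        raise ValueError("extension metadata is not a list of dimensions")

    for dim in value:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(dim, bool) or not isinstance(dim, int) or dim < 0:
            raise ValueError("dimensions must be non-negative integers")

    return tuple(value)
```

With a format like this, malformed or malicious metadata can at worst raise a `ValueError`; parsing it never invokes attacker-chosen callables, which is the property `cloudpickle.loads()` lacks.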
Successful exploitation results in arbitrary code execution with the privileges of the user running the Ray worker process. This could lead to full compromise of the server.
To reproduce this vulnerability, create a Parquet file that includes a column with one of the vulnerable Ray Data Arrow extension types: `ray.data.arrow_tensor`, `ray.data.arrow_tensor_v2`, or `ray.data.arrow_variable_shaped_tensor`, and place a crafted `cloudpickle` payload in that field's extension metadata. Then use a Ray Data pipeline to read the Parquet file. While parsing the schema, PyArrow calls `__arrow_ext_deserialize__`, which unpickles the metadata and executes the payload before any row data is processed.
Users can upgrade to Ray version 2.55.0 or later, where this vulnerability has been patched.
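To check whether a given installation falls inside the affected range listed in this entry, a small version comparison can help. The sketch below uses only the standard library; the helper names `parse_release`, `is_affected`, and `installed_ray_is_affected` are hypothetical. It compares only the numeric `major.minor.patch` prefix, so pre-release or development builds are treated like the release they precede, which may not match how those builds were actually patched.

```python
import re
from importlib import metadata

# Affected range as listed in this entry: >= 2.49.0, < 2.55.0
AFFECTED_MIN = (2, 49, 0)
FIXED = (2, 55, 0)


def parse_release(version: str) -> tuple:
    """Return (major, minor, patch); suffixes such as 'rc1' or '.dev0' are ignored."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_affected(version: str) -> bool:
    """True if the version lies in the affected range [2.49.0, 2.55.0)."""
    return AFFECTED_MIN <= parse_release(version) < FIXED


def installed_ray_is_affected():
    """Check the installed 'ray' distribution; returns None if Ray is not installed."""
    try:
        return is_affected(metadata.version("ray"))
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_ray_is_affected()` in the environment that runs the Ray Data pipeline indicates whether an upgrade is needed there; it says nothing about other environments or container images in the same cluster.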