Dask HLL Handler Resource Consumption Vulnerability

Vulnerability

A resource consumption vulnerability has been identified in Dask versions through 3.0, specifically within the 'nunique_approx' function of the 'dask/dataframe/hyperloglog.py' file. This issue arises in the HyperLogLog (HLL) Handler component, where the handling of hash values leads to increased resource usage. The vulnerability can be exploited remotely, although it requires a high level of complexity and is considered difficult to execute.

Impact

Exploitation of this vulnerability causes unnecessary resource consumption, which could lead to performance degradation.

Reproduction

The vulnerability can be reproduced by using Dask's HyperLogLog-based approximate cardinality estimation feature. This involves hashing rows with 'pd.util.hash_pandas_object', which is deterministic but not designed to be collision-resistant. In environments where the hashed data includes attacker-controlled strings, it's possible to create a preselection of keys that concentrate in a specific partition, causing an imbalance that could degrade performance.

Remediation

A pull request addressing this vulnerability is pending acceptance.

Added: Jun 3, 2026, 2:18 AM
Updated: Jun 3, 2026, 2:18 AM

Vulnerability Rating

Custom Algorithm
spread
5.7
impact
2.5
exploitability
8.4
remediation
0.0
relevance
9.9
threat
6.4
urgency
2.9
incentive
4.2

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.