dask
cpe:2.3:a:anaconda:dask:*:*:*:*:python:*:*
- <= 3.0
A resource consumption vulnerability has been identified in Dask versions up to and including 3.0, in the 'nunique_approx' function of 'dask/dataframe/hyperloglog.py'. The issue lies in the HyperLogLog (HLL) Handler component, where the way hash values are handled leads to increased resource usage. The attack can be launched remotely, but its complexity is high and exploitation is considered difficult.
Exploitation of this vulnerability causes unnecessary resource consumption, which could lead to performance degradation.
The vulnerability can be reproduced through Dask's HyperLogLog-based approximate cardinality estimation feature. This feature hashes rows with 'pd.util.hash_pandas_object', which is deterministic but not designed to be collision-resistant. Where the hashed data includes attacker-controlled strings, an attacker can preselect keys that concentrate in a specific partition, creating an imbalance that could degrade performance.
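The key preselection described above can be illustrated with a minimal sketch. It relies only on 'pd.util.hash_pandas_object' being deterministic and unkeyed from the attacker's point of view. The helper 'bucket_of', the choice of the low 8 hash bits as the bucket index, and the 'key-N' candidate strings are assumptions made for illustration; they are not taken from Dask's code and may not match how 'nunique_approx' actually derives its buckets.

```python
import numpy as np
import pandas as pd


def bucket_of(values, b=8):
    """Map each value to a bucket using the low b bits of its pandas hash.

    Hypothetical helper: the low-bits bucket rule is an illustrative
    assumption, not necessarily the scheme used inside Dask.
    """
    hashes = pd.util.hash_pandas_object(pd.Series(values), index=False)
    mask = np.uint64((1 << b) - 1)
    return (hashes.to_numpy() & mask).astype(np.int64)


# Attacker-side step: hash many candidate strings offline and keep only
# those that land in one chosen bucket.
candidates = [f"key-{i}" for i in range(20000)]
target_bucket = 0
buckets = bucket_of(candidates)
selected = [c for c, k in zip(candidates, buckets) if k == target_bucket]

# Because the hash is deterministic, the selected keys land in the same
# bucket again when they are hashed later on the victim's side.
rehashed = bucket_of(selected)
print(len(selected), "keys selected; distinct buckets hit:", set(rehashed.tolist()))
```

The sketch only shows that bucket placement is predictable to anyone who can run the same hash function. It does not measure the resulting resource consumption in Dask.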
A pull request addressing this vulnerability is pending acceptance.