ml-engineering Project Insecure Deserialization Vulnerability in PyTorch Checkpoint Processing Script
Vulnerability
An insecure deserialization vulnerability has been identified in the torch-checkpoint-shrink.py script of the ml-engineering project, at commit 0099885db36a8f06556efe1faf552518852cb1e0. The script calls torch.load() to read PyTorch checkpoint files (.pt) without passing weights_only=True, so the files are unpickled with Python's pickle module and may instantiate arbitrary Python objects. An attacker who can get a victim to process a maliciously crafted checkpoint file, for example one downloaded from a remote source, can achieve arbitrary code execution with the privileges of the user running the script.
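The mechanism can be illustrated with the standard library alone. The following is a minimal, harmless sketch, not the script's code: the Payload class is hypothetical, and eval on a constant arithmetic expression stands in for whatever callable an attacker would choose. It shows that unpickling calls the callable named in the pickle stream, which is why loading an untrusted pickle-based file amounts to running untrusted code.

```python
import pickle


class Payload:
    """Hypothetical object whose pickled form names a callable to run on load."""

    def __reduce__(self):
        # pickle stores the pair (callable, args). At load time the unpickler
        # calls callable(*args) and uses the return value as the "object".
        # A harmless expression stands in for attacker-controlled code here.
        return (eval, ("6 * 7",))


# The byte string plays the role of a crafted checkpoint file.
blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload instance; it runs eval("6 * 7").
result = pickle.loads(blob)
print(type(result).__name__, result)
```

The loaded value is whatever the callable returned rather than a Payload instance, which shows that the call happened during deserialization and not in any later step.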
Impact
Exploitation of this vulnerability allows for arbitrary code execution in the context of the user running the script.
Reproduction
To reproduce this vulnerability, run torch-checkpoint-shrink.py at the affected commit, where the torch.load() call omits weights_only=True, against a checkpoint directory. The script processes the .pt files in that directory and deserializes whatever objects they contain. If one of those files is a maliciously crafted checkpoint, the code embedded in it executes as soon as the file is loaded.
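The restriction that weights_only=True is meant to provide can be sketched as an allow-list unpickler built on the standard library. This is an analogy under stated assumptions, not PyTorch's implementation: ALLOWED_GLOBALS, AllowListUnpickler, safe_loads, and Payload are hypothetical names, and the allow-list contains a single entry for brevity. In the script itself, the corresponding change would be to pass weights_only=True to torch.load().

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allow-list: the only (module, name) globals a pickle may reference.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowListUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not explicitly allow-listed."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    """Deserialize data, resolving only allow-listed globals."""
    return AllowListUnpickler(io.BytesIO(data)).load()


class Payload:
    """Hypothetical stand-in for a malicious object (harmless expression)."""

    def __reduce__(self):
        return (eval, ("6 * 7",))


benign = pickle.dumps(OrderedDict(weights=[0.1, 0.2]))
malicious = pickle.dumps(Payload())

# The plain container of numbers loads normally.
loaded = safe_loads(benign)

# The crafted pickle is rejected before eval can be resolved or called.
try:
    safe_loads(malicious)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(loaded, blocked)
```

Rejecting the lookup in find_class stops the payload before any callable is invoked, which is safer than inspecting the object after it has been loaded.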
