XU-YIJIE grpo-flat Deserialization Vulnerability in torch.load Function Allows Arbitrary Code Execution

Vulnerability

A deserialization vulnerability has been identified in XU-YIJIE grpo-flat, in the main function of grpo_vanilla.py. The function calls torch.load without the weights_only=True parameter, so the loaded file is deserialized with Python's pickle mechanism, and code embedded in a crafted pickle payload runs as the file is loaded. Exploitation requires local access.
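The fix implied by the description above is to pass weights_only=True to torch.load, which restricts what the unpickler is allowed to construct. The sketch below illustrates that general idea with the standard library only; it is not PyTorch's implementation. The names RestrictedUnpickler, safe_loads, ALLOWED, and Payload are hypothetical, and os.getcwd stands in for an attacker-chosen callable.

```python
import io
import os
import pickle

# Hypothetical allow-list for this sketch: the only globals the loader may resolve.
ALLOWED = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global not on the allow-list."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data):
    """Deserialize bytes with the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Payload:
    """Harmless stand-in for a malicious object."""

    def __reduce__(self):
        # Asks pickle to call os.getcwd() at load time.
        return (os.getcwd, ())


benign = pickle.dumps({"step": 100, "lr": 1e-4})
hostile = pickle.dumps(Payload())

# Plain containers and primitives need no global lookup, so they load.
loaded = safe_loads(benign)

# The hostile payload needs a global lookup, which the allow-list rejects.
try:
    safe_loads(hostile)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Plain dictionaries and numbers load normally, while any payload that depends on resolving a callable is rejected before it can run.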

Impact

Successful exploitation results in arbitrary code execution on the system running the application, with the privileges of the process that loads the file.

Reproduction

To reproduce this vulnerability, craft a malicious 'training_state.pt' file with Python code embedded in its pickle data, and use it to replace the legitimate 'training_state.pt' file in the directory specified by 'model_name_or_path'. When the code resumes the training state and calls 'torch.load' on 'training_state.pt', the embedded code is executed. For example, the pickle file can be crafted so that loading it runs a command or script that accesses sensitive information or system resources.
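The mechanism behind such a crafted file can be shown with a harmless sketch that uses only the standard pickle module, not the project's code or torch itself. The Payload class and the dictionary keys are hypothetical, and os.getcwd is a benign stand-in for whatever callable an attacker would choose.

```python
import os
import pickle


class Payload:
    """Harmless stand-in for an object planted in a training-state file."""

    def __reduce__(self):
        # pickle records "call os.getcwd()" and performs the call during loading.
        return (os.getcwd, ())


# Serialize a state-like dictionary that carries the payload.
blob = pickle.dumps({"step": 100, "optimizer": Payload()})

# Loading alone triggers the call; no method on the result has to be invoked.
state = pickle.loads(blob)
```

After loading, state["optimizer"] holds the return value of os.getcwd() rather than a Payload instance, which shows that the callable ran during deserialization.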

Added: Jun 9, 2025, 7:46 PM
Updated: Jun 9, 2025, 7:46 PM

Vulnerability Rating

Custom Algorithm

Category        Score
spread          0.0
impact          10.0
exploitability  4.6
remediation     0.0
relevance       0.0
threat          6.4
urgency         2.9
incentive       1.7

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.