PLY Remote Code Execution Vulnerability via Undocumented `picklefile` Parameter
Vulnerability
A remote code execution vulnerability has been identified in PLY (Python Lex-Yacc) version 3.11 as distributed through PyPI. The `yacc()` function accepts an undocumented `picklefile` parameter naming a `.pkl` file, which is deserialized with `pickle.load()` without any validation. Because unpickling can invoke arbitrary callables named in the pickle stream, an attacker who controls that file can execute arbitrary code when the parser is initialized. The `picklefile` parameter is not mentioned in the official documentation or the GitHub repository, yet it is active in the PyPI release, so users may be unaware that this code path exists. This makes it usable as a stealthy backdoor and persistence mechanism, especially in environments where parser tables are cached, shared, or loaded from user-controlled paths.
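The mechanism behind this can be illustrated with the standard library alone. The following is a minimal sketch, not PLY code: the `EvalPayload` class is a hypothetical name introduced for illustration, and a harmless arithmetic expression stands in for whatever an attacker would actually run.

```python
import pickle


class EvalPayload:
    """Hypothetical payload class: tells pickle how to 'rebuild' it."""

    def __reduce__(self):
        # pickle records "call eval('6 * 7')" in the stream and replays
        # that call when the stream is loaded.
        return (eval, ("6 * 7",))


blob = pickle.dumps(EvalPayload())

# Loading does not return an EvalPayload instance; it returns whatever
# the recorded call produced, proving the expression was evaluated.
result = pickle.loads(blob)
print(result)  # 42
```

Any importable callable could take the place of `eval` here, which is why the Python documentation warns against unpickling data from untrusted sources.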
Impact
Successful exploitation results in arbitrary code execution on the host machine. The code runs during parser initialization, before any parsing logic is applied. An attacker could use this to plant persistent backdoors, particularly in CI/CD pipelines and in environments that cache or share parser tables.
Reproduction
To reproduce the issue, craft a pickle file whose payload executes a system command or performs some other observable action when deserialized, such as creating a file or modifying system state. Then invoke `yacc(picklefile='exploit.pkl')`; the embedded code runs during parser initialization, demonstrating arbitrary code execution.
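A benign version of these steps might look like the sketch below. It deliberately avoids importing PLY: `load_like_yacc` is a hypothetical stand-in for the unvalidated `pickle.load()` call described above, and `MarkerPayload`, `write_payload`, and the `pwned` marker directory are names invented for this illustration. The payload only creates an empty directory inside a temporary folder.

```python
import os
import pickle
import tempfile


class MarkerPayload:
    """Hypothetical payload: unpickling it calls os.mkdir(marker)."""

    def __init__(self, marker):
        self.marker = marker

    def __reduce__(self):
        # Harmless side effect standing in for an attacker's command.
        return (os.mkdir, (self.marker,))


def write_payload(pkl_path, marker):
    """Write the crafted pickle file to disk."""
    with open(pkl_path, "wb") as f:
        pickle.dump(MarkerPayload(marker), f)


def load_like_yacc(pkl_path):
    """Stand-in for the loader: open the file and unpickle it unchecked."""
    with open(pkl_path, "rb") as f:
        return pickle.load(f)


workdir = tempfile.mkdtemp()
pkl_path = os.path.join(workdir, "exploit.pkl")
marker = os.path.join(workdir, "pwned")

write_payload(pkl_path, marker)
existed_before = os.path.exists(marker)  # payload written, nothing run yet
load_like_yacc(pkl_path)                 # side effect fires during the load
existed_after = os.path.exists(marker)

print(existed_before, existed_after)
```

Against an affected installation, the same file would be supplied through `yacc(picklefile=...)` in a module that defines a grammar; that call is omitted here so the sketch runs without PLY installed.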
Remediation
Users should not pass untrusted or externally writable files to the `picklefile` parameter and should avoid loading parser tables from user-controlled locations. Treat all pickle files as unsafe input; where possible, regenerate parser tables rather than loading them from disk.
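Where a pickle file has to be inspected at all, one possible way to treat it as unsafe input is to check it with a restricted unpickler, following the "restricting globals" pattern from the Python `pickle` documentation. This is a sketch only, not a PLY feature: `NoGlobalsUnpickler` and `is_plain_data_pickle` are hypothetical names, the sample table data is invented, and the check assumes that a legitimate table file contains nothing but built-in containers, strings, and numbers. It does not change how `yacc()` itself loads the file, so the file could still be swapped between the check and the load, and regenerating the tables remains the safer option.

```python
import io
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses every attempt to import a class or function."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"forbidden global: {module}.{name}")


def is_plain_data_pickle(data):
    """Return True only if `data` unpickles without referencing any global."""
    try:
        NoGlobalsUnpickler(io.BytesIO(data)).load()
    except Exception:
        # Covers forbidden globals as well as truncated or corrupt streams.
        return False
    return True


class EvalPayload:
    """Hypothetical hostile object used to exercise the check."""

    def __reduce__(self):
        return (eval, ("6 * 7",))


# Invented sample data shaped like simple lookup tables.
plain_blob = pickle.dumps({"states": {0: {"NUMBER": 3}}, "rules": [("expr", 1)]})
hostile_blob = pickle.dumps(EvalPayload())

plain_ok = is_plain_data_pickle(plain_blob)
hostile_ok = is_plain_data_pickle(hostile_blob)
print(plain_ok, hostile_ok)
```

The check rejects the hostile stream before `eval` is ever called, because the restricted `find_class` raises as soon as the stream asks for a global.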
