pdfminer.six Arbitrary Code Execution Vulnerability via Malicious PDF Files
Vulnerability
A vulnerability in pdfminer.six prior to version 20251107 allows arbitrary code execution through the deserialization of a malicious pickle file referenced by a crafted PDF document. The issue arises in the CMapDB._load_data() function, which uses pickle.loads() to deserialize CMap data files. These files are expected to be the ones shipped with the pdfminer.six distribution, but a crafted PDF can specify an alternative directory and filename. When such a PDF is processed, pdfminer.six loads the attacker-chosen file, and unpickling it executes arbitrary code.
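Why unpickling untrusted data amounts to code execution can be shown with a deliberately harmless sketch. It does not use pdfminer.six at all. The class name Payload is hypothetical, and the built-in sorted() stands in for whatever callable an attacker would actually choose. The gzip step is an assumption that mirrors the compressed pickle file described in this advisory.

```python
import gzip
import pickle


class Payload:
    """Hypothetical stand-in for attacker-controlled pickle content."""

    def __reduce__(self):
        # pickle stores "call this callable with these arguments" and
        # performs the call during loading. sorted() is harmless; an
        # attacker would substitute a callable that runs commands.
        return (sorted, ([3, 1, 2],))


# Build a gzip-compressed pickle, as an attacker would prepare offline.
blob = gzip.compress(pickle.dumps(Payload()))

# The victim side: decompress and unpickle. The callable named in
# __reduce__ runs here, before the caller can inspect the result.
result = pickle.loads(gzip.decompress(blob))

print(result)  # [1, 2, 3]: the return value of sorted(), not a Payload
```

The loader never gets a chance to validate the object, because the call happens inside pickle.loads() itself. This is why the file being loaded must come from a trusted location.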
Impact
Successful exploitation results in arbitrary code execution on the system that processes the malicious PDF. The code runs with the permissions of the user or process handling the PDF.
Reproduction
To reproduce this vulnerability, create a PDF file that includes a reference to a malicious CMap entry. This entry should point to a gzip-compressed pickle file containing a malicious payload, placed in a location that the PDF processing environment can read. When the PDF is processed with a vulnerable version of pdfminer.six, the payload is unpickled and the malicious code is executed.
Remediation
Users should update to pdfminer.six version 20251107 or later, where this vulnerability has been fixed.
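A quick way to check whether an environment is still affected is to compare the installed version against the fixed release. The following is a minimal sketch, assuming that pdfminer.six version strings begin with an eight-digit date (as in 20251107). The helper names is_patched and installed_pdfminer_version are hypothetical and not part of pdfminer.six.

```python
import re
from importlib import metadata

# First fixed release named in this advisory.
FIXED_RELEASE = 20251107


def is_patched(version_string):
    """Return True if a date-style version is at or after the fix."""
    match = re.match(r"(\d{8})", version_string)
    if match is None:
        # Unrecognized version format: treat as unpatched to be safe.
        return False
    return int(match.group(1)) >= FIXED_RELEASE


def installed_pdfminer_version():
    """Return the installed pdfminer.six version, or None if absent."""
    try:
        return metadata.version("pdfminer.six")
    except metadata.PackageNotFoundError:
        return None


version = installed_pdfminer_version()
if version is None:
    print("pdfminer.six is not installed")
elif is_patched(version):
    print(f"pdfminer.six {version} includes the fix")
else:
    print(f"pdfminer.six {version} is vulnerable; upgrade to 20251107 or later")
```

Treating an unparseable version as unpatched is a conservative design choice: a false alarm prompts a manual check, while a false "patched" result could leave a vulnerable install in place.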
