pypdf
cpe:2.3:a:pypdf_project:pypdf:*:*:*:*:*:*:*
- < 6.6.0
pypdf versions prior to 6.6.0 contain a performance vulnerability in the non-strict reading mode. When the library encounters a malformed 'startxref' entry, it falls back to rebuilding the cross-reference table, and this rebuild can take an excessively long time. PDF files that contain large amounts of whitespace make the problem worse, because the whitespace slows the search for object references during the rebuild. Strict mode is not affected.
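As an illustration of the affected range above, the following minimal sketch checks whether the installed pypdf release is older than 6.6.0. The helper names (`parse_release`, `is_affected`, `installed_pypdf_is_affected`) are hypothetical and not part of pypdf, and the version parsing is deliberately simplified: it compares only the numeric major, minor, and patch components and ignores pre-release suffixes.

```python
from importlib.metadata import PackageNotFoundError, version

# First release containing the fix, according to this advisory.
FIXED_VERSION = (6, 6, 0)


def parse_release(text):
    """Turn a version string such as '6.5.0' into the tuple (6, 5, 0).

    Simplification: only the leading digits of the first three dot-separated
    components are used, so suffixes such as 'rc1' are ignored.
    """
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_affected(version_text):
    """Return True if the given pypdf version is older than the fixed release."""
    return parse_release(version_text) < FIXED_VERSION


def installed_pypdf_is_affected():
    """Return True/False for the installed pypdf, or None if it is not installed."""
    try:
        return is_affected(version("pypdf"))
    except PackageNotFoundError:
        return None
```

Calling `installed_pypdf_is_affected()` in the target environment gives a quick indication of whether an upgrade is needed. A dedicated version parser would be a safer choice if pre-release builds matter.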
Exploiting this vulnerability causes significant delays when pypdf processes PDF files with malformed 'startxref' entries, particularly files that also contain excessive whitespace. Applications that rely on pypdf for PDF manipulation or analysis can suffer degraded performance as a result.
The vulnerability can be reproduced by reading a PDF file with a malformed 'startxref' entry in non-strict mode. To build such a file, give it an invalid 'startxref' value, such as one that is improperly formatted or whose byte offset does not point to a valid cross-reference section, and then add a large number of whitespace characters. When pypdf processes the file, it cannot use the invalid reference and attempts to rebuild the cross-reference table, which leads to a prolonged runtime.
Users should upgrade to pypdf version 6.6.0 or later, where this vulnerability has been fixed. Those who must stay on an earlier version can avoid the issue by reading PDF files in strict mode, which is enabled by passing the 'strict=True' argument to the PdfReader constructor.
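The strict-mode workaround can be sketched as follows. `PdfReader` and its `strict` argument come from pypdf. The function name `page_count_strict` and the `reader_factory` parameter are hypothetical; the latter exists only so that the sketch can be exercised without pypdf installed.

```python
def page_count_strict(path, reader_factory=None):
    """Open a PDF in strict mode and return its number of pages.

    reader_factory is a hypothetical hook for testing; by default the
    real pypdf.PdfReader class is used.
    """
    if reader_factory is None:
        # Imported lazily so this module can be loaded without pypdf.
        from pypdf import PdfReader
        reader_factory = PdfReader

    # strict=True selects the reading mode that this advisory
    # describes as unaffected.
    reader = reader_factory(path, strict=True)
    return len(reader.pages)
```

Typical usage would be `page_count_strict("input.pdf")`. Strict mode is generally less tolerant of malformed input, so callers should expect some files that load in non-strict mode to raise an exception here instead, and handle that case as appropriate for the application.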