pypdf Startxref Handling Performance Vulnerability

Vulnerability

Status: Patched

pypdf versions prior to 6.6.0 contain a performance vulnerability in the non-strict reading mode. When the library encounters a malformed 'startxref' entry in a PDF file, it attempts to rebuild the cross-reference table, and this rebuild can take an excessively long time. The problem is worse for PDF files that contain large amounts of whitespace, which disrupts the parsing of object references. Strict mode is not affected.

Impact

An attacker can exploit this vulnerability by supplying a PDF file with a malformed 'startxref' entry, particularly one padded with excessive whitespace, which causes significant processing delays. Applications that rely on pypdf to manipulate or analyze such files may suffer degraded performance as a result.

Reproduction

To reproduce the issue, use pypdf in non-strict mode to read a PDF file that has a malformed 'startxref' entry. Such a file can be created by giving it an invalid 'startxref' value, such as one that is improperly formatted or points to a location where no cross-reference table exists, and then adding a large number of whitespace characters. When pypdf processes the file, it cannot resolve the reference and falls back to rebuilding the cross-reference table, which leads to a prolonged runtime.
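To make "malformed 'startxref' entry" concrete, the following is a minimal sketch of a check that inspects the raw bytes of a file and reports whether the last 'startxref' value points at something that looks like a cross-reference section. The function name startxref_looks_valid and the heuristics it uses are assumptions made for illustration; this is not pypdf's own logic, and it only approximates what a real parser would accept.

```python
import re


def startxref_looks_valid(data: bytes) -> bool:
    """Heuristic check (illustrative only, not pypdf's algorithm).

    Returns True if the last 'startxref' keyword is followed by a decimal
    byte offset, and that offset lands on either an 'xref' keyword
    (classic table) or an 'N G obj' header (cross-reference stream).
    """
    idx = data.rfind(b"startxref")
    if idx == -1:
        return False  # no startxref keyword at all

    # The keyword must be followed by whitespace and a decimal offset.
    match = re.match(rb"startxref\s+(\d+)", data[idx:idx + 64])
    if match is None:
        return False  # improperly formatted value

    offset = int(match.group(1))
    if offset >= len(data):
        return False  # points past the end of the file

    target = data[offset:offset + 32]
    if target.startswith(b"xref"):
        return True
    return re.match(rb"\d+\s+\d+\s+obj", target) is not None
```

A file for which this returns False is the kind of input that, according to the description above, pushes non-strict reading into the table rebuild path. Such a check could be used to flag suspicious files before parsing them, although it should not be treated as a complete validator.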

Remediation

Upgrade to pypdf version 6.6.0 or later, where this vulnerability is fixed. Users who must stay on an earlier version can avoid the issue by reading PDF files in strict mode, which is done by passing the 'strict=True' argument to the PdfReader class.
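The remediation advice can be sketched in code as follows. The helper names parse_version, is_patched and open_reader are hypothetical, and the version parsing is deliberately simplified (it ignores pre-release suffixes such as "rc1"). Only PdfReader and its 'strict' argument come from pypdf itself.

```python
from importlib import metadata

# First release that contains the fix, per the advisory text.
FIXED_VERSION = (6, 6, 0)


def parse_version(version: str) -> tuple:
    """Turn 'X.Y.Z' into a tuple of ints (simplified; suffixes are ignored)."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_patched(version: str) -> bool:
    """True if the given pypdf version is 6.6.0 or later."""
    return parse_version(version) >= FIXED_VERSION


def open_reader(path: str):
    """Open a PDF, falling back to strict mode on unpatched pypdf versions."""
    from pypdf import PdfReader  # imported lazily so the helpers above work without pypdf

    installed = metadata.version("pypdf")
    # Strict mode is described as unaffected, so use it when the fix is missing.
    return PdfReader(path, strict=not is_patched(installed))
```

Strict mode is less tolerant of damaged files, so callers on older versions should expect some malformed PDFs to be rejected rather than repaired.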

Added: Jan 10, 2026, 5:18 AM

Updated: Jan 10, 2026, 5:18 AM

Vulnerability Rating

Custom Algorithm

spread: 5.4
impact: 2.5
exploitability: 5.7
remediation: 8.3
relevance: 2.0
threat: 4.8
urgency: 2.9
incentive: 1.7

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.

Affected Products

pypdf
