Hugging Face Transformers Regular Expression Denial-of-Service Vulnerability in GPT-NeoX-Japanese Model

Vulnerability

A Regular Expression Denial-of-Service (ReDoS) vulnerability exists in version 4.48.1 of the Hugging Face Transformers library, in the tokenizer for the GPT-NeoX-Japanese model. The issue is in the SubWordJapaneseTokenizer class, which applies a regular expression that exhibits exponential time complexity on certain inputs. A specially crafted string forces the regex engine into excessive backtracking, which drives up CPU usage and can make the application unavailable, creating a Denial-of-Service scenario.
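
To illustrate what excessive backtracking looks like in general, the following is a minimal sketch using a textbook nested-quantifier pattern. The pattern and the helper name are assumptions chosen for illustration; this is not the regular expression used by SubWordJapaneseTokenizer.

```python
import re
import time

# Textbook example of a backtracking-prone pattern (NOT the library's regex):
# the nested quantifiers allow many ways to split a run of 'a' characters.
NESTED = re.compile(r"^(a+)+$")


def time_failed_match(n: int) -> float:
    """Time a match attempt on n 'a' characters followed by a non-matching 'b'."""
    text = "a" * n + "b"  # the trailing 'b' forces the engine to try every split
    start = time.perf_counter()
    NESTED.match(text)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Keep n small: each extra character roughly doubles the work.
    for n in (8, 12, 16, 20):
        print(n, f"{time_failed_match(n):.4f}s")
```

Each additional character roughly doubles the number of paths the engine explores before it can report a failed match, which is the growth pattern this advisory describes.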

Impact

Exploitation of this vulnerability causes high CPU usage and can lead to application downtime, creating a Denial-of-Service condition.

Reproduction

To reproduce the vulnerability, load the GPTNeoXJapaneseTokenizer from the Transformers library and process an input string made of the pattern '111111,' repeated many times, followed by '千ドル'. As the number of repetitions increases, the execution time grows exponentially, demonstrating the vulnerability's impact on performance.
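
The steps above can be sketched as a small timing harness. The helper names are hypothetical, and the harness takes the function to measure as a parameter because this advisory does not state which tokenizer method reaches the affected regex. Run it only against a test environment.

```python
import time
from typing import Callable, List, Tuple


def build_payload(repeats: int) -> str:
    """Return '111111,' repeated `repeats` times, followed by '千ドル'."""
    return "111111," * repeats + "千ドル"


def measure_growth(
    process: Callable[[str], object], repeat_counts: List[int]
) -> List[Tuple[int, float]]:
    """Time `process` on payloads of increasing size; returns (repeats, seconds)."""
    timings = []
    for repeats in repeat_counts:
        payload = build_payload(repeats)
        start = time.perf_counter()
        process(payload)
        timings.append((repeats, time.perf_counter() - start))
    return timings


# Hypothetical usage against an affected install. The checkpoint name is an
# assumption, and whether `tokenize` reaches the affected regex may depend on
# tokenizer settings:
#
#   from transformers import GPTNeoXJapaneseTokenizer
#   tok = GPTNeoXJapaneseTokenizer.from_pretrained("abeja/gpt-neox-japanese-2.7b")
#   for repeats, seconds in measure_growth(tok.tokenize, [5, 10, 15, 20]):
#       print(repeats, f"{seconds:.3f}s")
```

Start with small repeat counts and increase them gradually, since on an affected version each step may take far longer than the previous one.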

Remediation

Users can update to Hugging Face Transformers version 4.50.0 or later, where this vulnerability has been fixed.
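
A quick way to check whether an environment already has the fix is to compare the installed version against 4.50.0. The sketch below uses only the standard library; the helper names are hypothetical, and the simple parser ignores pre-release suffixes such as ".dev0", so treat its result for development builds as approximate.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

FIXED_IN = (4, 50, 0)  # first fixed release according to this advisory


def parse_release(version_string: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a version string such as '4.48.1'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError(f"unrecognized version: {version_string!r}")
    major, minor, patch = (int(part or 0) for part in match.groups())
    return (major, minor, patch)


def is_patched(version_string: str) -> bool:
    """Return True if the given version is 4.50.0 or later."""
    return parse_release(version_string) >= FIXED_IN


def installed_transformers_is_patched() -> Optional[bool]:
    """Check the installed `transformers` package; None if it is not installed."""
    try:
        return is_patched(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return None
```

If the check returns False, upgrading with a package manager (for example, pip install --upgrade "transformers>=4.50.0") brings the environment to a fixed release.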

Added: Jun 9, 2025, 7:46 PM
Updated: Jun 9, 2025, 7:46 PM

Vulnerability Rating

Custom Algorithm

spread: 6.6
impact: 0.6
exploitability: 5.8
remediation: 7.7
relevance: 0.0
threat: 6.4
urgency: 2.9
incentive: 1.7

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.