LangChain HTMLHeaderTextSplitter SSRF Vulnerability in Text Splitter Component

Vulnerability

A server-side request forgery (SSRF) vulnerability affects the 'langchain-text-splitters' package of the LangChain framework in versions prior to 1.1.2. The 'HTMLHeaderTextSplitter.split_text_from_url()' method validates the initial URL but then fetches it with redirects enabled, and the redirect targets are not revalidated. An attacker who controls the URL can therefore pass validation with an external address and then redirect the request to internal or cloud metadata endpoints, bypassing the SSRF protections. If the application returns the fetched Document contents to the requester, this can lead to data exfiltration.

Impact

Exploitation of this vulnerability could allow an attacker to access internal endpoints or cloud metadata services, potentially leading to unauthorized data exposure. This is particularly concerning for applications that return Document contents to the requester, as sensitive data from internal sources could be leaked.

Reproduction

To reproduce this vulnerability, pass an attacker-controlled URL to the 'split_text_from_url()' method of the 'HTMLHeaderTextSplitter' class. The URL itself must pass the 'validate_safe_url()' check. When the URL is then fetched, 'requests.get()' follows any redirects returned by the attacker's server, including redirects to internal endpoints, because redirect targets are not revalidated.
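
The gap between "validated once" and "where the request ends up" can be illustrated without any network access. The sketch below is not the LangChain code: 'REDIRECTS', 'is_public_ip_url()' and 'final_destination()' are hypothetical stand-ins, the redirect table simulates an attacker-controlled server, and the toy check only understands IP-literal hosts.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical redirect table standing in for an attacker-controlled server:
# a public address that answers with a redirect to a cloud metadata endpoint.
REDIRECTS = {
    "http://93.184.216.34/page": "http://169.254.169.254/latest/meta-data/",
}


def is_public_ip_url(url: str) -> bool:
    """Toy check (assumption): accept only http(s) URLs whose host is a public IP literal."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        addr = ipaddress.ip_address(parsed.hostname)
    except ValueError:
        return False  # hostnames are out of scope for this toy check
    return addr.is_global


def final_destination(url: str, max_hops: int = 5) -> str:
    """Walk the simulated redirects, as a client with redirects enabled would."""
    for _ in range(max_hops):
        if url not in REDIRECTS:
            return url
        url = REDIRECTS[url]
    raise RuntimeError("too many redirects")


start = "http://93.184.216.34/page"
end = final_destination(start)

print("initial URL passes the check:", is_public_ip_url(start))
print("request actually ends at:", end)
print("final URL passes the check:", is_public_ip_url(end))
```

The initial URL passes the check while the final destination does not, which is the condition the advisory describes: validation that runs only before the first request says nothing about later hops.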

Remediation

Users are advised to update to 'langchain-text-splitters' version 1.1.2 or later. The fixed version requires 'langchain-core' version 1.2.31 or later. Additionally, 'split_text_from_url()' has been deprecated; users should manually fetch HTML content and pass it to the 'split_text()' method.
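
One way to follow the "fetch manually, then call 'split_text()'" advice is sketched below. It is a minimal sketch under stated assumptions, not the patched library code: 'is_safe_url()', 'http_get_no_redirect()', 'fetch_html_checked()' and 'split_remote_html()' are hypothetical helpers, the header tuples are example values, and the safety check is deliberately simple (for instance, it does not defend against DNS rebinding between the check and the request). The idea is to disable automatic redirects and validate every hop before requesting it.

```python
import ipaddress
import socket
from urllib.parse import urljoin, urlparse


def is_safe_url(url: str) -> bool:
    """Hypothetical check: http(s) only, and every resolved address is public."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, parsed.port or 80)
    except socket.gaierror:
        return False
    try:
        # info[4][0] is the address string; drop any IPv6 scope id ("%eth0").
        addrs = [ipaddress.ip_address(info[4][0].split("%")[0]) for info in infos]
    except ValueError:
        return False
    return bool(addrs) and all(addr.is_global for addr in addrs)


def http_get_no_redirect(url: str):
    """Fetch one URL without following redirects; returns (status, location, body)."""
    import requests  # third-party; imported lazily so the helpers above stay stdlib-only

    resp = requests.get(url, allow_redirects=False, timeout=10)
    return resp.status_code, resp.headers.get("Location"), resp.text


def fetch_html_checked(url: str, fetch_one=http_get_no_redirect, max_hops: int = 5) -> str:
    """Follow redirects by hand, validating each hop before it is requested."""
    for _ in range(max_hops + 1):
        if not is_safe_url(url):
            raise ValueError(f"blocked URL: {url}")
        status, location, body = fetch_one(url)
        if status in (301, 302, 303, 307, 308) and location:
            url = urljoin(url, location)  # Location may be relative
            continue
        return body
    raise ValueError("too many redirects")


def split_remote_html(url: str):
    """Fetch HTML with per-hop checks, then split the string with split_text()."""
    from langchain_text_splitters import HTMLHeaderTextSplitter

    html = fetch_html_checked(url)
    splitter = HTMLHeaderTextSplitter(
        headers_to_split_on=[("h1", "Header 1"), ("h2", "Header 2")]
    )
    return splitter.split_text(html)
```

Passing 'fetch_one' as a parameter keeps the redirect loop testable without network access; in production the default 'requests'-based fetcher would be used.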

Added: Apr 24, 2026, 9:30 PM
Updated: Apr 24, 2026, 9:30 PM

Vulnerability Rating

Custom Algorithm

spread: 0.0
impact: 2.5
exploitability: 6.0
remediation: 0.0
relevance: 6.6
threat: 1.6
urgency: 2.9
incentive: 0.0

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.