lxml_html_clean <base> Tag Injection Vulnerability Allowing URL Hijacking

Vulnerability

lxml_html_clean versions prior to 0.4.4 do not handle <base> tags in the default Cleaner configuration. When page_structure=True, the Cleaner removes <html>, <head>, and <title> tags, but it leaves <base> tags in place. An attacker can therefore inject a <base> tag that redirects every relative URL on the page to an attacker-controlled domain, enabling phishing, cross-site scripting, or defacement attacks.

Impact

An injected <base> tag hijacks all relative URLs on the page. An attacker could use this to redirect links and form submissions to an attacker-controlled domain, steal credentials or sensitive data, load malicious JavaScript files, or manipulate image and stylesheet references for UI redressing or defacement.
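The effect of an injected <base> tag on relative URLs can be approximated with the standard library's urljoin, which resolves references in the RFC 3986 style that browsers broadly follow. This is an illustrative sketch only; the page URL, attacker domain, and relative references below are hypothetical and do not come from the advisory.

```python
from urllib.parse import urljoin

# Hypothetical values for illustration only.
PAGE_URL = "https://app.example/account/settings"
INJECTED_BASE = "https://attacker.example/"

# Relative references of the kind found in links, scripts, and images.
RELATIVE_REFS = ["login", "static/app.js", "/img/logo.png"]


def resolve_all(base, refs):
    """Resolve each relative reference against the given base URL."""
    return {ref: urljoin(base, ref) for ref in refs}


# Without a <base> tag, references resolve against the page's own URL.
without_base = resolve_all(PAGE_URL, RELATIVE_REFS)

# With an injected <base href>, the same references resolve against
# the attacker's domain instead.
with_base = resolve_all(INJECTED_BASE, RELATIVE_REFS)

for ref in RELATIVE_REFS:
    print(f"{ref!r}: {without_base[ref]} -> {with_base[ref]}")
```

Every reference, including the script path, ends up on the attacker's host, which is why a single surviving <base> tag is enough to affect links, forms, scripts, and images at once.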

Reproduction

To reproduce this vulnerability, install lxml_html_clean version 0.4.3 and clean HTML that contains a <base> tag using the default Cleaner configuration. The <base> tag is preserved in the output, so relative URLs in the cleaned markup resolve against the attacker-controlled domain.
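A minimal sketch of the reproduction steps follows. It assumes lxml_html_clean is installed and uses its Cleaner class and clean_html method; the payload, the attacker domain, and the helper names find_base_hrefs and reproduce are hypothetical and chosen for illustration. The <base> detection relies only on the standard library, so it can be run against the output of any sanitizer.

```python
from html.parser import HTMLParser


class _BaseTagFinder(HTMLParser):
    """Collects the href of every <base> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.base_hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag == "base":
            self.base_hrefs.append(dict(attrs).get("href"))


def find_base_hrefs(html):
    """Return the href values of all <base> tags in an HTML string."""
    finder = _BaseTagFinder()
    finder.feed(html)
    finder.close()
    return finder.base_hrefs


# Hypothetical payload: a <base> tag followed by a relative link.
PAYLOAD = '<base href="https://attacker.example/"><a href="login">Log in</a>'


def reproduce(payload=PAYLOAD):
    """Clean the payload with the default Cleaner and report surviving <base> tags."""
    # Imported here so the detection helper above works without the package.
    from lxml_html_clean import Cleaner

    cleaned = Cleaner().clean_html(payload)
    return cleaned, find_base_hrefs(cleaned)
```

Calling reproduce() and inspecting the second element of the result shows whether any <base> tag survived cleaning. According to this advisory, the list should be non-empty on version 0.4.3 and empty on 0.4.4 or later.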

Remediation

Users should upgrade to lxml_html_clean version 0.4.4 or later, where this vulnerability has been patched.
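One way to confirm that an environment has the patched release is to compare the installed version against 0.4.4. The sketch below uses importlib.metadata from the standard library; the helper names are hypothetical, and the simplified parser reads only the leading numeric components, so pre-release suffixes such as "rc1" are ignored.

```python
from importlib.metadata import PackageNotFoundError, version

# First patched release according to the advisory.
FIXED_VERSION = (0, 4, 4)


def parse_version(text):
    """Parse the leading numeric components of a version string into a tuple."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_patched(version_text):
    """Return True if the given version is 0.4.4 or later."""
    return parse_version(version_text) >= FIXED_VERSION


def installed_is_patched():
    """Check the installed lxml_html_clean; return None if it is not installed."""
    try:
        return is_patched(version("lxml_html_clean"))
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("patched:", installed_is_patched())
```

A check like this could run in CI or at application startup to flag environments that still carry a vulnerable release.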

Added: Mar 5, 2026, 8:19 PM
Updated: Mar 5, 2026, 8:19 PM

Vulnerability Rating

Custom Algorithm
spread: 0.0
impact: 1.7
exploitability: 5.9
remediation: 0.0
relevance: 3.5
threat: 6.4
urgency: 2.9
incentive: 0.0

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.