ScrapeGraphAI scrapegraph-ai Remote Code Execution Vulnerability

Vulnerability

A critical remote code execution vulnerability exists in ScrapeGraphAI versions through 1.74.0, specifically within the GenerateCodeNode component. The issue arises because the application uses a Large Language Model (LLM) to generate Python code for data extraction from web pages. This generated code is executed using Python's exec() function in a so-called 'sandbox' that actually exposes the full __builtins__ module, allowing unrestricted access to dangerous functions. An attacker can exploit this by embedding prompt injection payloads in the HTML of a controlled website. When a victim scrapes this page using ScrapeGraphAI, the injected payloads are executed, potentially leading to arbitrary code execution on the victim's machine.

Impact

Exploitation of this vulnerability allows for full remote code execution on the victim's machine, with the executed code running under the user's privileges. This could lead to a complete system compromise.

Reproduction

To reproduce this vulnerability, first host a malicious website containing prompt injection payloads in HTML comments. Then, use ScrapeGraphAI's CodeGeneratorGraph to scrape the injected page. The LLM will generate Python code based on the injected prompts, which will be executed via exec() with full access to the system.

Remediation

It is recommended to update ScrapeGraphAI to a version that addresses this vulnerability. If immediate updating is not possible, consider sanitizing HTML before passing it to the LLM to remove comments and other potentially harmful content.

Added: Apr 5, 2026, 2:18 AM
Updated: Apr 5, 2026, 2:18 AM

Vulnerability Rating

Custom Algorithm
spread
0.0
impact
7.5
exploitability
7.1
remediation
0.0
relevance
5.3
threat
6.4
urgency
2.9
incentive
0.0

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.