PPTAgent Arbitrary Code Execution Vulnerability via Python eval()

Vulnerability

PPTAgent, an agentic framework for reflective PowerPoint generation, contains an arbitrary code execution vulnerability. Versions prior to commit 418491a pass LLM-generated code to Python's eval() function with builtins accessible. The flaw is in the 'execute_actions' method of the 'apis.py' file, where user-influenced slide editing commands are evaluated as Python expressions and can therefore run arbitrary code. It has been patched in version 1.1.37.
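
The underlying mechanism can be shown in a few lines. This is a minimal sketch, not PPTAgent's actual code: the 'scope' dictionary and the lambda standing in for 'replace_image' are hypothetical. It illustrates that passing a globals dictionary to eval() does not by itself remove builtins, because Python inserts them when the '__builtins__' key is missing.

```python
import os

# Hypothetical scope: only one harmless function is exposed on purpose.
scope = {"replace_image": lambda *args: args}

# eval() inserts a reference to the builtins module under "__builtins__"
# when the globals dict lacks that key, so __import__ stays reachable.
leaked = eval("__import__('os').getcwd()", scope)

print("__builtins__" in scope)  # True
print(leaked == os.getcwd())    # True
```

The expression never mentions a registered function, yet it imports a module and calls into it. That is the behavior the advisory describes as "builtins accessible".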

Impact

Successful exploitation gives an attacker arbitrary code execution on the system running PPTAgent, with the potential for full system compromise. The attacker can execute shell commands and take over the host environment or container. The attacker can also exfiltrate data by reading sensitive information and sending it to an external server.

Reproduction

To reproduce this vulnerability, first generate a PowerPoint presentation using PPTAgent. Then inject a prompt that steers the LLM into emitting a slide editing action that calls a registered function such as 'replace_image'. When 'execute_actions' runs, the LLM-generated command is evaluated with builtins in scope, so any expression embedded in the command is executed as well. For example, the injected command could call 'os.system' to run a shell command.
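
The steps above can be approximated offline with a benign payload. This is a sketch under stated assumptions: the 'replace_image' signature, the simplified 'execute_actions' function, and the 'llm_output' string are all hypothetical stand-ins, and the payload only reads the process ID instead of running a shell command.

```python
import os

def replace_image(slide, element_id, image_path):
    # Hypothetical signature; the real PPTAgent API may differ.
    slide[element_id] = image_path

def execute_actions(actions, slide):
    # Simplified stand-in for the vulnerable pattern: each line of
    # LLM output is passed to eval() with builtins still reachable.
    scope = {"slide": slide, "replace_image": replace_image}
    for line in actions.strip().splitlines():
        eval(line, scope)

# Text an injected prompt might coax the LLM into emitting. The call
# targets a registered function, but one argument hides extra code.
llm_output = "replace_image(slide, 'img1', str(__import__('os').getpid()))"

slide = {}
execute_actions(llm_output, slide)
print(slide["img1"] == str(os.getpid()))  # True
```

The action looks like an ordinary 'replace_image' call, so checking only the function name would not catch it. Swapping the argument expression for a call to 'os.system' would run a shell command in the same way.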

Remediation

Update to PPTAgent version 1.1.37 or later, where this vulnerability has been fixed. The patch modifies the 'execute_actions' method to evaluate commands in a restricted context that excludes access to built-in functions.
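
For deployments that cannot upgrade immediately, or as defense in depth, a stricter approach is to avoid eval() altogether. The sketch below is not the project's patch. It is an alternative technique, assuming a hypothetical 'REGISTERED' table and 'replace_image' signature: it parses each action with the 'ast' module, allows only a single call to a registered name, and accepts only literal arguments.

```python
import ast

def replace_image(slide, element_id, image_path):
    # Hypothetical signature; the real PPTAgent API may differ.
    slide[element_id] = image_path

# Hypothetical allowlist of functions an action may call.
REGISTERED = {"replace_image": replace_image}

def execute_action_safe(line, slide):
    """Dispatch one action without ever calling eval()."""
    call = ast.parse(line.strip(), mode="eval").body
    if not (isinstance(call, ast.Call) and isinstance(call.func, ast.Name)):
        raise ValueError("action must be a single call to a named function")
    if call.func.id not in REGISTERED:
        raise ValueError(f"unregistered function: {call.func.id}")
    if any(kw.arg is None for kw in call.keywords):
        raise ValueError("'**' unpacking is not allowed")
    # literal_eval accepts only literals (strings, numbers, lists, ...),
    # so nested calls such as __import__('os') raise ValueError.
    args = [ast.literal_eval(a) for a in call.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    REGISTERED[call.func.id](slide, *args, **kwargs)

slide = {}
execute_action_safe("replace_image('img1', 'a.png')", slide)
print(slide)  # {'img1': 'a.png'}
```

In this design the dispatcher supplies the slide object itself, so the action text never names live objects. A payload like the one used for reproduction is rejected with ValueError before any code runs.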

Added: May 4, 2026, 5:26 PM
Updated: May 4, 2026, 5:26 PM

Vulnerability Rating

Custom Algorithm
spread: 0.0
impact: 10.0
exploitability: 7.1
remediation: 0.0
relevance: 7.4
threat: 4.8
urgency: 2.9
incentive: 0.0

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.