Giskard ChatWorkflow Remote Code Execution Vulnerability via Non-Sandboxed Jinja2 Environment

Vulnerability

A remote code execution vulnerability affects the chat workflow component of the Giskard library in versions prior to 0.3.4 and 1.0.2b1. The ChatWorkflow.chat() method passes its string arguments directly to a non-sandboxed Jinja2 environment as templates. Because the input is silently interpreted as a template rather than as plain text, any application that forwards user input to chat() lets that user submit template expressions. An attacker can then use Jinja2 class traversal, walking class attributes from an ordinary object until code that runs system commands is reachable, to achieve full remote code execution.
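
The difference between "interpreted as a template" and "treated as plain text" can be illustrated with Jinja2 alone. The following is a minimal sketch, not Giskard's code: the helper names render_as_template and render_as_data are hypothetical, and ChatWorkflow.chat() is not called because this advisory does not give its signature.

```python
from jinja2 import Environment

# A plain (non-sandboxed) Jinja2 environment, as described in the advisory.
env = Environment()


def render_as_template(user_input: str) -> str:
    # Risky pattern (hypothetical helper): the user's string becomes
    # the template source, so any expression inside it is evaluated.
    return env.from_string(user_input).render()


def render_as_data(user_input: str) -> str:
    # Safer pattern (hypothetical helper): the template is fixed and the
    # user's string is only the value of a variable, so it is not evaluated.
    return env.from_string("{{ message }}").render(message=user_input)


user_input = "{{ 7 * 7 }}"
as_template = render_as_template(user_input)
as_data = render_as_data(user_input)

print(as_template)  # 49
print(as_data)      # {{ 7 * 7 }}
```

The arithmetic expression is harmless, but it shows that the string is executed as template code in the first case and passed through unchanged in the second.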

Impact

An attacker who exploits this vulnerability can run arbitrary code on the server hosting the application, including executing system commands, reading files, and accessing environment variables.

Reproduction

To reproduce this vulnerability, install a version of Giskard prior to the patched releases and pass user-controlled input directly to the ChatWorkflow.chat() method. The non-sandboxed environment processes the input as a Jinja2 template, so an expression embedded in the input is evaluated on the server. A payload that traverses class attributes from a literal value demonstrates that arbitrary code is reachable.
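
The class traversal step can be observed without Giskard by rendering a benign payload in a plain Jinja2 Environment. This is a sketch under the assumption that the vulnerable code path behaves like Environment.from_string(...).render(); the payload only counts the subclasses of object and deliberately stops short of invoking anything.

```python
from jinja2 import Environment

# Non-sandboxed environment: dunder attributes are reachable from templates.
env = Environment()

# Benign traversal: start at a string literal, climb to `object` through
# __class__ and __mro__, then count the classes that __subclasses__() exposes.
payload = "{{ ''.__class__.__mro__[1].__subclasses__() | length }}"

rendered = env.from_string(payload).render()

# The output is the number of classes loaded in the interpreter; the exact
# value depends on the Python version and on which modules are imported.
print(rendered)
```

Reaching the subclass list from a template is the step a real exploit builds on, which is why a numeric result here indicates that the environment is not sandboxed.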

Remediation

Users should update to Giskard version 0.3.4 or 1.0.2b1 to address this vulnerability. The update replaces the unsandboxed Jinja2 Environment with SandboxedEnvironment, which blocks access to dunder attributes such as __class__ and thereby prevents class traversal, mitigating the risk of remote code execution.
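
The effect of the sandbox can be checked with Jinja2 directly. The sketch below is an illustration rather than the patched Giskard code: try_render is a hypothetical helper that renders a string in a SandboxedEnvironment and reports whether Jinja2 raised a SecurityError.

```python
from jinja2.exceptions import SecurityError
from jinja2.sandbox import SandboxedEnvironment

# Sandboxed environment, the class named in the remediation above.
env = SandboxedEnvironment()


def try_render(source: str) -> str:
    # Hypothetical helper: render the source, or report that the
    # sandbox refused an unsafe attribute access.
    try:
        return env.from_string(source).render()
    except SecurityError as exc:
        return f"blocked: {exc}"


# The same class traversal that works in a plain Environment.
traversal = try_render("{{ ''.__class__.__mro__[1].__subclasses__() | length }}")

# Ordinary expressions are still evaluated inside the sandbox.
arithmetic = try_render("{{ 7 * 7 }}")

print(traversal)
print(arithmetic)  # 49
```

Note that the sandbox still evaluates the input as a template; it restricts what a template can reach rather than turning the input into plain text.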

Added: Mar 31, 2026, 4:02 PM
Updated: Mar 31, 2026, 4:02 PM

Vulnerability Rating

Custom Algorithm

spread: 0.0
impact: 10.0
exploitability: 8.7
remediation: 0.0
relevance: 5.0
threat: 6.4
urgency: 2.9
incentive: 4.2

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.