privsim mcp-test-runner Command Injection Vulnerability

Vulnerability

A command injection vulnerability has been identified in privsim mcp-test-runner version 0.2.0, in the MCP 'run_tests' tool. The tool passes user-supplied command arguments to 'child_process.spawn' with 'shell: true' without validating or sanitizing them, so any shell metacharacters in the input are interpreted by the shell. An attacker with network access to the MCP interface can inject arbitrary shell commands, potentially leading to full host compromise: unauthorized data access, modification of files or application state, and disruption of services. The vulnerability has been publicly disclosed and exploited.
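The difference between running a command string through a shell and passing arguments directly can be illustrated with a minimal sketch. This is not the project's source code: the command string and variable names are hypothetical, and a POSIX host with an 'echo' executable is assumed.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical stand-in for a user-supplied test command (not taken from the project).
const userCommand = "echo tests-ran; echo INJECTED";

// Risky pattern: the whole string is handed to a shell,
// so ';' terminates the first command and starts a second one.
const viaShell = spawnSync(userCommand, { shell: true, encoding: "utf8" });

// Safer pattern: no shell, a fixed executable, and arguments passed as an array.
// The ';' is now only a literal character inside a single argument.
const noShell = spawnSync("echo", ["tests-ran; echo INJECTED"], { encoding: "utf8" });

// With the shell, two commands run; without it, one command prints the raw text.
console.log(JSON.stringify(viaShell.stdout));
console.log(JSON.stringify(noShell.stdout));
```

In the first call the text after ';' runs as its own command, which is the behavior an attacker relies on. In the second call the same text never reaches a shell.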

Impact

Exploitation of this vulnerability allows arbitrary command execution on the host system with the privileges of the MCP server process, with the potential for full compromise. Injected commands can be used to access, modify, or delete files and to disrupt services.

Reproduction

To reproduce this vulnerability, send a request to the 'run_tests' tool of an MCP server running version 0.2.0, using a non-generic framework value such as 'jest' and a command that carries a payload such as 'id'. The 'run_tests' tool must be accessible to the client. The response will include the output of the injected command, confirming successful exploitation.
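A request of this kind can be sketched as a JSON-RPC 2.0 'tools/call' message, the method MCP uses to invoke tools. The argument names 'framework' and 'command' are assumed from the wording of this advisory and may differ in the actual tool schema; the helper 'buildRunTestsRequest' is hypothetical. The sketch only builds the message and does not send it.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper. The argument names `framework` and `command`
// are assumptions based on the advisory text, not a confirmed schema.
function buildRunTestsRequest(id: number, framework: string, command: string): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "run_tests", arguments: { framework, command } },
  };
}

// Benign probe: a non-generic framework value and the harmless `id` command.
const probe = buildRunTestsRequest(1, "jest", "id");
console.log(JSON.stringify(probe, null, 2));
```

Such a probe should only be sent to a server the tester owns or is authorized to assess.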

Remediation

Users are advised not to expose the MCP server to untrusted clients until a fix is available. Access to the 'run_tests' tool should be restricted to trusted local users. Additionally, non-generic framework execution could be disabled or subjected to the same command validation applied to generic frameworks. Running the MCP server with a low-privilege OS account and a restricted working directory can also help mitigate the risk.
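One possible form of command validation is sketched below, as an assumption about what such a check could look like and not a description of the project's existing validation for generic frameworks. The allowlist contents and the names 'ALLOWED_EXECUTABLES', 'SHELL_METACHARACTERS', and 'parseTestCommand' are hypothetical.

```typescript
// Hypothetical allowlist; adjust to the test runners actually in use.
const ALLOWED_EXECUTABLES = new Set(["jest", "vitest", "pytest"]);

// Characters a POSIX shell treats specially; any occurrence causes rejection.
const SHELL_METACHARACTERS = /[;&|`$<>(){}\[\]\\'"*?!~#\n\r]/;

interface ParsedCommand {
  executable: string;
  args: string[];
}

// Splits a command string into an executable and an argument array,
// rejecting anything that contains shell syntax or a non-allowlisted executable.
function parseTestCommand(command: string): ParsedCommand {
  if (SHELL_METACHARACTERS.test(command)) {
    throw new Error("command contains shell metacharacters");
  }
  const [executable, ...args] = command.trim().split(/\s+/);
  if (!executable || !ALLOWED_EXECUTABLES.has(executable)) {
    throw new Error("executable is not on the allowlist");
  }
  return { executable, args };
}

console.log(parseTestCommand("jest --ci"));
```

The parsed result is meant to be passed to 'child_process.spawn' as an executable plus an argument array, without 'shell: true'. A check like this only addresses shell injection; an allowed test runner can still execute whatever code the project under test contains, so the low-privilege account and restricted working directory remain relevant.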

Added: May 4, 2026, 5:19 AM
Updated: May 4, 2026, 5:19 AM

Vulnerability Rating

Custom Algorithm

spread: 0.0
impact: 10.0
exploitability: 8.0
remediation: 0.0
relevance: 7.4
threat: 6.4
urgency: 2.9
incentive: 0.0

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.