Foundation Agents MetaGPT
cpe:2.3:a:deepwisdom:metagpt:*:*:*:*:*:*:*
- <= 0.8.1
A remote code execution vulnerability exists in Foundation Agents MetaGPT versions through 0.8.1, specifically within the DataInterpreter component. The issue arises from inadequate input validation on user-provided prompts in the file 'metagpt/actions/di/write_analysis_code.py'. This vulnerability allows for prompt injection, where an attacker can manipulate the language model into generating malicious Python code that is automatically executed in a Jupyter Notebook environment, without any security checks or sandbox restrictions.
Exploitation of this vulnerability allows for arbitrary execution of Python code, including system commands, with no validation or user confirmation. This could lead to unauthorized access to sensitive files, exfiltration of environment variables, and establishment of persistent backdoors, potentially allowing lateral movement to other systems and complete compromise of the affected system.
The vulnerability can be reproduced by injecting a payload into the 'user_requirement' parameter, which is then processed by the DataInterpreter component. The injected prompt can include instructions for the language model to generate and execute Python code. Once the payload is executed, the injected code runs without any restrictions, allowing for arbitrary code execution on the host machine.
Short-term mitigations include implementing strict input validation on the 'user_requirement' parameter to remove common prompt injection patterns, establishing a code review mechanism that requires human approval before executing generated code, and restricting the use of the Terminal tool in the MetaGPT framework. Long-term solutions involve creating a sandboxed execution environment for running code, validating the safety of code generated by the language model, and engineering prompts to prevent injection attacks.
Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.