MLflow Command Injection Vulnerability in Model Serving

Vulnerability

A command injection vulnerability has been identified in MLflow when serving models with the 'enable_mlserver' option enabled. The 'model_uri' is interpolated directly into a shell command executed via 'bash -c', without adequate sanitization. If the 'model_uri' contains shell metacharacters such as '$()' or backticks, the shell performs command substitution and runs attacker-controlled commands. This vulnerability is present in the latest version of MLflow and can lead to privilege escalation if a higher-privileged service serves models from a directory that lower-privileged users can write to.
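The mechanism described above can be illustrated with a minimal sketch. The function 'build_unsafe_command' is a hypothetical stand-in for the vulnerable pattern (a value pasted into a string that is handed to 'bash -c'); it is not MLflow's actual code, and the 'echo' command is chosen only because it is harmless.

```python
import subprocess


def build_unsafe_command(model_uri: str) -> str:
    # Hypothetical stand-in for the vulnerable pattern: the URI is pasted
    # into the command string with no quoting. Not MLflow's actual code.
    return f"echo serving {model_uri}"


def run_with_bash(command: str) -> str:
    # 'bash -c' parses the whole string, so $(...) inside it is expanded.
    result = subprocess.run(
        ["bash", "-c", command], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


benign_output = run_with_bash(build_unsafe_command("models/my_model"))
injected_output = run_with_bash(build_unsafe_command("model$(echo INJECTED)"))

print(benign_output)
print(injected_output)
```

In the second call the shell runs the inner 'echo INJECTED' first and splices its output into the model name, which shows that the value is being interpreted as shell syntax rather than as data.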

Impact

Exploitation of this vulnerability allows local command injection, with the potential for privilege escalation. When a higher-privileged service serves models from a directory writable by lower-privileged users, the injected commands run with the service user's privileges.

Reproduction

To reproduce this vulnerability, create a directory whose name is 'model' followed by a command substitution payload, for example 'model$(date)', in a location writable by the user. Then use the MLflow PyFuncBackend to serve a model from this directory with 'enable_mlserver' set to true. The shell executes the command substitution, and any commands in the payload run with the privileges of the user running the MLflow command.
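The directory setup can be sketched in a throwaway temporary directory. This sketch makes several assumptions: the payload is swapped from '$(date)' to a 'touch' command so that the effect is observable as a marker file, and the 'echo serving ...' string is a hypothetical stand-in for the serving command, since the exact command MLflow builds is not given here. It does not call MLflow.

```python
import os
import subprocess
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # Harmless payload: creates an empty marker file if the shell expands it.
    dir_name = "model$(touch injected_marker)"
    model_dir = os.path.join(workdir, dir_name)
    os.mkdir(model_dir)  # the filesystem accepts '$', '(' and ')' in names
    directory_created = os.path.isdir(model_dir)

    # Hypothetical stand-in for the serving command; not MLflow's real command.
    command = f"echo serving {model_dir}"
    subprocess.run(
        ["bash", "-c", command], cwd=workdir, check=True, capture_output=True
    )

    # The marker exists only if bash executed the substitution in the path.
    marker_created = os.path.exists(os.path.join(workdir, "injected_marker"))

print(directory_created, marker_created)
```

The marker file appears even though the directory itself is never executed: a path that merely passes through an unquoted 'bash -c' string is enough to trigger the payload.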

Remediation

Avoid using 'bash -c' to execute commands. Instead, use 'subprocess.Popen' with a list of arguments, which does not invoke a shell and is therefore not subject to shell command substitution. If 'bash -c' is necessary, quote the 'model_uri' with 'shlex.quote' so that the shell treats it as a single literal string.
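Both options above can be compared side by side in a short sketch. The 'run' helper and the 'echo serving ...' command are illustrative assumptions, not MLflow code; only 'subprocess.Popen' and 'shlex.quote' are the standard-library calls named in the recommendation.

```python
import shlex
import subprocess

payload = "model$(echo INJECTED)"


def run(args):
    # Popen with an argument list starts the program directly, with no shell.
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, text=True)
    stdout, _ = proc.communicate()
    return stdout.strip()


# Option 1: no shell at all; the payload is one literal argument to echo.
list_output = run(["echo", "serving", payload])

# Option 2: bash -c is kept, but the value is quoted before interpolation.
quoted_command = f"echo serving {shlex.quote(payload)}"
quoted_output = run(["bash", "-c", quoted_command])

print(list_output)
print(quoted_output)
```

In both cases the payload is printed literally instead of being executed. The argument-list form is generally the more robust choice because it removes the shell from the picture entirely, whereas the 'shlex.quote' form depends on every interpolated value being quoted consistently.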

Added: Mar 31, 2026, 3:40 PM
Updated: Mar 31, 2026, 3:40 PM

Vulnerability Rating

Custom Algorithm
spread: 5.7
impact: 10.0
exploitability: 9.1
remediation: 0.0
relevance: 5.1
threat: 6.4
urgency: 2.9
incentive: 8.3

Our algorithm analyzes dozens of metrics to generate these 8 key vulnerability categories, which are then combined to calculate the overall risk score.