Linux kernel
cpe:2.3:a:linux:linux_kernel:*:*:*:*:*:*:*
A vulnerability in the mlx5 driver of the Linux kernel's RDMA subsystem has been resolved. The flaw was in the recovery flow of the UMR (User-mode Memory Registration) Queue Pair (QP), where tasks could become stuck. During recovery, the software must wait for all outstanding Work Requests (WRs) to complete before moving the QP to the RESET state. If it does not, the firmware can skip sending some of the flushed Completion Queue Entries (CQEs) that carry errors and simply discard them on RESET. This race condition results in lost CQEs, and any task waiting on one of them hangs. The patch fixes this by posting a final WR that serves only as a barrier: once the CQE for that WR is received, no earlier WRs remain outstanding, so the QP can safely be moved to RESET, after which normal operation is restored.
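The barrier idea can be illustrated with a toy model. The sketch below is not the mlx5 driver code. The `ToyQP` class, its methods, and the WR names are all hypothetical, and it assumes WRs complete strictly in posting order. It only shows why waiting for the completion of one final WR guarantees that nothing is dropped on reset.

```python
from collections import deque


class ToyQP:
    """Toy model of a queue pair in an error state (hypothetical, not mlx5)."""

    def __init__(self):
        self.pending = deque()  # WRs posted but not yet completed
        self.cq = []            # completions delivered to software

    def post(self, wr_id):
        self.pending.append(wr_id)

    def flush_one(self):
        """Model the oldest pending WR being flushed with an error CQE."""
        if self.pending:
            self.cq.append((self.pending.popleft(), "flushed_with_error"))

    def reset(self):
        """Model RESET: WRs still pending are dropped without any CQE."""
        lost = list(self.pending)
        self.pending.clear()
        return lost


def recover_without_barrier(qp):
    # Reset immediately; anything not yet flushed never produces a CQE.
    return qp.reset()


def recover_with_barrier(qp, barrier_id="barrier"):
    # Post one last WR and wait for its CQE. Because this model completes
    # WRs in order, the barrier's CQE implies all earlier WRs completed.
    qp.post(barrier_id)
    while not any(wr_id == barrier_id for wr_id, _ in qp.cq):
        qp.flush_one()
    return qp.reset()


def demo():
    racy = ToyQP()
    for wr in ("wr1", "wr2", "wr3"):
        racy.post(wr)
    racy.flush_one()  # only wr1 is flushed before the reset
    lost_racy = recover_without_barrier(racy)

    safe = ToyQP()
    for wr in ("wr1", "wr2", "wr3"):
        safe.post(wr)
    safe.flush_one()
    lost_safe = recover_with_barrier(safe)
    completed = [wr_id for wr_id, _ in safe.cq]
    return lost_racy, lost_safe, completed
```

In this model, the recovery without a barrier loses the completions for `wr2` and `wr3`, so anything waiting on them would block forever. The barrier variant loses none.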
The vulnerability could block tasks for extended periods, disrupting normal operation. Affected systems logged hung-task warnings reporting tasks blocked for more than 120 seconds.
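To check whether a system has logged this symptom, kernel logs can be scanned for the hung-task detector's warning line. The sketch below matches the general "blocked for more than N seconds" format of that warning. The helper name `find_hung_tasks` and the sample log lines are made up for illustration and are not taken from the advisory. A match only shows that some task hung, not that this particular bug was the cause.

```python
import re

# General shape of the kernel hung-task detector's warning line.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return (task name, pid, seconds) for each hung-task warning found."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]


# Made-up sample input, for illustration only.
SAMPLE_LOG = """\
[ 1000.000000] INFO: task kworker/u8:1:4242 blocked for more than 120 seconds.
[ 1000.000100] some unrelated message
"""
```

Calling `find_hung_tasks` on the output of `dmesg` or a saved kernel log returns one tuple per warning, which can then be correlated with the time of an RDMA error on the affected device.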
The issue could be reproduced by starting recovery of a UMR QP before all outstanding WRs had completed, for example by manually triggering a QP reset while WRs were still pending. Because the firmware discarded the CQEs for those WRs during the reset, tasks waiting on the completions blocked indefinitely.
Users should update to a Linux kernel release that includes the fix for this vulnerability.