[PATCH v2 1/4] drm/xe/ct: prevent UAF in send_recv()

Nilawar, Badal badal.nilawar at intel.com
Tue Oct 1 13:22:20 UTC 2024



On 01-10-2024 14:13, Matthew Auld wrote:
> Ensure we serialize with the completion side to prevent a UAF with the
> fence going out of scope on the stack, since we have no clue whether it
> will fire after the timeout but before we can erase it from the xa. We
> also have some dependent loads and stores for which we need the correct
> ordering, and we lack the needed barriers. Fix this by grabbing ct->lock
> after the wait, which is also held by the completion side.
> 
> v2 (Badal):
>   - Also print done after acquiring the lock and seeing timeout.
> 
> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
> Cc: Matthew Brost <matthew.brost at intel.com>
> Cc: Badal Nilawar <badal.nilawar at intel.com>
> Cc: <stable at vger.kernel.org> # v6.8+
> ---
>   drivers/gpu/drm/xe/xe_guc_ct.c | 21 ++++++++++++++++++---
>   1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
> index 4b95f75b1546..44263b3cd8c7 100644
> --- a/drivers/gpu/drm/xe/xe_guc_ct.c
> +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
> @@ -903,16 +903,26 @@ static int guc_ct_send_recv(struct xe_guc_ct *ct, const u32 *action, u32 len,
>   	}
>   
>   	ret = wait_event_timeout(ct->g2h_fence_wq, g2h_fence.done, HZ);
> +
> +	/*
> +	 * Ensure we serialize with the completion side to prevent a UAF with the fence going out
> +	 * of scope on the stack, since we have no clue whether it will fire after the timeout but
> +	 * before we can erase it from the xa. Also we have some dependent loads and stores below
> +	 * for which we need the correct ordering, and we lack the needed barriers.
> +	 */
> +	mutex_lock(&ct->lock);
>   	if (!ret) {
> -		xe_gt_err(gt, "Timed out wait for G2H, fence %u, action %04x",
> -			  g2h_fence.seqno, action[0]);
> +		xe_gt_err(gt, "Timed out wait for G2H, fence %u, action %04x, done %s",
> +			  g2h_fence.seqno, action[0], str_yes_no(g2h_fence.done));
>   		xa_erase_irq(&ct->fence_lookup, g2h_fence.seqno);
> +		mutex_unlock(&ct->lock);
>   		return -ETIME;
>   	}
>   
>   	if (g2h_fence.retry) {
>   		xe_gt_dbg(gt, "H2G action %#x retrying: reason %#x\n",
>   			  action[0], g2h_fence.reason);
> +		mutex_unlock(&ct->lock);
>   		goto retry;
>   	}
>   	if (g2h_fence.fail) {
> @@ -921,7 +931,12 @@ static int guc_ct_send_recv(struct xe_guc_ct *ct, const u32 *action, u32 len,
>   		ret = -EIO;
>   	}
>   
> -	return ret > 0 ? response_buffer ? g2h_fence.response_len : g2h_fence.response_data : ret;
> +	if (ret > 0)
> +		ret = response_buffer ? g2h_fence.response_len : g2h_fence.response_data;
> +
> +	mutex_unlock(&ct->lock);
> +
> +	return ret;
>   }

Reviewed-by: Badal Nilawar <badal.nilawar at intel.com>

Regards,
Badal
