[PATCH 1/2] drm/xe/ct: prevent UAF in send_recv()

Matthew Auld matthew.auld@intel.com
Tue Oct 1 06:48:40 UTC 2024


On 01/10/2024 06:22, Nilawar, Badal wrote:
> Hi Matthew,
> 
> On 30-09-2024 17:59, Matthew Auld wrote:
>> Ensure we serialize with the completion side to prevent a UAF with the
>> fence going out of scope on the stack, since we have no clue whether it
>> will fire after the timeout but before we can erase it from the xa. Also
>> we have some dependent loads and stores for which we need the correct
>> ordering, and we lack the needed barriers. Fix this by grabbing the
>> ct->lock after the wait, which is also held by the completion side.
>>
>> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
>> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
>> Cc: Matthew Brost <matthew.brost@intel.com>
>> Cc: Badal Nilawar <badal.nilawar@intel.com>
>> Cc: <stable@vger.kernel.org> # v6.8+
>> ---
>>   drivers/gpu/drm/xe/xe_guc_ct.c | 17 ++++++++++++++++-
>>   1 file changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_guc_ct.c b/drivers/gpu/drm/xe/xe_guc_ct.c
>> index 4b95f75b1546..232eb69bd8e4 100644
>> --- a/drivers/gpu/drm/xe/xe_guc_ct.c
>> +++ b/drivers/gpu/drm/xe/xe_guc_ct.c
>> @@ -903,16 +903,26 @@ static int guc_ct_send_recv(struct xe_guc_ct *ct, const u32 *action, u32 len,
>>       }
>>       ret = wait_event_timeout(ct->g2h_fence_wq, g2h_fence.done, HZ);
>> +
>> +    /*
>> +     * Ensure we serialize with completion side to prevent UAF with fence going out of scope on
>> +     * the stack, since we have no clue if it will fire after the timeout before we can erase
>> +     * from the xa. Also we have some dependent loads and stores below for which we need the
>> +     * correct ordering, and we lack the needed barriers.
>> +     */
> 
> Before acquiring the lock it is still possible that the fence will have
> fired. To see whether that happened, it would be good to print
> g2h_fence.done in the error message below.

Ok, will add.

> 
> Regards,
> Badal
> 
>> +    mutex_lock(&ct->lock);
>>       if (!ret) {
>>           xe_gt_err(gt, "Timed out wait for G2H, fence %u, action %04x",
>>                 g2h_fence.seqno, action[0]);
>>           xa_erase_irq(&ct->fence_lookup, g2h_fence.seqno);
>> +        mutex_unlock(&ct->lock);
>>           return -ETIME;
>>       }
>>       if (g2h_fence.retry) {
>>           xe_gt_dbg(gt, "H2G action %#x retrying: reason %#x\n",
>>                 action[0], g2h_fence.reason);
>> +        mutex_unlock(&ct->lock);
>>           goto retry;
>>       }
>>       if (g2h_fence.fail) {
>> @@ -921,7 +931,12 @@ static int guc_ct_send_recv(struct xe_guc_ct *ct, const u32 *action, u32 len,
>>           ret = -EIO;
>>       }
>> -    return ret > 0 ? response_buffer ? g2h_fence.response_len : g2h_fence.response_data : ret;
>> +    if (ret > 0)
>> +        ret = response_buffer ? g2h_fence.response_len : g2h_fence.response_data;
>> +
>> +    mutex_unlock(&ct->lock);
>> +
>> +    return ret;
>>   }
>>   /**
> 

