[Intel-xe] [PATCH 5/5] drm/xe: Return the correct error when dma_resv_wait_timeout fails

Maarten Lankhorst maarten.lankhorst at linux.intel.com
Mon May 29 15:21:12 UTC 2023


On 2023-05-27 07:17, Christopher Snowhill wrote:
> On Fri, May 26, 2023 at 12:16 PM Souza, Jose <jose.souza at intel.com> wrote:
>> On Fri, 2023-05-26 at 14:11 +0200, Maarten Lankhorst wrote:
>>> We call dma_resv_wait_timeout with MAX_SCHEDULE_TIMEOUT, so it can
>>> never return -ETIME. It will however fail if interrupted, so in that
>>> case return the error.
>>>
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
>>> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/239
>>> ---
>>>  drivers/gpu/drm/xe/xe_bo.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
>>> index 8735facb1cf9..77ba8492bd90 100644
>>> --- a/drivers/gpu/drm/xe/xe_bo.c
>>> +++ b/drivers/gpu/drm/xe/xe_bo.c
>>> @@ -611,8 +611,8 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
>>>                                                    DMA_RESV_USAGE_BOOKKEEP,
>>>                                                    true,
>>>                                                    MAX_SCHEDULE_TIMEOUT);
>>> -             if (timeout <= 0) {
>>> -                     ret = -ETIME;
>>> +             if (timeout < 0) {
>>> +                     ret = timeout;
>>>                       goto out;
>> 0 means timeout, so what this is doing is allowing an error to be treated as a success.
>> I understand that it should never happen with MAX_SCHEDULE_TIMEOUT, but I would rather leave this as "<=" just in case there is a bug in
>> dma_resv_wait_timeout() that ignores the MAX_SCHEDULE_TIMEOUT.
> If 0 means timeout, and < 0 means other error, then perhaps it should
> return -ETIME for 0, otherwise pass on the error?

No other code checks for 0 when waiting with MAX_SCHEDULE_TIMEOUT. The timeout is LONG_MAX jiffies, i.e. LONG_MAX/HZ seconds. So even on 32-bit, it would take about 24.8 days at HZ=1000.

Assuming a timeout happens after that time, I think it's safe to just continue.
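
To make the two options in this thread concrete, here is a minimal userspace sketch of how the return value could be mapped. dma_resv_wait_timeout() returns a negative errno if interrupted, 0 on timeout, and a positive value on success. The helper name wait_result_to_ret() and its zero_is_error flag are hypothetical and only for illustration; they are not part of the patch. ERESTARTSYS and MAX_ERRNO are kernel-internal values redefined here as an assumption so the snippet builds outside the kernel.

```c
#include <assert.h>
#include <errno.h>

/* Kernel-internal values, redefined here only so this builds in userspace. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif
#define MAX_ERRNO 4095

/*
 * Hypothetical helper (not in xe_bo.c): turn a dma_resv_wait_timeout()-style
 * result into an int return code.
 *
 *   timeout  < 0  -> error (e.g. -ERESTARTSYS when interrupted), passed on
 *   timeout == 0  -> wait timed out
 *   timeout  > 0  -> success
 *
 * zero_is_error selects between the two behaviours discussed in the thread:
 * non-zero reports -ETIME on timeout, zero just continues.
 */
static int wait_result_to_ret(long timeout, int zero_is_error)
{
	/* Negative results are expected to be valid errno values. */
	assert(timeout >= -MAX_ERRNO);

	if (timeout < 0)
		return (int)timeout;
	if (timeout == 0 && zero_is_error)
		return -ETIME;
	return 0;
}
```

With zero_is_error set to 0 this matches what the patch does: only a negative result aborts the move, and the error is passed on unchanged.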

On 64-bit, I think the system has to move between solar systems first.
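
For anyone who wants to check the arithmetic, a rough sketch follows. It assumes MAX_SCHEDULE_TIMEOUT equals LONG_MAX jiffies and that the tick rate is passed in as hz. The helper timeout_days() is hypothetical, and the result scales inversely with the HZ the kernel was built with.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of days until a timeout of max_jiffies
 * expires, given a tick rate of hz jiffies per second.
 */
static double timeout_days(int64_t max_jiffies, int hz)
{
	assert(hz > 0);
	/* jiffies -> seconds -> days (86400 seconds per day) */
	return (double)max_jiffies / hz / 86400.0;
}
```

Passing INT32_MAX stands in for LONG_MAX on 32-bit and INT64_MAX for 64-bit. At hz = 1000 the 32-bit case comes out at a few weeks, while the 64-bit case is on the order of hundreds of millions of years.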

~Maarten


