[Intel-xe] [PATCH 4/6] drm/xe/bo: Gracefully handle errors from ttm_bo_move_accel_cleanup().

Thomas Hellström thomas.hellstrom at linux.intel.com
Mon Jun 19 11:59:03 UTC 2023


On 6/16/23 22:36, Matthew Brost wrote:
> On Fri, Jun 16, 2023 at 11:55:02AM +0200, Thomas Hellström wrote:
>> The function ttm_bo_move_accel_cleanup() attempts to help pipeline a
>> move, and in doing so, needs memory allocations which may fail.
>>
>> Rather than failing in a state where the new resource may be freed while
>> accessed by the copy engine, sync uninterruptible and do a failsafe
>> cleanup.
>>
>> Signed-off-by: Thomas Hellström <thomas.hellstrom at linux.intel.com>
>> ---
>>   drivers/gpu/drm/xe/xe_bo.c | 7 ++++++-
>>   1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
>> index 94dd11066f51..77d5c5710688 100644
>> --- a/drivers/gpu/drm/xe/xe_bo.c
>> +++ b/drivers/gpu/drm/xe/xe_bo.c
>> @@ -689,10 +689,15 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
>>   		if (!move_lacks_source) {
>>   			ret = ttm_bo_move_accel_cleanup(ttm_bo, fence, evict,
>>   							true, new_mem);
>> -		} else {
>> +			if (ret)
>> +				dma_fence_wait(fence, false);
> Should it be:
> 	if (ret == -ENOMEM)
> 		dma_fence_wait(fence, false);

I think any error code should cause us to sync and reassign the new
resource, but also see below.


>
>> +		}
>> +
>> +		if (move_lacks_source || ret) {
> Should it be:
> 		if (move_lacks_source || ret == -ENOMEM) {

Same as above, but indeed we should handle the error code right after
dma_fence_wait() rather than in the shared branch, since
dma_resv_add_fence() doesn't bail out early if the same fence was
already added or if it is signaled.
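
A minimal userspace sketch of what handling the error right after the
wait could look like. The mock_* types and functions are hypothetical
stand-ins, not the TTM or dma-buf APIs, and this is only one possible
shape, not necessarily what the respin will contain:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the fence and buffer object; not kernel types. */
struct mock_fence {
	bool signaled;
};

struct mock_bo {
	const struct mock_fence *resv_fence;	/* fence attached to the "resv" */
	bool has_new_mem;			/* new resource assigned */
};

/* Mimics a pipelined cleanup that can fail; inject_err forces the failure. */
static int mock_accel_cleanup(struct mock_bo *bo, struct mock_fence *fence,
			      int inject_err)
{
	if (inject_err)
		return inject_err;
	bo->resv_fence = fence;
	bo->has_new_mem = true;
	return 0;
}

/* Mimics an uninterruptible wait: the copy is done when this returns. */
static void mock_fence_wait(struct mock_fence *fence)
{
	fence->signaled = true;
}

/* Mimics assigning the new resource without any pipelining. */
static void mock_move_null(struct mock_bo *bo)
{
	bo->has_new_mem = true;
}

static int mock_finish_move(struct mock_bo *bo, struct mock_fence *fence,
			    bool move_lacks_source, int inject_err)
{
	int ret;

	assert(bo && fence);

	if (!move_lacks_source) {
		ret = mock_accel_cleanup(bo, fence, inject_err);
		if (ret) {
			/* Any error: sync, then assign without touching the resv. */
			mock_fence_wait(fence);
			mock_move_null(bo);
			ret = 0;
		}
		return ret;
	}

	/* No source to clean up: attach the fence and assign the new resource. */
	bo->resv_fence = fence;
	mock_move_null(bo);
	return 0;
}
```

In the failure case the fence has already been waited on, so nothing is
added to the mock reservation and the move still returns 0.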

Will respin,

Thanks,

Thomas


>
> Matt
>
>>   			dma_resv_add_fence(ttm_bo->base.resv, fence,
>>   					   DMA_RESV_USAGE_KERNEL);
>>   			ttm_bo_move_null(ttm_bo, new_mem);
>> +			ret = 0;
>>   		}
>>   
>>   		dma_fence_put(fence);
>> -- 
>> 2.40.1
>>

