[PATCH v2 1/3] drm/xe/userptr: restore invalidation list on error

Matthew Auld matthew.auld at intel.com
Mon Feb 17 09:38:26 UTC 2025


On 15/02/2025 01:28, Matthew Brost wrote:
> On Fri, Feb 14, 2025 at 05:05:28PM +0000, Matthew Auld wrote:
>> On error, restore anything still on the pin_list back to the invalidation
>> list. For the actual pin, so long as the vma is tracked on
>> either list it should get picked up on the next pin, however it looks
>> possible for the vma to get nuked but still be present on this per vm
>> pin_list leading to corruption. An alternative might be then to instead
>> just remove the link when destroying the vma.
>>
>> Fixes: ed2bdf3b264d ("drm/xe/vm: Subclass userptr vmas")
>> Suggested-by: Matthew Brost <matthew.brost at intel.com>
>> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
>> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
>> Cc: <stable at vger.kernel.org> # v6.8+
>> ---
>>   drivers/gpu/drm/xe/xe_vm.c | 26 +++++++++++++++++++-------
>>   1 file changed, 19 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>> index d664f2e418b2..668b0bde7822 100644
>> --- a/drivers/gpu/drm/xe/xe_vm.c
>> +++ b/drivers/gpu/drm/xe/xe_vm.c
>> @@ -670,12 +670,12 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
>>   	list_for_each_entry_safe(uvma, next, &vm->userptr.invalidated,
>>   				 userptr.invalidate_link) {
>>   		list_del_init(&uvma->userptr.invalidate_link);
>> -		list_move_tail(&uvma->userptr.repin_link,
>> -			       &vm->userptr.repin_list);
>> +		list_add_tail(&uvma->userptr.repin_link,
>> +			      &vm->userptr.repin_list);
> 
> Why this change?

With this patch the repin_link should now always be empty at this
point, I think, so list_move_tail() is no longer needed. list_add_tail()
should complain if that is not the case.
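
A minimal userspace sketch of the point above, assuming simplified stand-ins for the kernel's list helpers (this is not the xe driver code; `struct fake_uvma`, `move_to_repin()` and `repin_link_demo()` are hypothetical names used only for illustration). It shows that once every path leaves repin_link unlinked via list_del_init(), the entry is already empty when it is next added to the repin list:

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's <linux/list.h> helpers. */
struct list_head { struct list_head *next, *prev; };

static void INIT_LIST_HEAD(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink and re-initialise, so list_empty(entry) is true afterwards. */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);
}

/* Hypothetical cut-down vma with the two links discussed in the thread. */
struct fake_uvma { struct list_head invalidate_link, repin_link; };

/* Returns true if repin_link was unlinked before the add and linked after. */
static bool move_to_repin(struct fake_uvma *uvma, struct list_head *repin_list)
{
	bool was_empty = list_empty(&uvma->repin_link);

	list_del_init(&uvma->invalidate_link);
	list_add_tail(&uvma->repin_link, repin_list);
	return was_empty && !list_empty(&uvma->repin_link);
}

/* Walks one vma from the invalidated list to the repin list; 0 on success. */
int repin_link_demo(void)
{
	struct list_head invalidated, repin_list;
	struct fake_uvma uvma;

	INIT_LIST_HEAD(&invalidated);
	INIT_LIST_HEAD(&repin_list);
	INIT_LIST_HEAD(&uvma.invalidate_link);
	INIT_LIST_HEAD(&uvma.repin_link);
	list_add_tail(&uvma.invalidate_link, &invalidated);

	if (!move_to_repin(&uvma, &repin_list))
		return -1;
	if (!list_empty(&invalidated) || repin_list.next != &uvma.repin_link)
		return -1;
	return 0;
}
```

Calling repin_link_demo() returns 0 when the entry was unlinked before the add and ends up as the only element of the repin list.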

> 
>>   	}
>>   	spin_unlock(&vm->userptr.invalidated_lock);
>>   
>> -	/* Pin and move to temporary list */
>> +	/* Pin and move to bind list */
>>   	list_for_each_entry_safe(uvma, next, &vm->userptr.repin_list,
>>   				 userptr.repin_link) {
>>   		err = xe_vma_userptr_pin_pages(uvma);
>> @@ -691,10 +691,10 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
>>   			err = xe_vm_invalidate_vma(&uvma->vma);
>>   			xe_vm_unlock(vm);
>>   			if (err)
>> -				return err;
>> +				break;
>>   		} else {
>> -			if (err < 0)
>> -				return err;
>> +			if (err)
>> +				break;
>>   
>>   			list_del_init(&uvma->userptr.repin_link);
>>   			list_move_tail(&uvma->vma.combined_links.rebind,
>> @@ -702,7 +702,19 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
>>   		}
>>   	}
>>   
>> -	return 0;
>> +	if (err) {
>> +		down_write(&vm->userptr.notifier_lock);
> 
> Can you explain why you take the notifier lock here? I don't think this
> is required unless I'm missing something.

For the invalidated list, the docs say:

"Removing items from the list additionally requires @lock in write mode, 
and adding items to the list requires the @userptr.notifer_lock in write 
mode."

Not sure if the docs need to be updated here?

> 
> Matt
> 
>> +		spin_lock(&vm->userptr.invalidated_lock);
>> +		list_for_each_entry_safe(uvma, next, &vm->userptr.repin_list,
>> +					 userptr.repin_link) {
>> +			list_del_init(&uvma->userptr.repin_link);
>> +			list_move_tail(&uvma->userptr.invalidate_link,
>> +				       &vm->userptr.invalidated);
>> +		}
>> +		spin_unlock(&vm->userptr.invalidated_lock);
>> +		up_write(&vm->userptr.notifier_lock);
>> +	}
>> +	return err;
>>   }
>>   
>>   /**
>> -- 
>> 2.48.1
>>


