[PATCH v2 1/3] drm/xe/userptr: restore invalidation list on error
Matthew Auld
matthew.auld at intel.com
Mon Feb 17 09:38:26 UTC 2025
On 15/02/2025 01:28, Matthew Brost wrote:
> On Fri, Feb 14, 2025 at 05:05:28PM +0000, Matthew Auld wrote:
>> On error restore anything still on the pin_list back to the invalidation
>> list on error. For the actual pin, so long as the vma is tracked on
>> either list it should get picked up on the next pin, however it looks
>> possible for the vma to get nuked but still be present on this per vm
>> pin_list leading to corruption. An alternative might be then to instead
>> just remove the link when destroying the vma.
>>
>> Fixes: ed2bdf3b264d ("drm/xe/vm: Subclass userptr vmas")
>> Suggested-by: Matthew Brost <matthew.brost at intel.com>
>> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
>> Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
>> Cc: <stable at vger.kernel.org> # v6.8+
>> ---
>> drivers/gpu/drm/xe/xe_vm.c | 26 +++++++++++++++++++-------
>> 1 file changed, 19 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>> index d664f2e418b2..668b0bde7822 100644
>> --- a/drivers/gpu/drm/xe/xe_vm.c
>> +++ b/drivers/gpu/drm/xe/xe_vm.c
>> @@ -670,12 +670,12 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
>> list_for_each_entry_safe(uvma, next, &vm->userptr.invalidated,
>> userptr.invalidate_link) {
>> list_del_init(&uvma->userptr.invalidate_link);
>> - list_move_tail(&uvma->userptr.repin_link,
>> - &vm->userptr.repin_list);
>> + list_add_tail(&uvma->userptr.repin_link,
>> + &vm->userptr.repin_list);
>
> Why this change?
Just that with this patch the repin_link should now always be empty at
this point, I think. add should complain if that is not the case.
>
>> }
>> spin_unlock(&vm->userptr.invalidated_lock);
>>
>> - /* Pin and move to temporary list */
>> + /* Pin and move to bind list */
>> list_for_each_entry_safe(uvma, next, &vm->userptr.repin_list,
>> userptr.repin_link) {
>> err = xe_vma_userptr_pin_pages(uvma);
>> @@ -691,10 +691,10 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
>> err = xe_vm_invalidate_vma(&uvma->vma);
>> xe_vm_unlock(vm);
>> if (err)
>> - return err;
>> + break;
>> } else {
>> - if (err < 0)
>> - return err;
>> + if (err)
>> + break;
>>
>> list_del_init(&uvma->userptr.repin_link);
>> list_move_tail(&uvma->vma.combined_links.rebind,
>> @@ -702,7 +702,19 @@ int xe_vm_userptr_pin(struct xe_vm *vm)
>> }
>> }
>>
>> - return 0;
>> + if (err) {
>> + down_write(&vm->userptr.notifier_lock);
>
> Can you explain why you take the notifier lock here? I don't think this
> required unless I'm missing something.
For the invalidated list, the docs say:
"Removing items from the list additionally requires @lock in write mode,
and adding items to the list requires the @userptr.notifer_lock in write
mode."
Not sure if the docs needs to be updated here?
>
> Matt
>
>> + spin_lock(&vm->userptr.invalidated_lock);
>> + list_for_each_entry_safe(uvma, next, &vm->userptr.repin_list,
>> + userptr.repin_link) {
>> + list_del_init(&uvma->userptr.repin_link);
>> + list_move_tail(&uvma->userptr.invalidate_link,
>> + &vm->userptr.invalidated);
>> + }
>> + spin_unlock(&vm->userptr.invalidated_lock);
>> + up_write(&vm->userptr.notifier_lock);
>> + }
>> + return err;
>> }
>>
>> /**
>> --
>> 2.48.1
>>
More information about the Intel-xe
mailing list