[PATCH] drm/amdkfd: svm ranges creation for unregistered memory

Felix Kuehling felix.kuehling at amd.com
Thu Apr 22 13:20:33 UTC 2021


On 2021-04-22 at 9:08 a.m., philip yang wrote:
>
>
> On 2021-04-20 9:25 p.m., Felix Kuehling wrote:
> @@ -2251,14 +2330,34 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>>>>  	}
>>>>  
>>>>  	mmap_read_lock(mm);
>>>> +retry_write_locked:
>>>>  	mutex_lock(&svms->lock);
>>>>  	prange = svm_range_from_addr(svms, addr, NULL);
>>>>  	if (!prange) {
>>>>  		pr_debug("failed to find prange svms 0x%p address [0x%llx]\n",
>>>>  			 svms, addr);
>>>> -		r = -EFAULT;
>>>> -		goto out_unlock_svms;
>>>> +		if (!write_locked) {
>>>> +			/* Need the write lock to create new range with MMU notifier.
>>>> +			 * Also flush pending deferred work to make sure the interval
>>>> +			 * tree is up to date before we add a new range
>>>> +			 */
>>>> +			mutex_unlock(&svms->lock);
>>>> +			mmap_read_unlock(mm);
>>>> +			svm_range_list_lock_and_flush_work(svms, mm);
>> I think this can deadlock with a deferred worker trying to drain
>> interrupts (Philip's patch series). If we cannot flush deferred work
>> here, we need to be more careful creating new ranges to make sure they
>> don't conflict with added deferred or child ranges.
>
> A deadlock with the deferred worker draining interrupts is impossible,
> because draining interrupts waits for restore_pages without taking any
> lock, and restore_pages flushes deferred work without taking any lock
> either.
>
The deadlock does not come from holding or waiting for locks. It comes
from the worker waiting for interrupts to drain while the interrupt
handler waits, via flush_work in svm_range_list_lock_and_flush_work,
for the worker to finish. If both are waiting for each other, neither
can make progress and you have a deadlock.

Regards,
  Felix


> Regards,
>
> Philip
>
>> Regards,
>>   Felix
>>
>>
>>>> +			write_locked = true;
>>>> +			goto retry_write_locked;
>>>> +		}
>>>> +		prange = svm_range_create_unregistered_range(adev, p, mm, addr);
>>>> +		if (!prange) {
>>>> +			pr_debug("failed to create unregistered range svms 0x%p address [0x%llx]\n",
>>>> +				 svms, addr);
>>>> +			mmap_write_downgrade(mm);
>>>> +			r = -EFAULT;
>>>> +			goto out_unlock_svms;
>>>> +		}
>>>>  	}
>>>> +	if (write_locked)
>>>> +		mmap_write_downgrade(mm);
>>>>  
>>>>  	mutex_lock(&prange->migrate_mutex);
>>>>  
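
For readers following the control flow of the hunk above, the
"look up under the read lock, retry under the write lock, then
downgrade" pattern can be sketched in plain C. This is a single-threaded
toy, not the driver code: toy_mm, toy_range_list, toy_find() and
toy_restore() are hypothetical, the lock is only a state variable, and
the lookup is an exact-address match rather than an interval tree. It
also leaves out the deferred-work flush that the review comments are
about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the mmap lock: only the mode is tracked. */
enum lock_mode { UNLOCKED, READ_LOCKED, WRITE_LOCKED };

struct toy_mm {
	enum lock_mode mode;
};

/* Hypothetical stand-in for the list of registered ranges. */
struct toy_range_list {
	unsigned long start[8];
	size_t count;
	int upgrades;	/* how often the write lock had to be taken */
};

static long toy_find(const struct toy_range_list *l, unsigned long addr)
{
	for (size_t i = 0; i < l->count; i++)
		if (l->start[i] == addr)
			return (long)i;
	return -1;
}

/* Returns the index of the range covering addr, creating it on a miss. */
static long toy_restore(struct toy_mm *mm, struct toy_range_list *l,
			unsigned long addr)
{
	bool write_locked = false;
	long idx;

	mm->mode = READ_LOCKED;
retry_write_locked:
	idx = toy_find(l, addr);
	if (idx < 0) {
		if (!write_locked) {
			/* Drop the read lock, take the write lock, look again. */
			mm->mode = UNLOCKED;
			mm->mode = WRITE_LOCKED;
			l->upgrades++;
			write_locked = true;
			goto retry_write_locked;
		}
		if (l->count == 8) {
			mm->mode = UNLOCKED;
			return -1;
		}
		/* A new range is only created while the write lock is held. */
		assert(mm->mode == WRITE_LOCKED);
		l->start[l->count] = addr;
		idx = (long)l->count++;
	}
	if (write_locked)
		mm->mode = READ_LOCKED;	/* downgrade for the rest of the work */

	/* ... the remaining fault handling would run under the read lock ... */
	mm->mode = UNLOCKED;
	return idx;
}
```

Calling toy_restore() twice with the same address takes the write lock
only on the first call; the second call finds the range under the read
lock and never upgrades.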

