[PATCH] drm/amdkfd: Fix a race condition of vram buffer unref in svm code

Chen, Xiaogang xiaogang.chen at amd.com
Wed Sep 27 15:10:04 UTC 2023


On 9/27/2023 9:19 AM, Eric Huang wrote:
>
> On 2023-09-26 23:00, Xiaogang.Chen wrote:
>> From: Xiaogang Chen <xiaogang.chen at amd.com>
>>
>> prange->svm_bo unref can happen in both the MMU callback and a callback
>> after migration to system RAM. Both are async calls in different tasks.
>> Serialize the svm_bo unref operation to avoid random "use-after-free".
>>
>> Signed-off-by: Xiaogang.Chen <Xiaogang.Chen at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> index 70aa882636ab..8e246e848018 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> @@ -637,6 +637,15 @@ void svm_range_vram_node_free(struct svm_range *prange)
>>   {
>>       svm_range_bo_unref(prange->svm_bo);
>>       prange->ttm_res = NULL;
> Should the two lines above have been removed?

You are right. They were left in by a copy-paste mistake while editing. I
tested the patch without these two lines, and I will remove them when I
send it to Gerrit.

Thanks

Xiaogang

>
> Regards,
> Eric
>> +     /* serialize prange->svm_bo unref */
>> +     mutex_lock(&prange->lock);
>> +     /* prange->svm_bo has not been unreferenced yet */
>> +     if (prange->ttm_res) {
>> +             prange->ttm_res = NULL;
>> +             mutex_unlock(&prange->lock);
>> +             svm_range_bo_unref(prange->svm_bo);
>> +     } else
>> +             mutex_unlock(&prange->lock);
>>   }
>>
>>   struct kfd_node *
>
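
For illustration, the serialization pattern in the patch can be sketched in
userspace C with pthreads. This is only a minimal sketch, not the driver
code: struct fake_range, struct fake_bo, fake_bo_unref(),
fake_vram_node_free() and run_unref_race_demo() are hypothetical stand-ins
for struct svm_range, the svm_bo object, svm_range_bo_unref() and
svm_range_vram_node_free(), and a pthread mutex stands in for prange->lock.
It uses a single unlock path instead of the if/else in the patch, which
should behave the same way.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the buffer object behind prange->svm_bo. */
struct fake_bo {
	atomic_int unref_calls;	/* how many times unref was called */
};

/* Hypothetical stand-in for struct svm_range. */
struct fake_range {
	pthread_mutex_t lock;	/* plays the role of prange->lock */
	void *ttm_res;		/* non-NULL while the reference is held */
	struct fake_bo *bo;
};

static void fake_bo_unref(struct fake_bo *bo)
{
	atomic_fetch_add(&bo->unref_calls, 1);
}

/*
 * Only the caller that observes ttm_res != NULL under the lock drops the
 * reference. The unref itself runs after the lock is released.
 */
static void fake_vram_node_free(struct fake_range *range)
{
	int do_unref = 0;

	pthread_mutex_lock(&range->lock);
	if (range->ttm_res) {
		range->ttm_res = NULL;
		do_unref = 1;
	}
	pthread_mutex_unlock(&range->lock);

	if (do_unref)
		fake_bo_unref(range->bo);
}

static void *free_thread(void *arg)
{
	fake_vram_node_free(arg);
	return NULL;
}

/* Two tasks race to free the same range; returns the number of unrefs. */
static int run_unref_race_demo(void)
{
	struct fake_bo bo;
	struct fake_range range;
	pthread_t tasks[2];
	int dummy_res = 0;
	int i, rc;

	atomic_init(&bo.unref_calls, 0);
	pthread_mutex_init(&range.lock, NULL);
	range.ttm_res = &dummy_res;
	range.bo = &bo;

	for (i = 0; i < 2; i++) {
		rc = pthread_create(&tasks[i], NULL, free_thread, &range);
		assert(rc == 0);
		(void)rc;
	}
	for (i = 0; i < 2; i++)
		pthread_join(tasks[i], NULL);

	pthread_mutex_destroy(&range.lock);
	return atomic_load(&bo.unref_calls);
}
```

Calling run_unref_race_demo() starts two threads that both try to free the
same range. Whichever thread takes the lock first clears ttm_res and drops
the reference; the other one sees NULL and does nothing.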

