[PATCH v5 5/5] drm/amdgpu: move PD/PT bos on LRU again
Michel Dänzer
michel at daenzer.net
Wed Aug 29 09:00:41 UTC 2018
On 2018-08-29 10:57 a.m., Christian König wrote:
> On 29.08.2018 at 09:52, Michel Dänzer wrote:
>> On 2018-08-28 7:03 p.m., Michel Dänzer wrote:
>>> On 2018-08-28 11:14 a.m., Michel Dänzer wrote:
>>>> On 2018-08-22 9:52 a.m., Huang Rui wrote:
>>>>> The new bulk moving functionality is ready, the overhead of moving
>>>>> PD/PT bos to LRU is fixed. So move them on LRU again.
>>>>>
>>>>> Signed-off-by: Huang Rui <ray.huang at amd.com>
>>>>> Tested-by: Mike Lothian <mike at fireburn.co.uk>
>>>>> Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
>>>>> Acked-by: Chunming Zhou <david1.zhou at amd.com>
>>>>> Reviewed-by: Junwei Zhang <Jerry.Zhang at amd.com>
>>>>> ---
>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> index db1f28a..d195a3d 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>> @@ -1107,7 +1107,7 @@ int amdgpu_vm_update_directories(struct amdgpu_device *adev,
>>>>> struct amdgpu_vm_bo_base,
>>>>> vm_status);
>>>>> bo_base->moved = false;
>>>>> - list_del_init(&bo_base->vm_status);
>>>>> + list_move(&bo_base->vm_status, &vm->idle);
>>>>> bo = bo_base->bo->parent;
>>>>> if (!bo)
>>>>>
>>>> Since this change, I'm getting various badness when running piglit
>>>> using radeonsi on Bonaire, see the attached dmesg excerpt.
>>>>
>>>> Reverting just this change on top of current amd-staging-drm-next
>>>> avoids the problem.
>>>>
>>>> Looks like some list manipulation isn't sufficiently protected against
>>>> concurrent execution?
>>> KASAN pointed me to one issue:
>>> https://patchwork.freedesktop.org/patch/246212/
>>>
>>> However, this doesn't fully fix the problem.
>> Ray, any ideas yet for solving this? If not, let's revert this change
>> for now.
>
> I've gone over this multiple times now as well, but can't find anything
> obviously wrong either.
Thanks for looking into it.
> If we don't have any more ideas I would say revert it for now and try to
> debug it further.
Yep.
> BTW: Any idea how to force the issue?
Not specifically. It happens reliably and pretty quickly for me when
running the piglit gpu profile.
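
For reference, the one-line change under discussion replaces
list_del_init() with list_move(). The following is a minimal user-space
sketch of that difference. It assumes simplified re-implementations of
the kernel list helpers (not the real <linux/list.h>), and the list name
"pending" and the function vm_status_demo() are hypothetical, chosen only
for illustration. With list_del_init() the entry ends up on no list; with
list_move() it stays linked and reachable through the idle list, so
anything else that walks or edits that list has to be serialized against
it. Whether that is what goes wrong here has not been established.

```c
#include <assert.h>

/* Simplified stand-in for the kernel's struct list_head (assumption). */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry and leave it as an empty, self-linked list. */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);
}

/* Unlink entry from its current list and re-link it after head. */
static void list_move(struct list_head *entry, struct list_head *head)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	list_add(entry, head);
}

/*
 * Hypothetical demo: returns 1 if the entry is reachable from the idle
 * list afterwards, 0 if it is on no list at all.
 */
int vm_status_demo(int use_move)
{
	struct list_head pending, idle, entry;

	INIT_LIST_HEAD(&pending);
	INIT_LIST_HEAD(&idle);
	INIT_LIST_HEAD(&entry);
	list_add(&entry, &pending);

	if (use_move)
		list_move(&entry, &idle);	/* behaviour after the patch */
	else
		list_del_init(&entry);		/* behaviour before the patch */

	return !list_empty(&idle) && idle.next == &entry;
}
```

Calling vm_status_demo(0) models the old code path and vm_status_demo(1)
the patched one; in both cases the entry leaves the "pending" list, but
only in the second does it remain linked into a shared list.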
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer