[PATCH] drm/amdgpu: guard ib scheduling while in reset
Grodzovsky, Andrey
Andrey.Grodzovsky at amd.com
Wed Oct 30 14:44:00 UTC 2019
That's good as proof of the RCA, but I still think we should grab a
dedicated lock inside the scheduler. The race is internal to the scheduler
code, so it is better to handle it there and have the fix apply to all
drivers using the scheduler.
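
As a rough illustration of the idea, here is a user-space sketch (pthreads,
not kernel code) of serializing the park/unpark pairs with one dedicated
lock. Every name in it (sched_model, park_lock, sched_model_*) is
hypothetical and only stands in for the scheduler; none of it is existing
drm_sched API. The assumption is that the reset path holds the lock from
park until unpark, so the entity teardown path cannot unpark in between.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the scheduler; not the drm_gpu_scheduler layout. */
struct sched_model {
	pthread_mutex_t park_lock; /* hypothetical dedicated lock */
	bool parked;               /* models the kthread_park()/kthread_unpark() state */
};

void sched_model_init(struct sched_model *s)
{
	pthread_mutex_init(&s->park_lock, NULL);
	s->parked = false;
}

/* Models the reset path stopping the scheduler: the lock stays held until start. */
void sched_model_stop(struct sched_model *s)
{
	pthread_mutex_lock(&s->park_lock);
	s->parked = true;
}

/* Models the reset path restarting the scheduler and releasing the lock. */
void sched_model_start(struct sched_model *s)
{
	assert(s->parked);
	s->parked = false;
	pthread_mutex_unlock(&s->park_lock);
}

/*
 * Models the short park/unpark pair done on entity teardown. The trylock is
 * used only so the sketch can report contention instead of blocking; a real
 * fix would presumably just sleep on the lock. Returns false when a reset
 * currently owns the lock.
 */
bool sched_model_try_entity_fini(struct sched_model *s)
{
	if (pthread_mutex_trylock(&s->park_lock) != 0)
		return false;
	s->parked = true;
	/* ... entity cleanup would happen here ... */
	s->parked = false;
	pthread_mutex_unlock(&s->park_lock);
	return true;
}
```

With this arrangement the teardown path can never flip the state back to
unparked between sched_model_stop() and sched_model_start(), which is the
interleaving described in the quoted text below.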
Andrey
On 10/30/19 4:44 AM, S, Shirish wrote:
>>>>
>>>> We still have it, and doesn't doing kthread_park()/unpark() from
>>>> drm_sched_entity_fini while a GPU reset is in progress defeat the
>>>> whole purpose of drm_sched_stop->kthread_park? If
>>>> drm_sched_entity_fini->kthread_unpark happens AFTER
>>>> drm_sched_stop->kthread_park, nothing prevents another (third)
>>>> thread from continuing to submit jobs to the HW. Those jobs will be
>>>> picked up by the unparked scheduler thread, which will try to submit
>>>> them to the HW but fail because the HW ring is deactivated.
>>>>
>>>> If so, maybe we should serialize calls to
>>>> kthread_park/unpark(sched->thread)?
>>>>
>>>
>>> Yeah, that was my thinking as well. Probably best to just grab the
>>> reset lock before calling drm_sched_entity_fini().
>>
>>
>> Shirish - please try locking &adev->lock_reset around calls to
>> drm_sched_entity_fini as Christian suggests and see if this actually
>> helps the issue.
>>
> Yes that also works.
>
> Regards,
>