[PATCH 12/13] drm/scheduler: rework entity flush, kill and fini
Dmitry Osipenko
dmitry.osipenko at collabora.com
Thu Nov 17 13:00:02 UTC 2022
On 11/17/22 15:59, Dmitry Osipenko wrote:
> On 11/17/22 15:55, Christian König wrote:
>> Am 17.11.22 um 13:47 schrieb Dmitry Osipenko:
>>> On 11/17/22 12:53, Christian König wrote:
>>>> Am 17.11.22 um 03:36 schrieb Dmitry Osipenko:
>>>>> Hi,
>>>>>
>>>>> On 10/14/22 11:46, Christian König wrote:
>>>>>> +/* Remove the entity from the scheduler and kill all pending jobs */
>>>>>> +static void drm_sched_entity_kill(struct drm_sched_entity *entity)
>>>>>> +{
>>>>>> + struct drm_sched_job *job;
>>>>>> + struct dma_fence *prev;
>>>>>> +
>>>>>> + if (!entity->rq)
>>>>>> + return;
>>>>>> +
>>>>>> + spin_lock(&entity->rq_lock);
>>>>>> + entity->stopped = true;
>>>>>> + drm_sched_rq_remove_entity(entity->rq, entity);
>>>>>> + spin_unlock(&entity->rq_lock);
>>>>>> +
>>>>>> + /* Make sure this entity is not used by the scheduler at the moment */
>>>>>> + wait_for_completion(&entity->entity_idle);
>>>>> I'm always hitting a lockup here using the Panfrost driver when
>>>>> terminating Xorg. Reverting this patch helps. Any ideas how to fix it?
>>>>>
>>>> Well is the entity idle or are there some unsubmitted jobs left?
>>> Do you mean unsubmitted to h/w? IIUC, there are unsubmitted jobs left.
>>>
>>> I see that there are 5-6 incomplete (in-flight) jobs when
>>> panfrost_job_close() is invoked.
>>>
>>> There are 1-2 jobs that are constantly scheduled and finished once in a
>>> few seconds after the lockup happens.
>>
>> Well what drm_sched_entity_kill() is supposed to do is to prevent
>> pushing queued up stuff to the hw when the process which queued it is
>> killed. Is the process really killed or is that just some incorrect
>> handling?
>
> It's actually 5-6 incomplete jobs of Xorg that are hanging when Xorg
> process is closed.
>
> The two re-scheduled jobs are from sddm, so it's only the Xorg context
> that hangs.
>
>> In other words I see two possibilities here, either we have a bug in the
>> scheduler or panfrost isn't using it correctly.
>>
>> Does panfrost call drm_sched_entity_flush() before it calls
>> drm_sched_entity_fini()? (I don't have the driver source at hand at the
>> moment).
>
> Panfrost doesn't use drm_sched_entity_flush(), nor drm_sched_entity_flush().
*nor drm_sched_entity_fini()
--
Best regards,
Dmitry