[PATCH 12/13] drm/scheduler: rework entity flush, kill and fini

Dmitry Osipenko dmitry.osipenko at collabora.com
Thu Nov 17 12:59:10 UTC 2022


On 11/17/22 15:55, Christian König wrote:
> Am 17.11.22 um 13:47 schrieb Dmitry Osipenko:
>> On 11/17/22 12:53, Christian König wrote:
>>> Am 17.11.22 um 03:36 schrieb Dmitry Osipenko:
>>>> Hi,
>>>>
>>>> On 10/14/22 11:46, Christian König wrote:
>>>>> +/* Remove the entity from the scheduler and kill all pending jobs */
>>>>> +static void drm_sched_entity_kill(struct drm_sched_entity *entity)
>>>>> +{
>>>>> +    struct drm_sched_job *job;
>>>>> +    struct dma_fence *prev;
>>>>> +
>>>>> +    if (!entity->rq)
>>>>> +        return;
>>>>> +
>>>>> +    spin_lock(&entity->rq_lock);
>>>>> +    entity->stopped = true;
>>>>> +    drm_sched_rq_remove_entity(entity->rq, entity);
>>>>> +    spin_unlock(&entity->rq_lock);
>>>>> +
>>>>> +    /* Make sure this entity is not used by the scheduler at the moment */
>>>>> +    wait_for_completion(&entity->entity_idle);
>>>> I'm always hitting a lockup here using the Panfrost driver when
>>>> terminating Xorg. Reverting this patch helps. Any ideas how to fix it?
>>>>
>>> Well is the entity idle or are there some unsubmitted jobs left?
>> Do you mean unsubmitted to h/w? IIUC, there are unsubmitted jobs left.
>>
>> I see that there are 5-6 incomplete (in-flight) jobs when
>> panfrost_job_close() is invoked.
>>
>> There are 1-2 jobs that are constantly scheduled and finished once in a
>> few seconds after the lockup happens.
> 
> Well what drm_sched_entity_kill() is supposed to do is to prevent
> pushing queued up stuff to the hw when the process which queued it is
> killed. Is the process really killed or is that just some incorrect
> handling?

It's actually 5-6 incomplete Xorg jobs that hang when the Xorg
process is closed.

The two re-scheduled jobs are from sddm, so it's only the Xorg context
that hangs.

> In other words I see two possibilities here, either we have a bug in the
> scheduler or panfrost isn't using it correctly.
> 
> Does panfrost call drm_sched_entity_flush() before it calls
> drm_sched_entity_fini()? (I don't have the driver source at hand at the
> moment).

Panfrost doesn't use drm_sched_entity_flush(), nor drm_sched_entity_fini().
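
To make the ordering question concrete, here is a toy userspace model of the
teardown sequence being discussed: flush first, then fini, assuming fini is
what performs the kill. It is only a sketch. struct toy_entity and the
toy_entity_*() helpers are hypothetical stand-ins, not the scheduler's real
structures or functions, and the idle flag merely models the entity_idle
completion that drm_sched_entity_kill() waits on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct drm_sched_entity; NOT the kernel type. */
struct toy_entity {
    int queued;    /* jobs queued in the entity, not yet pushed to the hw */
    int in_flight; /* jobs already handed to the hw */
    bool stopped;  /* entity no longer accepts new jobs */
    bool idle;     /* models the entity_idle completion */
};

/* Models the flush step: stop accepting work, report what is still queued. */
static int toy_entity_flush(struct toy_entity *e)
{
    e->stopped = true;
    return e->queued;
}

/*
 * Models the kill step: drop queued jobs so they never reach the hw.
 * Jobs that are already in flight are left alone.
 */
static void toy_entity_kill(struct toy_entity *e)
{
    e->stopped = true;
    e->queued = 0;
}

/*
 * Models fini. Returns false in the case where real code would block in
 * wait_for_completion() because the entity never becomes idle.
 */
static bool toy_entity_fini(struct toy_entity *e)
{
    toy_entity_kill(e);
    assert(e->stopped);
    return e->idle;
}

/* The order asked about above: flush first, then fini. */
static bool toy_entity_destroy(struct toy_entity *e)
{
    toy_entity_flush(e);
    return toy_entity_fini(e);
}
```

In this model, toy_entity_destroy() only succeeds when the idle flag is set;
an entity that never goes idle corresponds to the lockup in the quoted
wait_for_completion() call.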

-- 
Best regards,
Dmitry


