[PATCH v4 1/5] drm: Move some options to separate new Kconfig

Tvrtko Ursulin tvrtko.ursulin at igalia.com
Mon Mar 10 11:54:23 UTC 2025


On 10/03/2025 11:11, Philipp Stanner wrote:
> On Mon, 2025-03-10 at 09:55 +0000, Tvrtko Ursulin wrote:
>>
>> On 07/03/2025 18:06, Philipp Stanner wrote:
>>> On Fri, 2025-03-07 at 16:59 +0000, Tvrtko Ursulin wrote:
>>>>
>>>> On 07/03/2025 13:41, Philipp Stanner wrote:
>>>>> Hi,
>>>>>
>>>>> You forgot to put folks in CC as recipients for the cover letter
>>>>> :(
>>>>>
>>>>>
>>>>> On Thu, 2025-03-06 at 17:05 +0000, Tvrtko Ursulin wrote:
>>>>>> Move some options out into a new debug specific kconfig file
>>>>>> in order to make things a bit cleaner.
>>>>>>
>>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at igalia.com>
>>>>>> Cc: Christian König <christian.koenig at amd.com>
>>>>>> Cc: Danilo Krummrich <dakr at kernel.org>
>>>>>> Cc: Matthew Brost <matthew.brost at intel.com>
>>>>>> Cc: Philipp Stanner <phasta at kernel.org>
>>>>>
>>>>> We all have our individual work flows, so don't take this as
>>>>> lecturing
>>>>> or anything – I just suspect that I was forgotten in the cover
>>>>> letter
>>>>> because you Cc people by hand in the individual patches.
>>>>>
>>>>> What I do is run get_maintainer and then put the individuals
>>>>> listed there into the --to= field. That sends the entire series
>>>>> to all of them.
>>>>>
>>>>> Only sometimes, when there's a huge list of recipients or when
>>>>> the patches of a series are very independent, do I deviate from
>>>>> that rule.
>>>>>
>>>>> JFYI
>>>>
>>>> Note that it was there in v3; I just omitted to paste it this time.
>>>>
>>>>> Anyway, we have a bigger problem with the entire series. I now
>>>>> tested again with the same setup as yesterday and the faults are
>>>>> indeed gone, so that's good.
>>>>>
>>>>> But to be sure I then ran kmemleak and got a list of leaks that
>>>>> is more than 2000 lines long.
>>>>
>>>> There is this comment for drm_sched_fini which ends with:
>>>>
>>>> """
>>>> ...
>>>>  * This stops submission of new jobs to the hardware through
>>>>  * drm_sched_backend_ops.run_job(). Consequently,
>>>>  * drm_sched_backend_ops.free_job() will not be called for all jobs
>>>>  * still in drm_gpu_scheduler.pending_list. There is no solution for
>>>>  * this currently. Thus, it is up to the driver to make sure that:
>>>>  *
>>>>  *  a) drm_sched_fini() is only called after for all submitted jobs
>>>>  *     drm_sched_backend_ops.free_job() has been called or that
>>>>  *  b) the jobs for which drm_sched_backend_ops.free_job() has not
>>>>  *     been called after drm_sched_fini() ran are freed manually.
>>>>  *
>>>>  * FIXME: Take care of the above problem and prevent this function
>>>>  * from leaking the jobs in drm_gpu_scheduler.pending_list under any
>>>>  * circumstances.
>>>> """
>>>>
>>>> I got bitten by that. Keep forgetting how fragile the thing is.. :(
>>>
>>> argh damn, those are *all* from the pending_list?!
>>
>> Right, all leaks I saw were from the drm_sched_basic_entity_cleanup
>> test. All other tests actually wait for jobs to finish so can't hit
>> that.
>>
>> Fix was simply to add a drm_sched_job_cleanup() call when unwinding
>> unfinished mock scheduler jobs from drm_mock_sched_fini(), which
>> happens before calling drm_sched_fini().
>>
>> That's pretty much how things are expected to be handled AFAIU.
>>
>>> OK. Well.
>>>
>>> Now we've got a philosophical problem:
>>>
>>> We still have to fix those leaks (I'm still working on it, but my
>>> current attempt has failed and I'll probably fall back to a refcount
>>> solution).
>>
>> You propose to move the responsibility of cleaning up in-flight jobs
>> to the scheduler core?
> 
> The scheduler core is already and has always been responsible for
> cleaning up "in-flight jobs". It does so through
> backend_ops.free_job(). And we prevent it from cleaning up all jobs by
> cancelling the work items in drm_sched_fini().
> 
> Semantically, the scheduler is the one in charge of the job life times.
> 
> As of right now, every single driver is effectively forced to implement
> the same logic, but they have implemented it in different ways (Xe
> refcounts the scheduler and only calls drm_sched_fini() once refcnt ==
> 0, Nouveau maintains a copy of the pending_list and waits for it to
> become empty before calling drm_sched_fini()).

Right. And to change it means making ->free_job() for all drivers handle 
different potential job states, while today it only needs to handle 
finished jobs. Or adding a new vfunc. Or something. It sounds doable but 
a lot of work, not least because there are a lot of drivers.

>>> And to see whether the fix actually fixes the leaks, directly using
>>> the kunit tests would be handy.
>>>
>>> After all, this is what the kunit tests are there for: show what is
>>> broken within the scheduler. And those leaks definitely qualify. Or
>>> should kunit tests follow the same rules we demand from drivers?
>>>
>>> I'd like to hear more opinions about that.
>>>
>>> @Danilo, @Dave, @Sima
>>> would it be OK if we add kunit tests for the scheduler to DRM that
>>> cause leaks until we can fix them?
>>
>> It is indeed a bit philosophical. I'd say only if there is 100%
>> agreement that drm_sched_fini should be able to clean up, or drive
>> cleaning up, all driver state. And if we are prepared to handle a
>> permanently failing test from now until some future date when this
>> would be implemented.
>>
>> I have a similar conundrum with set priority, where I was
>> contemplating adding a permanently failing test showing how that does
>> not fully work, which would then be improved with my deadline
>> scheduling series.
>>
>> On the other side of the argument is the past experience of CI
>> systems generally not coping well with permanently failing tests.
>> Eventually they succumb to the pressure to remove them due to noisy
>> results. Therefore the other option is to have the mock scheduler
>> adhere to the current implementation and only change it once the DRM
>> scheduler rules change.
> 
> Can you think of a way, like flags or kconfig options, with which
> developers such as you and I could "switch the bugs on" for working on
> those issues?

We could do that easily I think. Something like:

config DRM_SCHED_KUNIT_TEST_ASPIRATIONAL
	bool "Turn on the aspirational mode for DRM scheduler unit tests" if !KUNIT_ALL_TESTS
	select DRM_SCHED
	depends on DRM && KUNIT && DRM_SCHED_KUNIT_TEST
	default KUNIT_ALL_TESTS
	help
	  Choose this option to make the DRM scheduler unit tests test for
	  behaviour which was agreed as a design goal, even if the current
	  implementation can make specific tests fail.

	  Recommended for driver developers only.

	  If in doubt, say "N".

I can skip the job cleanup based on it, and also add some validation 
that the pending list is empty after drm_sched_fini() when it is on.

Regards,

Tvrtko



More information about the dri-devel mailing list