[PATCH] drm/scheduler: Fix drm_sched_entity_set_priority()
Christian König
ckoenig.leichtzumerken at gmail.com
Wed Jul 24 11:16:11 UTC 2024
On 24.07.24 10:16, Tvrtko Ursulin wrote:
> [SNIP]
>> Absolutely.
>
> Absolutely good and absolutely me, or absolutely you? :)
You, I don't even have time to finish all the stuff I already started :/
>
> These are the TODO points and their opens:
>
> - Adjust amdgpu_ctx_set_entity_priority() to call
> drm_sched_entity_modify_sched() regardless of the hw_type, to fix
> priority changes on single-sched engines other than gfx or compute.
Either that, or stop using the scheduler priority to implement
userspace priorities and always use different HW queues for that instead.
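
As a rough sketch of that TODO item (illustrative only, not the real
amdgpu_ctx_set_entity_priority(); the parameter list, the hw_prio value
and the gpu_sched[] lookup are assumptions here), the idea is simply to
stop gating the drm_sched_entity_modify_sched() call on gfx/compute:

static void example_set_entity_priority(struct amdgpu_device *adev,
                                        struct drm_sched_entity *entity,
                                        int hw_ip,
                                        enum drm_sched_priority drm_prio,
                                        unsigned int hw_prio)
{
        struct drm_gpu_scheduler **scheds;
        unsigned int num_scheds;

        /* software priority, only applied once the entity re-picks a rq */
        drm_sched_entity_set_priority(entity, drm_prio);

        /* re-pick the scheduler list for *every* hw_ip type, so that
         * single-scheduler engines also see the priority change */
        scheds = adev->gpu_sched[hw_ip][hw_prio].sched;
        num_scheds = adev->gpu_sched[hw_ip][hw_prio].num_scheds;
        drm_sched_entity_modify_sched(entity, scheds, num_scheds);
}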
>
> - Document sched_list array lifetime must align with the entity and
> adjust the callers.
>
> Open:
>
> Do you still oppose keeping sched_list for num_scheds == 1?
Not if you can fix up all callers.
> If so, do you propose drm_sched_entity_modify_sched() keeps disagreeing
> with drm_sched_entity_init() on this detail? And keep the "one shot
> single sched_list" quirk in? Why is that nicer than simply keeping the
> list and removing that quirk? Once the lifetime rules are clear it IMO
> is okay to always keep the list.
Yeah, if every caller of drm_sched_entity_init() can be fixed up I'm fine
with that as well.
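
To illustrate the lifetime rule in question, a minimal sketch (the
example_ctx structure is made up; only drm_sched_entity_init() is real):
the sched_list handed to the entity has to live at least as long as the
entity itself, e.g. by embedding it next to the entity instead of
building it on the stack:

struct example_ctx {
        struct drm_gpu_scheduler *sched_list[1]; /* must outlive entity */
        struct drm_sched_entity entity;
};

static int example_ctx_init(struct example_ctx *ctx,
                            struct drm_gpu_scheduler *sched)
{
        ctx->sched_list[0] = sched;

        /* Without the "one shot single sched_list" quirk the entity keeps
         * referencing ctx->sched_list even for num_scheds == 1, so a
         * stack-allocated array would no longer be acceptable here. */
        return drm_sched_entity_init(&ctx->entity, DRM_SCHED_PRIORITY_NORMAL,
                                     ctx->sched_list, 1, NULL);
}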
>
> - Remove drm_sched_entity_set_priority().
>
> Open:
>
> Should we at this point also modify amdgpu_device_init_schedulers() to
> stop initialising schedulers with DRM_SCHED_PRIORITY_COUNT run queues?
One step at a time.
>
> Once drm_sched_entity_set_priority() is gone there is little point.
> Unless there are some non-obvious games with the kernel priority or
> something.
>
>>>>>> In general, scheduler priorities were meant to be used for things
>>>>>> like kernel queues which would always have higher priority than
>>>>>> user space submissions; using them for userspace turned out to be
>>>>>> not such a good idea.
>>>>>
>>>>> Out of curiosity, what were the problems? I cannot think of
>>>>> anything fundamental, since there are priorities at the backend
>>>>> level after all, just fewer levels.
>>>>
>>>> A higher level queue can starve lower level queues to the point
>>>> that they never get a chance to get anything running.
>>>
>>> Oh that.. well I call that an implementation detail. :) Because
>>> nowhere in the uapi is it, I think, guaranteed that execution
>>> ordering needs to be strictly in descending priority. This
>>> potentially goes back to what you said above, that a potential
>>> larger rewrite might be beneficial. Implementing some smarter
>>> scheduling. Although the issue will be that it is just frontend
>>> scheduling after all, so it is a bit questionable to invest in
>>> making it too smart.
>>
>> +1 and we have a bug report complaining that RR is in at least one
>> situation better than FIFO. So it is actually quite hard to remove.
>>
>> On the other hand FIFO has some really nice properties as well. E.g.
>> try to run >100 glxgears instances (on weaker hw) and just visually
>> compare the result differences between RR and FIFO. FIFO is baby
>> smooth and RR is basically stuttering all the time.
>>
>>> I think this goes more back to what I was suggesting during early xe
>>> days, that potentially the drm scheduler should be split into a
>>> dependency handling part and the scheduler part. Drivers with 1:1
>>> entity:scheduler and full hardware/firmware scheduling do not really
>>> need either FIFO or RR.
>>
>> Yeah that's my thinking as well and I also suggested that multiple
>> times in discussions with Sima and others.
>>
>>>
>>>> This basically means that userspace gets a chance to submit
>>>> infinite fences, with all the bad consequences.
>>>>
>>>>>
>>>>> I mean one problem unrelated to this discussion is this:
>>>>>
>>>>> void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>>>>> {
>>>>>         struct dma_fence *fence;
>>>>>         struct drm_gpu_scheduler *sched;
>>>>>         struct drm_sched_rq *rq;
>>>>>
>>>>>         /* queue non-empty, stay on the same engine */
>>>>>         if (spsc_queue_count(&entity->job_queue))
>>>>>                 return;
>>>>>
>>>>> Which makes it look like an entity with a constant trickle of
>>>>> jobs will never actually manage to change its run queue. Neither
>>>>> today, nor after the potential drm_sched_entity_set_priority()
>>>>> removal.
>>>>
>>>> That's intentional and based on how the scheduler load balancing
>>>> works.
>>>
>>> I see that it is intentional but if it can silently prevent priority
>>> changes (even hw priority) it is not very solid. Unless I am missing
>>> something here.
>>
>> IIRC the GSoC student who implemented this (with me as mentor)
>> actually documented that behavior somewhere.
>>
>> And to be honest it kind of makes sense, because switching priorities
>> would otherwise be disruptive, e.g. you have a moment where you need
>> to drain all previous submissions with the old priority before you
>> can do new ones with the new priority.
>
> Hmmm, I don't see how it makes sense. Perhaps a test case for
> AMDGPU_SCHED_OP_*_PRIORITY_OVERRIDE is missing to show how it doesn't
> work. Or at least how easily it can be defeated with the callers none
> the wiser.
Ok, that needs a bit longer explanation. You don't by any chance have Teams?
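
The short version, simplified from memory and not a verbatim copy of
drivers/gpu/drm/scheduler/sched_entity.c: drm_sched_entity_set_priority()
only stores the new value, which is only looked at again when
drm_sched_entity_select_rq() re-selects the run queue, and with a
constant trickle of jobs the early return you quoted means it never gets
that far:

void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
                                   enum drm_sched_priority priority)
{
        spin_lock(&entity->rq_lock);
        entity->priority = priority;    /* only stored, nothing re-queued */
        spin_unlock(&entity->rq_lock);
}

void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
{
        struct drm_gpu_scheduler *sched;
        struct drm_sched_rq *rq;

        /* queue non-empty, stay on the same engine: with a constant
         * trickle of jobs this early return is taken every time... */
        if (spsc_queue_count(&entity->job_queue))
                return;

        /* ...so we never get here, where the stored priority would be
         * applied by re-selecting the run queue (locking and indexing
         * details omitted/vary between kernel versions): */
        sched = drm_sched_pick_best(entity->sched_list,
                                    entity->num_sched_list);
        rq = sched ? sched->sched_rq[entity->priority] : NULL;
        if (rq != entity->rq)
                entity->rq = rq;
}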
>
> For context I am kind of interested because I wired up amdgpu to the
> DRM cgroup controller and use priority override to de-prioritize
> certain cgroups and it kind of works. But again, it will not be great
> if a client with a constant trickle of submissions can just defeat it.
Yeah, exactly that use case is currently not possible :(
Regards,
Christian.
>
> Regards,
>
> Tvrtko