[Intel-xe] [PATCH v2 1/9] drm/sched: Convert drm scheduler to use a work queue rather than kthread
Christian König
christian.koenig at amd.com
Mon Aug 21 14:07:24 UTC 2023
On 18.08.23 at 13:58, Danilo Krummrich wrote:
> [SNIP]
>> I only see two possible outcomes:
>> 1. You return an -EBUSY (or similar) error code indicating that the hw
>> can't receive more commands.
>> 2. Wait on previously pushed commands to be executed.
>> (3. Your driver crashes because you accidentally overwrite stuff in the
>> ring buffer which is still being executed. I just assume that's prevented).
>>
>> Resolution #1 with -EBUSY is actually something the UAPI should not
>> do, because your UAPI then depends on the specific timing of
>> submissions which is a really bad idea.
>>
>> Resolution #2 is usually bad because it forces the hw to run dry
>> between submissions and so degrades performance.
>
> I agree, that is a good reason for at least limiting the maximum job
> size to half of the ring size.
>
> However, there could still be cases where two subsequent jobs are
> submitted with just a single IB, which as is would still block
> subsequent jobs from being pushed to the ring although there is still
> plenty of space. Depending on the (CPU) scheduler latency, such a case
> can let the HW run dry as well.
Yeah, that was intentionally not done as well. The crux here is that the
more you push to the hw the worse the scheduling granularity becomes.
It's just that neither Xe nor Nouveau relies that much on the scheduling
granularity at all (because of hw queues).
But Xe doesn't seem to need that feature and I would still try to avoid
it because the more you have pushed to the hw the harder it is to get
going again after a reset.
>
> Surely, we could just continue to decrease the maximum job size even
> further, but this would result in further overhead in userspace and the
> kernel for larger IB counts. Tracking the actual job size seems to be the
> better solution for drivers where the job size can vary over a rather
> huge range.
I strongly disagree on that. A larger ring buffer is trivial to allocate,
and if userspace submissions are so small that the scheduler can't keep
up submitting them, then your ring buffer size is your smallest problem.
In other words, the submission overhead will completely kill your
performance and you should probably consider stuffing more into a single
submission.
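
To make the two admission policies discussed in this thread concrete, here is a minimal toy model in Python. It is only a sketch and not drm/sched code: the function names, the 64-slot ring, and the job sizes are all hypothetical. It compares a fixed limit on jobs in flight (each job assumed to be worst-case size) with tracking the actual size of each job.

```python
# Toy model only; names and numbers are hypothetical, not a kernel API.

def pushed_by_count(job_sizes, max_jobs):
    """Admit jobs in order until max_jobs are in flight.

    Models a fixed job-count limit: every job is treated as if it had the
    worst-case size, so its real size is ignored.
    """
    in_flight = []
    for size in job_sizes:
        if len(in_flight) >= max_jobs:
            break  # the limit is hit even if the ring is nearly empty
        in_flight.append(size)
    return in_flight


def pushed_by_size(job_sizes, ring_size):
    """Admit jobs in order while their actual sizes still fit in the ring."""
    used = 0
    in_flight = []
    for size in job_sizes:
        if used + size > ring_size:
            break  # the next job would overwrite slots still in use
        used += size
        in_flight.append(size)
    return in_flight


if __name__ == "__main__":
    ring_size = 64               # hypothetical ring with 64 slots
    max_jobs = 2                 # max job size = half the ring -> 2 jobs
    jobs = [1, 1, 1, 1, 30]      # four single-IB jobs, then a large one

    by_count = pushed_by_count(jobs, max_jobs)
    by_size = pushed_by_size(jobs, ring_size)
    print(len(by_count), "jobs pushed,", sum(by_count), "slots used")
    print(len(by_size), "jobs pushed,", sum(by_size), "slots used")
```

With these made-up numbers the count-based limit stops after two single-IB jobs while 62 slots are still free, whereas size tracking pushes all five jobs. The model deliberately ignores the costs raised above: more work queued on the hw means coarser scheduling granularity and a harder recovery after a reset.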
Regards,
Christian.
>
> - Danilo