[PATCH 10/11] drm/scheduler: Don't store self-dependencies

Daniel Vetter daniel at ffwll.ch
Thu Jun 24 17:43:59 UTC 2021


On Thu, Jun 24, 2021 at 7:38 PM Christian König
<christian.koenig at amd.com> wrote:
>
> Am 24.06.21 um 19:29 schrieb Daniel Vetter:
> > On Thu, Jun 24, 2021 at 07:03:10PM +0200, Christian König wrote:
> >> Am 24.06.21 um 16:00 schrieb Daniel Vetter:
> >>> This is essentially part of drm_sched_dependency_optimized(), which
> >>> only amdgpu seems to make use of. Use it a bit more.
> >>>
> >>> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> >>> Cc: "Christian König" <christian.koenig at amd.com>
> >>> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> >>> Cc: Luben Tuikov <luben.tuikov at amd.com>
> >>> Cc: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
> >>> Cc: Alex Deucher <alexander.deucher at amd.com>
> >>> Cc: Jack Zhang <Jack.Zhang1 at amd.com>
> >>> ---
> >>>    drivers/gpu/drm/scheduler/sched_main.c | 7 +++++++
> >>>    1 file changed, 7 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>> index 370c336d383f..c31d7cf7df74 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>> @@ -649,6 +649,13 @@ int drm_sched_job_await_fence(struct drm_sched_job *job,
> >>>     if (!fence)
> >>>             return 0;
> >>> +   /* if it's a fence from us it's guaranteed to be earlier */
> >>> +   if (fence->context == job->entity->fence_context ||
> >>> +       fence->context == job->entity->fence_context + 1) {
> >>> +           dma_fence_put(fence);
> >>> +           return 0;
> >>> +   }
> >>> +
> >> Well NAK. That would break Vulkan.

I'm assuming your reply means the NAK is retracted and was just the
usual "this doesn't perfectly fit for amdgpu" reflex?

> >> The problem is that Vulkan can insert dependencies between jobs which run on
> >> the same queue.
> >>
> >> So we need to track those as well and if the previous job for the same
> >> queue/scheduler is not yet finished a pipeline synchronization needs to be
> >> inserted.
> >>
> >> That's one of the reasons we haven't been able to unify the dependency
> >> handling yet.
> > That sounds like an extremely amdgpu-specific constraint?
>
> Yeah, that's totally hardware specific.
>
> It's just that I don't know how else we could track that without having
> the same separation as in amdgpu between implicit and explicit fences.
> And as far as I understand it that's exactly what you want to avoid.
>
> As I said this turned out to be really awkward.
>
> > You're also the
> > only one who keeps track of whether the previous job we've scheduled has
> > finished already (I guess they can get pipelined and you don't flush by
> > default), so you insert fences.
>
> Yes, exactly that.
>
> > I guess we can add an await_fence_no_dedup or so for amdgpu, but I'm not
> > sure why we have to inflict this design constraint on all other drivers?
> > At least I'm not seeing anything in lima, panfrost, v3d or etnaviv that
> > would break with this, and i915 will also be perfectly fine.
> >
> > Also note: I'm not using this for amdgpu, exactly because there are a few
> > funny things going on.
>
> Yeah, exactly the reason why we never unified this.

Yeah, there are clear limits to this, because you also can't use the
await_implicit helper: you have to keep filtering by owner, or the
current amdgpu uapi goes horribly slow. I think the benefit would just
be that we could share the data structure and the book-keeping, but
aside from that you'd need your own integration in amdgpu.
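Totally untested napkin sketch of what I mean, just to illustrate - the
owner filtering stays in amdgpu (amdgpu_fence_skip_owner() below is a
made-up placeholder for whatever check amdgpu does today), only the
surviving fences land in the shared scheduler book-keeping:

#include <drm/gpu_scheduler.h>
#include <linux/dma-fence.h>

static int amdgpu_job_await_fence_filtered(struct drm_sched_job *job,
					   struct dma_fence *fence,
					   void *owner)
{
	if (!fence)
		return 0;

	/* the expensive uapi-specific filtering stays in amdgpu ... */
	if (amdgpu_fence_skip_owner(fence, owner)) {
		dma_fence_put(fence);
		return 0;
	}

	/* ... but the dependency storage/book-keeping is shared; note that
	 * drm_sched_job_await_fence() consumes the fence reference */
	return drm_sched_job_await_fence(job, fence);
}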

One idea I just had was whether we could use the tag bits xarray has
for the amdgpu purposes. Like we could do a
drm_sched_job_await_fence_tagged, where you supply additional
information (like "this might be relevant for the vm_flush" and things
like that). Afaiui xarray tags are very fast to enumerate over if
you're looking for specific tags, but I might be wrong. Ideally this
would avoid the need for the duplicated amdgpu_job->sched.
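
Again just a rough sketch to show what I mean, names all made up (the
tag define, amdgpu_job_needs_pipeline_sync(), and I'm assuming the
job's dependency container is the xarray this series adds, called
job->dependencies here):

#include <drm/gpu_scheduler.h>
#include <linux/dma-fence.h>
#include <linux/xarray.h>

#define DRM_SCHED_DEP_TAG_VM_FLUSH	XA_MARK_1

int drm_sched_job_await_fence_tagged(struct drm_sched_job *job,
				     struct dma_fence *fence,
				     xa_mark_t tag)
{
	u32 id = 0;
	int ret;

	if (!fence)
		return 0;

	/* dedup/self-dependency handling as in drm_sched_job_await_fence()
	 * elided, only the tagging part is shown */
	ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b,
		       GFP_KERNEL);
	if (ret) {
		dma_fence_put(fence);
		return ret;
	}

	xa_set_mark(&job->dependencies, id, tag);
	return 0;
}

/* driver side: cheaply walk only the dependencies that were tagged as
 * relevant for the pipeline sync / vm flush decision */
static bool amdgpu_job_needs_pipeline_sync(struct drm_sched_job *job)
{
	struct dma_fence *fence;
	unsigned long index;

	xa_for_each_marked(&job->dependencies, index, fence,
			   DRM_SCHED_DEP_TAG_VM_FLUSH) {
		if (!dma_fence_is_signaled(fence))
			return true;
	}

	return false;
}

xarray only has 3 marks per array iirc, so this wouldn't scale to lots
of different driver flags, but for one or two bits of extra info it
might be enough.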

Cheers, Daniel


> Regards,
> Christian.
>
> > Finally: You _really_ need explicit dependency handling for Vulkan in your
> > uapi, instead of the kernel second-guessing what userspace might be doing.
> > That's really not how Vulkan is designed to work :-)
>
> >
> > Cheers, Daniel
> >
> >
> >> Christian.
> >>
> >>>     /* Deduplicate if we already depend on a fence from the same context.
> >>>      * This lets the size of the array of deps scale with the number of
> >>>      * engines involved, rather than the number of BOs.
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

