[PATCH 10/11] drm/scheduler: Don't store self-dependencies
Christian König
christian.koenig at amd.com
Thu Jun 24 17:38:14 UTC 2021
On 24.06.21 19:29, Daniel Vetter wrote:
> On Thu, Jun 24, 2021 at 07:03:10PM +0200, Christian König wrote:
>> Am 24.06.21 um 16:00 schrieb Daniel Vetter:
>>> This is essentially part of drm_sched_dependency_optimized(), which
>>> only amdgpu seems to make use of. Use it a bit more.
>>>
>>> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
>>> Cc: "Christian König" <christian.koenig at amd.com>
>>> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
>>> Cc: Luben Tuikov <luben.tuikov at amd.com>
>>> Cc: Andrey Grodzovsky <andrey.grodzovsky at amd.com>
>>> Cc: Alex Deucher <alexander.deucher at amd.com>
>>> Cc: Jack Zhang <Jack.Zhang1 at amd.com>
>>> ---
>>> drivers/gpu/drm/scheduler/sched_main.c | 7 +++++++
>>> 1 file changed, 7 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 370c336d383f..c31d7cf7df74 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -649,6 +649,13 @@ int drm_sched_job_await_fence(struct drm_sched_job *job,
>>>         if (!fence)
>>>                 return 0;
>>> +       /* if it's a fence from us it's guaranteed to be earlier */
>>> +       if (fence->context == job->entity->fence_context ||
>>> +           fence->context == job->entity->fence_context + 1) {
>>> +               dma_fence_put(fence);
>>> +               return 0;
>>> +       }
>>> +
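For context, the drm_sched_dependency_optimized() helper the commit message
refers to performs roughly the same context check; from memory it reads about
like this in sched_main.c of that era, so take the details with a grain of salt:

bool drm_sched_dependency_optimized(struct dma_fence *fence,
                                    struct drm_sched_entity *entity)
{
        struct drm_gpu_scheduler *sched = entity->rq->sched;
        struct drm_sched_fence *s_fence;

        /* An already signaled (or missing) fence never needs a wait. */
        if (!fence || dma_fence_is_signaled(fence))
                return false;

        /* Fence from the same entity: ordering is implicit. */
        if (fence->context == entity->fence_context)
                return true;

        /* Scheduler fence from the same scheduler instance. */
        s_fence = to_drm_sched_fence(fence);
        if (s_fence && s_fence->sched == sched)
                return true;

        return false;
}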
>> Well NAK. That would break Vulkan.
>>
>> The problem is that Vulkan can insert dependencies between jobs which run on
>> the same queue.
>>
>> So we need to track those as well, and if the previous job on the same
>> queue/scheduler is not yet finished, a pipeline synchronization needs to be
>> inserted.
>>
>> That's one of the reasons we haven't been able to unify the dependency
>> handling yet.
> That sounds like an extremely amdgpu-specific constraint?
Yeah, that's totally hardware-specific.
It's just that I don't know how else we could track that without having the
same separation between implicit and explicit fences as in amdgpu.
And as far as I understand it, that's exactly what you want to avoid.
As I said, this turned out to be really awkward.
> You're also the
> only one who keeps track of whether the previous job we've scheduled has
> finished already (I guess they can get pipelined and you don't flush by
> default), so you insert fences.
Yes, exactly that.
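To make that concrete, the backend ends up needing per-dependency logic along
these lines; this is a hypothetical sketch, the wrapper struct and the
need_pipeline_sync flag are made-up names and not actual amdgpu code:

#include <linux/dma-fence.h>
#include <drm/gpu_scheduler.h>

/* Hypothetical driver job wrapping the scheduler job. */
struct my_job {
        struct drm_sched_job base;
        bool need_pipeline_sync;        /* made up; honoured in ->run_job() */
};

/*
 * A fence from our own entity (same fence context) cannot simply be
 * dropped: if it has not signaled yet, the previous job may still be
 * in flight on the same hardware queue, and the backend has to emit a
 * pipeline synchronization before this job runs.
 */
static void my_job_track_same_queue_dep(struct my_job *job,
                                        struct dma_fence *dep)
{
        if (dep->context == job->base.entity->fence_context &&
            !dma_fence_is_signaled(dep))
                job->need_pipeline_sync = true;
}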
> I guess we can add an await_fence_no_dedup or so for amdgpu, but I'm not
> sure why we have to inflict this design constraint on all other drivers?
> At least I'm not seeing anything in lima, panfrost, v3d or etnaviv that
> would break with this, and i915 will also be perfectly fine.
>
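Such an opt-out could look roughly like the sketch below;
drm_sched_job_await_fence_no_dedup() does not exist anywhere, and the
dependencies xarray field is an assumption about this series:

#include <linux/xarray.h>
#include <linux/dma-fence.h>
#include <drm/gpu_scheduler.h>

/*
 * Hypothetical variant of drm_sched_job_await_fence() without the
 * self-dependency shortcut, so a driver that needs pipeline syncs
 * between jobs on the same queue still sees those fences.
 */
int drm_sched_job_await_fence_no_dedup(struct drm_sched_job *job,
                                       struct dma_fence *fence)
{
        u32 id = 0;
        int ret;

        if (!fence)
                return 0;

        /* Store the fence unconditionally; drop the reference on failure. */
        ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b,
                       GFP_KERNEL);
        if (ret != 0)
                dma_fence_put(fence);

        return ret;
}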
> Also note: I'm not using this for amdgpu, exactly because there are a few
> funny things going on.
Yeah, exactly the reason why we never unified this.
Regards,
Christian.
> Finally: You _really_ need explicit dependency handling for Vulkan in your
> uapi, instead of the kernel second-guessing what userspace might be doing.
> That's really not how Vulkan is designed to work :-)
>
> Cheers, Daniel
>
>
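As a reference point for what explicit dependency handling in the uapi means,
amdgpu's CS ioctl can already take drm_syncobj handles to wait on and to
signal through its chunk mechanism (AMDGPU_CHUNK_ID_SYNCOBJ_IN /
AMDGPU_CHUNK_ID_SYNCOBJ_OUT); stripped down to the idea, and purely
illustrative rather than any real layout:

#include <linux/types.h>

/* Not a real ioctl: userspace names its dependencies itself instead of
 * the kernel inferring them from buffer objects.
 */
struct hypothetical_submit {
        __u64 in_syncobjs;      /* userptr to __u32 syncobj handles to wait on */
        __u32 in_syncobj_count;
        __u32 pad0;
        __u64 out_syncobjs;     /* userptr to __u32 syncobj handles to signal */
        __u32 out_syncobj_count;
        __u32 pad1;
        __u64 cmds;             /* command stream, opaque here */
        __u32 cmds_size;
        __u32 pad2;
};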
>> Christian.
>>
>>>         /* Deduplicate if we already depend on a fence from the same context.
>>>          * This lets the size of the array of deps scale with the number of
>>>          * engines involved, rather than the number of BOs.
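For completeness, the deduplication path the quoted comment introduces looks
roughly like this in the series; the xarray container and field names are
assumptions, so treat it as a sketch:

#include <linux/xarray.h>
#include <linux/dma-fence.h>
#include <drm/gpu_scheduler.h>

/* Keep only the newest fence per dma_fence context, so the dependency
 * array scales with the number of engines rather than the number of BOs.
 */
static int await_fence_dedup_sketch(struct drm_sched_job *job,
                                    struct dma_fence *fence)
{
        struct dma_fence *entry;
        unsigned long index;
        u32 id = 0;
        int ret;

        xa_for_each(&job->dependencies, index, entry) {
                if (entry->context != fence->context)
                        continue;

                if (dma_fence_is_later(fence, entry)) {
                        /* New fence is newer: replace the stored one. */
                        dma_fence_put(entry);
                        xa_store(&job->dependencies, index, fence, GFP_KERNEL);
                } else {
                        /* Already covered by a newer fence: drop it. */
                        dma_fence_put(fence);
                }
                return 0;
        }

        /* First fence from this context, append it. */
        ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL);
        if (ret != 0)
                dma_fence_put(fence);

        return ret;
}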