[RFC] Host1x/TegraDRM UAPI (sync points)

Dmitry Osipenko digetx at gmail.com
Thu Jul 9 09:28:47 UTC 2020


08.07.2020 13:06, Mikko Perttunen writes:
> On 7/7/20 2:06 PM, Dmitry Osipenko wrote:
>> 02.07.2020 15:10, Mikko Perttunen writes:
>>> Ok, so we would have two kinds of syncpoints for the job: one for
>>> kernel job tracking, and one that userspace can manipulate as it
>>> wants.
>>>
>>> Could we handle the job tracking syncpoint completely inside the kernel,
>>> i.e. allocate it in kernel during job submission, and add an increment
>>> for it at the end of the job (with condition OP_DONE)? For MLOCKing, the
>>> kernel already needs to insert a SYNCPT_INCR(OP_DONE) + WAIT +
>>> MLOCK_RELEASE sequence at the end of each job.
>>
>> If the sync point is allocated within the kernel, then we'll always
>> need to patch all of the job's sync point increments with the ID of
>> the allocated sync point, regardless of whether the firewall is
>> enabled or not.
> 
> The idea was that the job tracking increment would also be added to the
> pushbuffer in the kernel, so gathers would only have increments for the
> "user syncpoints", if any. I think this should work for THI-based
> engines (e.g. VIC), you probably have better information about
> GR2D/GR3D. On newer Tegras we could use CHANNEL/APPID protection to
> prevent the gathers from incrementing these job tracking syncpoints.

Could you please clarify what THI is?
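For illustration, the kernel-side job-tracking increment discussed above could be appended to the pushbuffer along these lines. This is only a sketch: the opcode encoding, register offset, and cond/id packing are assumptions here, and the real layouts differ between Tegra generations.

```c
/*
 * Illustrative sketch only: how the kernel might append a job-tracking
 * syncpoint increment to the pushbuffer at the end of a job, so that
 * gathers carry increments only for the user syncpoints.
 */
#include <stdint.h>

/* host1x "non-incrementing write" opcode (assumed field layout) */
#define HOST1X_OPCODE_NONINCR(offset, count) \
	((2u << 28) | ((uint32_t)(offset) << 16) | (uint32_t)(count))

#define UCLASS_INCR_SYNCPT	0x0	/* hypothetical register offset */
#define COND_OP_DONE		1u	/* hypothetical condition value */

/* Append "increment syncpoint <id> when the engine signals OP_DONE". */
static inline uint32_t *push_job_tracking_incr(uint32_t *pb, uint32_t id)
{
	*pb++ = HOST1X_OPCODE_NONINCR(UCLASS_INCR_SYNCPT, 1);
	*pb++ = (COND_OP_DONE << 8) | id; /* cond/id packing is SoC-specific */
	return pb;
}
```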

>> Secondly, I now recall that only one sync point can be assigned to a
>> channel at a time on newer Tegras, which support sync point
>> protection. So it sounds like we have no option other than to
>> allocate one sync point per channel for job usage if we want to be
>> able to push multiple jobs into a channel's pushbuffer, correct?
>>
> 
> The other way around; each syncpoint can be assigned to one channel. One
> channel may have multiple syncpoints.

Okay! So we actually could use a single sync point per job for user
increments + job tracking, correct?

>> ...
>>>> Hmm, we actually should be able to have one sync point per channel
>>>> for job submission, similarly to what the current driver does!
>>>>
>>>> I keep forgetting about the waitbase's existence!
>>>
>>> Tegra194 doesn't have waitbases, but if we are resubmitting all the jobs
>>> anyway, can't we just recalculate wait thresholds at that time?
>>
>> Yes, the thresholds could be recalculated and the job re-formed at
>> push time, then.
>>
>> It also means that jobs should always be formed only at push time
>> whenever a wait command is used by the cmdstream, since the waits
>> always need to be patched: we won't know the thresholds until the
>> scheduler actually runs the job.
> 
> Could we keep the job tracking syncpoints entirely within the kernel,
> and have all wait commands and other userspace activity use the user
> syncpoints? Then kernel job tracking and userspace activity would be
> separate from each other.

I think this should work, but it's not clear to me what benefit this
brings us if we could reuse the same user-allocated sync point for both
user increments and kernel job tracking.

Is the idea to protect against userspace incrementing the sync point
too much, i.e. the job being treated as completed before CDMA is
actually done with it?
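For what it's worth, the push-time threshold patching discussed above could be sketched roughly like this. All struct and function names here are hypothetical, not the actual driver's: at job-creation time we only record where each wait threshold lives and how many increments into the job it refers to, and the absolute threshold is computed at push time.

```c
/* Rough sketch with hypothetical names throughout. */
#include <stdint.h>

struct wait_patch {
	uint32_t word_index;     /* index of the threshold word in the job */
	uint32_t rel_threshold;  /* increments into this job to wait for */
};

/* At push time, turn relative thresholds into absolute ones, based on
 * the syncpoint value the scheduler sees when it actually runs the job. */
static void patch_wait_thresholds(uint32_t *cmds,
				  const struct wait_patch *patches, int n,
				  uint32_t syncpt_base)
{
	for (int i = 0; i < n; i++)
		cmds[patches[i].word_index] =
			syncpt_base + patches[i].rel_threshold;
}
```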

> Alternatively, if we know that jobs can only be removed from the middle
> of pushbuffers, and not added, we could replace any removed jobs with
> syncpoint increments in the pushbuffer and any thresholds would stay
> intact.

It's a bit unlikely that jobs could ever be removed from the middle of
the hardware queue by the DRM scheduler.

Anyway, it should be nicer to shoot down everything and restart with a
clean slate when necessary, as is already supposed to happen with the
scheduler.
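That said, the increment-stubbing idea above would amount to something like the following sketch: overwrite a removed job's pushbuffer slot with immediate syncpoint increments matching what the job would have done, padded with NOPs, so the absolute thresholds of later jobs stay valid. The opcode encodings and condition values below are illustrative assumptions, not actual driver code.

```c
#include <stddef.h>
#include <stdint.h>

#define HOST1X_OPCODE_NONINCR(offset, count) \
	((2u << 28) | ((uint32_t)(offset) << 16) | (uint32_t)(count))
#define HOST1X_OPCODE_NOP	0u	/* assumed: all-zero word is a no-op */
#define UCLASS_INCR_SYNCPT	0x0	/* hypothetical register offset */
#define COND_IMMEDIATE		0u	/* hypothetical condition value */

static void stub_out_job(uint32_t *slot, size_t slot_words,
			 uint32_t syncpt_id, uint32_t num_incrs)
{
	size_t i = 0;

	/* Replay the job's increments so downstream thresholds still hold. */
	while (num_incrs > 0 && i + 2 <= slot_words) {
		slot[i++] = HOST1X_OPCODE_NONINCR(UCLASS_INCR_SYNCPT, 1);
		slot[i++] = (COND_IMMEDIATE << 8) | syncpt_id;
		num_incrs--;
	}

	/* Pad the remainder of the slot. */
	while (i < slot_words)
		slot[i++] = HOST1X_OPCODE_NOP;
}
```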


More information about the dri-devel mailing list