[RFC] Host1x/TegraDRM UAPI (sync points)

Tue Jul 7 11:06:45 UTC 2020

02.07.2020 15:10, Mikko Perttunen пишет:
> Ok, so we would have two kinds of syncpoints for the job; one
> for kernel job tracking; and one that userspace can
> manipulate as it wants to.
> 
> Could we handle the job tracking syncpoint completely inside the kernel,
> i.e. allocate it in kernel during job submission, and add an increment
> for it at the end of the job (with condition OP_DONE)? For MLOCKing, the
> kernel already needs to insert a SYNCPT_INCR(OP_DONE) + WAIT +
> MLOCK_RELEASE sequence at the end of each job.

If sync point is allocated within kernel, then we'll need to always
patch all job's sync point increments with the ID of the allocated sync
point, regardless of whether firewall enabled or not.

Secondly, I'm now recalling that only one sync point could be assigned
to a channel at a time on newer Tegras which support sync point
protection. So it sounds like we don't really have variants other than
to allocate one sync point per channel for the jobs usage if we want to
be able to push multiple jobs into channel's pushbuffer, correct?

...
>> Hmm, we actually should be able to have a one sync point per-channel for
>> the job submission, similarly to what the current driver does!
>>
>> I'm keep forgetting about the waitbase existence!
> 
> Tegra194 doesn't have waitbases, but if we are resubmitting all the jobs
> anyway, can't we just recalculate wait thresholds at that time?

Yes, thresholds could be recalculated + job should be re-formed at the
push-time then.

It also means that jobs always should be formed only at the push-time if
wait-command is utilized by cmdstream since the waits always need to be
patched because we won't know the thresholds until scheduler actually
runs the job.

> Maybe a more detailed sequence list or diagram of what happens during
> submission and recovery would be useful.

The textual form + code is already good enough to me. A diagram could be
nice to have, although it may take a bit too much effort to create +
maintain it. But I don't mind at all if you'd want to make one :)

...
>>> * We should be able to keep the syncpoint refcounting based on fences.
>>
>> The fence doesn't need the sync point itself, it only needs to get a
>> signal when the threshold is reached or when sync point is ceased.
>>
>> Imagine:
>>
>>    - Process A creates sync point
>>    - Process A creates dma-fence from this sync point
>>    - Process A exports dma-fence to process B
>>    - Process A dies
>>
>> What should happen to process B?
>>
>>    - Should dma-fence of the process B get a error signal when process A
>> dies?
>>    - Should process B get stuck waiting endlessly for the dma-fence?
>>
>> This is one example of why I'm proposing that fence shouldn't be coupled
>> tightly to a sync point.
> 
> As a baseline, we should consider process B to get stuck endlessly
> (until a timeout of its choosing) for the fence. In this case it is
> avoidable, but if the ID/threshold pair is exported out of the fence and
> is waited for otherwise, it is unavoidable. I.e. once the ID/threshold
> are exported out of a fence, the waiter can only see the fence being
> signaled by the threshold being reached, not by the syncpoint getting
> freed.

This is correct. If sync point's FD is exported or once sync point is
resolved from a dma-fence, then sync point will stay alive until the
last reference to the sync point is dropped. I.e. if Process A dies
*after* process B started to wait on the sync point, then it will get stuck.