[PATCH v5 00/21] Host1x/TegraDRM UAPI

Thu Jan 28 11:46:44 UTC 2021

On 1/28/21 12:06 AM, Dmitry Osipenko wrote:
> 28.01.2021 00:57, Mikko Perttunen wrote:
>>
>>
>> On 1/27/21 11:26 PM, Dmitry Osipenko wrote:
>>> 26.01.2021 05:45, Mikko Perttunen wrote:
>>>>> 5. The hardware state of sync points should be reset when sync point is
>>>>> requested, not when host1x driver is initialized.
>>>>
>>>> This may be doable, but I don't think it is critical for this UAPI, so
>>>> let's consider it after this series.
>>>>
>>>> The userspace should anyway not be able to assume the initial value of
>>>> the syncpoint upon allocation. The kernel should set it to some high
>>>> value to catch any issues related to wraparound.
>>>
>>> This is critical because min != max when sync point is requested.
>>
>> That I would just consider a bug, and it can be fixed. But it's
>> orthogonal to whether the value gets reset every time the syncpoint is
>> allocated.
>>
>>>
>>>> Also, this makes code more complicated since it now needs to ensure all
>>>> waits on the syncpoint have completed before freeing the syncpoint,
>>>> which can be nontrivial e.g. if the waiter is in a different virtual
>>>> machine or some other device connected via PCIe (a real usecase).
>>>
>>> It sounds to me that these VM sync points should be treated very
>>> separately from generic sync points, don't you think so? Let's not mix
>>> them and get the generic sync points usable first.
>>>
>>
>> They are not special in any way, I'm just referring to cases where the
>> waiter (consumer) is remote. The allocator of the syncpoint (producer)
>> doesn't necessarily even need to know about it. The same concern is
>> applicable within a single VM, or single application as well. Just
>> putting out the point that this is something that needs to be taken care
>> of if we were to reset the value.
> 
> Will the kernel driver know that it deals with a VM sync point?
>
> Will it be possible to get a non-VM sync point explicitly?
> 
> If the driver knows that it deals with a VM sync point, then we can
> treat it specially, avoiding the reset, etc.
> 

There is no distinction between a "VM syncpoint" and a "non-VM 
syncpoint". To provide an example on the issue, consider we have VM1 and 
VM2. VM1 is running some camera software that produces frames. VM2 runs 
some analysis software that consumes those frames through shared memory. 
In between there is some software that takes the postfences of the 
camera software and transmits them to the analysis software to be used 
as prefences. Only this transmitting software needs to know anything 
about multiple VMs being in use.

At any time, if we want to reset the value of the syncpoint in question,
we must first ensure that every waiter on a fence backed by that
syncpoint has already observed the fence's threshold being reached.

Consider an interleaving like this:

VM1 (Camera)				VM2 (Analysis)
-------------------------------------------------------
Send postfence (threshold=X)
					Recv postfence (threshold=X)
Increment syncpoint value to X
Free syncpoint
Reset syncpoint value to Y
					Wait on postfence
-------------------------------------------------------

Now depending on the relative values of X and Y, either VM2 progresses 
(correctly), or gets stuck. If we didn't reset the syncpoint, the race 
could not occur (unless VM1 managed to increment the syncpoint 2^31 
times before VM2's wait starts, which is very unrealistic).
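
To make the dependence on X and Y concrete, here is a rough sketch of
the interleaving above. It is only a model, not the host1x code: both
function names are made up for illustration, and it assumes thresholds
are checked with the usual wraparound-safe signed-difference comparison
on 32-bit values.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (assumption): a threshold counts as reached when
 * the value is at or past it, compared modulo 2^32 via the signed
 * difference. Values up to 2^31 - 1 increments past the threshold
 * still count as reached.
 */
static bool syncpt_expired(uint32_t value, uint32_t threshold)
{
	return (int32_t)(value - threshold) >= 0;
}

/*
 * Hypothetical model of the interleaving: VM1 increments the syncpoint
 * to X, frees it, and the value is optionally reset to Y before VM2
 * starts waiting on threshold X. Returns true if VM2's wait completes.
 */
static bool wait_completes_after_free(uint32_t x, bool reset, uint32_t y)
{
	uint32_t value = x;		/* Increment syncpoint value to X */

	if (reset)
		value = y;		/* Free syncpoint, reset value to Y */

	return syncpt_expired(value, x);	/* Wait on postfence */
}
```

With X = 1000, for example, a reset to Y = 0 leaves VM2 stuck in this
model, a reset to Y = 1005 happens to let it progress, and without the
reset the wait always completes.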

We can remove "VM1" and "VM2" everywhere here, and just consider two 
applications in one VM, too, or two parts of one application. Within one 
VM the issue is of course easier because the driver can have knowledge 
about fences and solve the race internally, but I'd always prefer not 
having such special cases.

Now, admittedly this is probably not a huge problem unless we are 
freeing syncpoints all the time, but nevertheless something to consider.

Mikko