[PATCH v5 00/21] Host1x sync point UAPI should not be used for tracking DRM jobs

Tue Mar 23 19:57:40 UTC 2021

23.03.2021 21:21, Thierry Reding пишет:
> On Sat, Feb 27, 2021 at 02:19:39PM +0300, Dmitry Osipenko wrote:
>> 03.02.2021 14:18, Mikko Perttunen пишет:
>> ...
>>>> I'll need more time to think about it.
>>>>
>>>
>>> How about something like this:
>>>
>>> Turn the syncpt_incr field back into an array of structs like
>>>
>>> #define DRM_TEGRA_SUBMIT_SYNCPT_INCR_REPLACE_SYNCOBJ        (1<<0)
>>> #define DRM_TEGRA_SUBMIT_SYNCPT_INCR_PATCH_DYNAMIC_SYNCPT    (1<<1)
>>>
>>> struct drm_tegra_submit_syncpt_incr {
>>>     /* can be left as zero if using dynamic syncpt */
>>>     __u32 syncpt_id;
>>>     __u32 flags;
>>>
>>>     struct {
>>>         __u32 syncobj;
>>>         __u32 value;
>>>     } fence;
>>>
>>>     /* patch word as such:
>>>          * *word = *word | (syncpt_id << shift)
>>>          */
>>>     struct {
>>>         __u32 gather_offset_words;
>>>         __u32 shift;
>>>     } patch;
>>> };
>>>
>>> So this will work similarly to the buffer reloc system; the kernel
>>> driver will allocate a job syncpoint and patch in the syncpoint ID if
>>> requested, and allows outputting syncobjs for each increment.
>>
>> I haven't got any great ideas so far, but it feels that will be easier
>> and cleaner if we could have separate job paths (and job IOCTLS) based
>> on hardware generation since the workloads a too different. The needs of
>> a newer h/w are too obscure for me and absence of userspace code,
>> firmware sources and full h/w documentation do not help.
>>
>> There still should be quite a lot to share, but things like
>> mapping-to-channel and VM sync points are too far away from older h/w,
>> IMO. This means that code path before drm-sched and path for job-timeout
>> handling should be separate.
>>
>> Maybe later on it will become cleaner that we actually could unify it
>> all nicely, but for now it doesn't look like a good idea to me.
> 
> Sorry for jumping in rather randomly here and elsewhere, but it's been a
> long time since the discussion and I just want to share my thoughts on a
> couple of topics in order to hopefully help move this forward somehow.
> 
> For UAPI, "unifying it later" doesn't really work.

Of course I meant a "later version of this series" :) Sorry for not
making it clear.

> So I think the only
> realistic option is to make a best attempt at getting the UABI right so
> that it works for all existing use-cases and ideally perhaps even as of
> yet unknown use-cases in the future. As with all APIs this means that
> there's going to be a need to abstract away some of the hardware details
> so that we don't have to deal with too many low-level details in
> userspace, because otherwise the UAPI is just going to be a useless
> mess.
> 
> I think a proposal such as the above to allow both implicit and explicit
> syncpoints makes sense. For what it's worth, it's fairly similar to what
> we had come up with last time we tried destaging the ABI, although back
> at the time I'm not sure we had even considered explicit syncpoint usage
> yet. I think where reasonably possible this kind of optional behaviour
> is acceptable, but I don't think having two completely separate paths is
> going to help in any way. If anything it's just going to make it more
> difficult to maintain the code and get it to a usable state in the first
> place.
> 
> Like I said elsewhere, the programming model for host1x hasn't changed
> since Tegra20. It's rather evolved and gained a couple more features,
> but that doesn't change anything about how userspace uses it.

Not having a complete control over sync points state is a radical
change, IMO. I prefer not to use this legacy and error-prone way of sync
points handling at least for older h/w where it's possible to do. This
is what downstream driver did 10 years ago and what it still continues
to do. I was very happy to move away from this unnecessary complication
in the experimental grate-kernel driver and I think this will be great
to do in the mainline as well.

The need to map buffers explicitly is also a big difference. The need to
map BO for each channel is a quite big over-complication as we already
found out in the current version of UAPI.

Alright, perhaps the mapping could improved for older userspace if we
will move away from a per-channel contexts to a single DRM context like
I already was suggesting to Mikko before. I.e. instead of mapping BO "to
a channel", we will need to map BO "to h/w clients" within the DRM
context. This should allow older userspace to create a single mapping
for all channels/clients using a single IOCTL and then to have a single
"mapping handle" to care about. Objections?