[Intel-gfx] [PATCH] drm/i915: Implement SINGLE_TIMELINE with a syncobj (v2)

Thu Mar 25 09:54:05 UTC 2021

On Thu, Mar 25, 2021 at 10:48 AM Tvrtko Ursulin
<tvrtko.ursulin at linux.intel.com> wrote:
>
>
> On 24/03/2021 17:18, Jason Ekstrand wrote:
> > On Wed, Mar 24, 2021 at 6:36 AM Tvrtko Ursulin
> > <tvrtko.ursulin at linux.intel.com> wrote:
> >>
> >>
> >> On 24/03/2021 09:52, Daniel Vetter wrote:
> >>> On Wed, Mar 24, 2021 at 09:28:58AM +0000, Tvrtko Ursulin wrote:
> >>>>
> >>>> On 23/03/2021 17:51, Jason Ekstrand wrote:
> >>>>> This API is entirely unnecessary and I'd love to get rid of it.  If
> >>>>> userspace wants a single timeline across multiple contexts, they can
> >>>>> either use implicit synchronization or a syncobj, both of which existed
> >>>>> at the time this feature landed.  The justification given at the time
> >>>>> was that it would help GL drivers which are inherently single-timeline.
> >>>>> However, neither of our GL drivers actually wanted the feature.  i965
> >>>>> was already in maintenance mode at the time and iris uses syncobj for
> >>>>> everything.
> >>>>>
> >>>>> Unfortunately, as much as I'd love to get rid of it, it is used by the
> >>>>> media driver so we can't do that.  We can, however, do the next-best
> >>>>> thing which is to embed a syncobj in the context and do exactly what
> >>>>> we'd expect from userspace internally.  This isn't an entirely identical
> >>>>> implementation because it's no longer atomic if userspace races with
> >>>>> itself by calling execbuffer2 twice simultaneously from different
> >>>>> threads.  It won't crash in that case; it just doesn't guarantee any
> >>>>> ordering between those two submits.
> >>>>>
> >>>>> Moving SINGLE_TIMELINE to a syncobj emulation has a couple of technical
> >>>>> advantages beyond mere annoyance.  One is that intel_timeline is no
> >>>>> longer an api-visible object and can remain entirely an implementation
> >>>>> detail.  This may be advantageous as we make scheduler changes going
> >>>>> forward.  Second is that, together with deleting the CLONE_CONTEXT API,
> >>>>> we should now have a 1:1 mapping between intel_context and
> >>>>> intel_timeline which may help us reduce locking.
> >>>>
> >>>> Much, much better commit message although I still fail to understand where
> >>>> do you see implementation details leaking out. So for me this is still
> >>>> something I'd like to get to the bottom of.
> >
> > I didn't say "leaking".  I said it's no longer API-visible.  That's
> > just true.  It used to be a kernel object that userspace was unaware
> > of, then we added SINGLE_TIMELINE and userspace now has some influence
> > over the object.  With this patch, it's hidden again.  I don't get why
> > that's confusing.
>
> I am definitely glad we moved on from leaking to less dramatic wording,
> but it is still not API "visible" in so much struct file_operations * is
> not user visible in any negative way when you do open(2), being just
> implementation detail. But I give up.
>
> >>>> I would also mention the difference regarding fence context change.
> >
> > There are no fence context changes.  The fence that we stuff in the
> > syncobj is an i915 fence and the fence we pull back out is an i915
> > fence.  A syncobj is just a fancy wrapper for a dma_buf pointer.
>
> Change is in the dma_fence->context.
>
> Current code single timeline is one fence context. With this patch that
> is no longer the case.
>
> Not sure that it has any practical implications for the internal
> dma_fence code like is_later checks , haven't analysed it.

We merge fewer fences at the higher levels and rely more on the fence
dependency tracking of the scheduler frontend to sort stuff out.

> And sync fence info ioctl exposes this value to userspace which probably
> has no practical implications. Unless some indirect effect when merging
> fences.

Userspace can use that to do fence merging. Which is always only a
strict optimization. I'm not even sure whether Android does that or
not.

> Main point is that I think these are the things which need to be
> discussed in the commit message.

Yeah makes sense to add these as impact, next to the "we don't deal
with races anymore" part.
-Daniel

>
> >>>> And in general I would maintain this patch as part of a series which ends up
> >>>> demonstrating the "mays" and "shoulds".
> >>>
> >>> I disagree. The past few years we've merged way too much patches and
> >>> features without carefully answering the high level questions of
> >>> - do we really need to solve this problem
> >>> - and if so, are we really solving this problem in the right place
> >>>
> >>> Now we're quite in a hole, and we're not going to get out of this hole if
> >>> we keep applying the same standards that got us here. Anything that does
> >>> not clearly and without reservation the above two questions with "yes"
> >>> needs to be removed or walled off, just so we can eventually see which
> >>> complexity we really need, and what is actually superflous.
> >>
> >> I understand your general point but when I apply it to this specific
> >> patch, even if it is simple, it is neither removing the uapi or walling
> >> it off. So I see it as the usual review standard to ask to see what
> >> "mays" and "shoulds" actually get us in concrete terms.
> >
> > Instead of focusing on the term "leak", let's focus on this line of
> > the commit message instead:
> >
> >>   Second is that, together with deleting the CLONE_CONTEXT API,
> >> we should now have a 1:1 mapping between intel_context and
> >> intel_timeline which may help us reduce locking.
> >
> > Now, I've not written any patches yet which actually reduce the
> > locking.  I can try to look into that today.  I CC'd Maarten on the
> > first send of this because I was hoping he would have good intuition
> > about this.  It may be that this object will always have to require
> > some amount of locking if the scheduler has to touch them in parallel
> > with other stuff.  What I can say concretely, however, is that this
> > does reduce the sharing of an internal object even if it doesn't get
> > rid of it completely.  The one thing that is shared all over the place
> > with this patch is a syncobj which is explicitly designed for exactly
> > this sort of case and can be used and abused by as many threads as
> > you'd like.  That seems like it's going in the right direction.
> >
> > I can further weasel-word the commit message to make the above more
> > prominent if you'd like.
>
> I am not interested in making you weasel-word anything but reaching a
> consensus and what is actually true and accurate.
>
> Regards,
>
> Tvrtko

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch