[Intel-gfx] [PATCH 5/5] drm/i915: s/seqno/request/ tracking inside objects

Wed Aug 27 12:39:59 CEST 2014

On Wed, Aug 27, 2014 at 11:55:34AM +0200, Daniel Vetter wrote:
> On Tue, Aug 12, 2014 at 08:05:51PM +0100, Chris Wilson wrote:
> > At the heart of this change is that the seqno is a too low level of an
> > abstraction to handle the growing complexities of command tracking, both
> > with the introduction of multiple command queues with execbuffer and the
> > potential for reordering with a scheduler. On top of the seqno we have
> > the request. Conceptually this is just a fence, but it also has
> > substantial bookkeeping of its own in order to track the context and
> > batch in flight, for example. It is the central structure upon which we
> > can extend with dependency tracking et al.
> > 
> > As regards the objects, they were using the seqno as a simple fence,
> > upon which is check or even wait upon for command completion. This patch
> > exchanges that seqno/ring pair with the request itself. For the
> > majority, lifetime of the request is ordered by how we retire objects
> > then requests. However, both the unlocked waits and probing elsewhere do
> > not tie into the normal request lifetimes and so we need to introduce a
> > kref. Extending the objects to use the request as the fence naturally
> > extends to segregrating read/write fence tracking. This has significance
> > for it reduces the number of semaphores we need to emit, reducing the
> > likelihood of #54226, and improving performance overall.
> > 
> > v2: Rebase and split out the othogonal tweaks.
> > 
> > A silly happened with this patch. It seemed to nullify our earlier
> > seqno-vs-interrupt w/a. I could not spot why, but gen6+ started to fail
> > with missed interrupts (a good test of our robustness handling). So I
> > ripped out the existing ACTHD read and replaced it with a RING_HEAD to
> > manually check whether the request is complete. That also had the nice
> > consequence of forcing __wait_request() to being the central arbiter of
> > request completion.
> > 
> > The keener eyed reviewr will also spot that the reset_counter is moved
> > into the request simplifing __wait_request() callsites and reducing the
> > number of atomic reads by virtue of moving the check for a pending GPU
> > reset to the endpoints of GPU access.
> > 
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Jesse Barnes <jbarnes at virtuousgeek.org>
> > Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
> > Cc: Oscar Mateo <oscar.mateo at intel.com>
> > Cc: Brad Volkin <bradley.d.volkin at intel.com>
> > Cc: "Kukanova, Svetlana" <svetlana.kukanova at intel.com>
> 
> So I've tried to split this up and totally failed. Non-complete list of
> things I didn't manage to untangle:
> 
> - The mmio flip refactoring.

Yeah, that's a fairly instrumental part of the patch. It's not
complicated but it does benefit a lot from using requests to both make
the decision cleaner and the tracking correct.

> - The overlay request tracking refactoring.

Again, the api changes allow the code to be compacted.

> - The switch to multiple parallel readers with the resulting cascading
>   changes all over.

I thought that was fairly isolated to the gem object. It's glossed over
in errors/debugfs for simplicity, which deserves to be fixed given a
compact representation of all the requests, and so would kill the icky
code to find the "last" request.

> - The missed irq w/a prep changes. It's easy to split out the change to
>   re-add the rc6 reference and to ditch the ACT_HEAD read, but the commit
>   message talks about instead reading the RING_HEAD, and I just didn't
>   spot the changes relevant to that in this big diff. Was probably looking
>   in the wrong place.

I did mention that I tried that earlier on on the ml, but missed saying
that it the forcewake reference didn't unbreak the old w/a in the
changelog.

> - The move_to_active/retire refactoring. There's a pile of code movement
>   in there, but I couldn't spot really what's just refactoring and what is
>   real changed needed for the s/seqno/request/ change.

move-to() everything since that now operates on the request. retire(), not
a lot changes there, just the extra requests being tracked and the strict
lifetime ordering of the reference the object holds onto the requests.

> - De-duping some of the logical_ring_ functions. Spotted because it
>   conflicted (but was easy to hack around), still this shouldn't really be
>   part of this.

> Things I've spotted which could be split out but amount to a decent
> rewrite of the patch:
> - Getting at the ring of the last write to an object. Although I guess
>   without the multi-reader stuff and the pageflip refactoring that would
>   pretty much disappear.

Why? Who uses the ring of the last_write request? We compare engines in
pageflip but that's about it.

> - Probably similar helpers for seqno if we don't switch to parallel writes
>   in the same patch.
> 
> Splitting out the renames was easy, but that reduced the diff by less than
> 5% in size. So didn't help in reviewing the patch at all.

The actual rename patch is larger than this one (v2).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre