[Intel-gfx] [PATCH] drm/i915: Disable full ppgtt by default

Daniel Vetter daniel at ffwll.ch
Thu Mar 6 21:30:01 CET 2014


On Thu, Mar 06, 2014 at 10:17:12AM -0800, Ben Widawsky wrote:
> On Thu, Mar 06, 2014 at 12:14:21PM +0100, Daniel Vetter wrote:
> > There are too many oustanding issues:
> > 
> > - Fence handling in the current code is broken. There's a patch series
> >   from me, but it's blocked on and extended review (which includes
> >   writing the testcases).
> > 
> > - IOMMU mapping handling is broken, we need to properly refcount it -
> >   currently it gets destroyed when the first vma is unbound, so way
> >   too early.
> > 
> > - There's a pending reset issue on snb. Since Mika's reset work and
> >   full ppgtt have been pulled in in separate branches and ended up
> >   intermittingly breaking each another it's unclear who's the exact
> >   culprit here.
> > 
> > - We still have persistent evidince of crazy recursion bugs through
> >   vma_unbind and ppgtt_relase, e.g.
> > 
> >   https://bugs.freedesktop.org/show_bug.cgi?id=73383
> > 
> >   This issue (and a few others meanwhile resolved) have blocked our
> >   performance measuring/tuning group since 3 months.
> > 
> > - Secure batch dispatching is broken. This is blocking Brad Volkin's
> >   command checker work since 3 months.
> > 
> > All these issues are confirmed to only happen when full ppgtt is
> > enabled, falling back to aliasing ppgtt resolves them. But even
> > aliasing ppgtt itself still has a regression:
> > 
> > - We currently unconditionally bind objects into the aliasing ppgtt,
> >   which means all priviledged objects like ringbuffers are visible to
> >   unpriviledged access again. On top of that this also breaks the
> >   command checker for aliasing ppgtt, since it can't hide the
> >   validated batch any more.
> > 
> > Furthermore topic/full-ppgtt has never been reviewed:
> > 
> > - Lifetime rules around vma unbinding/release are unclear, resulting
> >   into this awesome hack called ppgtt_release. Which seems to take the
> >   blame for most of the recursion fallout.
> > 
> > - Context/ring init works different on gpu reset than anywhere else.
> >   Such differeneces have in the past always lead to really hard to
> >   track down bugs.
> > 
> > - Aliasing ppgtt is treated in a bunch of places as a real address
> >   space, but it isn't - the real address space is always the global
> >   gtt in that case. This results in a bit a mess between contexts and
> >   ppgtt object, further complication the context/ppgtt/vma lifetime
> >   rules.
> > 
> > - We don't have any docs describing the overall concepts introduced
> >   with full ppgtt. A short, concise overview describing vmas and some
> >   of the strange bits around them (like the unbound vmas used by
> >   execbuf, or the new binding rules) really is needed.
> > 
> > Note that a lot of the post topic/full-ppgtt merge fallout has already
> > been addressed, this entire list here of 10 issues really only contains
> > the still outstanding issues.
> > 
> > Finally the 3.15 merge window is approaching and I think we need to
> > use the remaining time to ensure that our fallback option of using
> > aliasing ppgtt is in solid shape. Hence I think it's time to throw the
> > switch. While at it demote the helper from static inline status
> > because really.
> > 
> > Cc: Ben Widawsky <ben at bwidawsk.net>
> > Cc: Dave Airlie <airlied at gmail.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> 
> [snip]
> 
> I want a concise list in the commit message so it's obvious as we fix
> things if we've achieved the goal or not. If you want to have nice prose
> describing the reason and/or your feelings, that's fine, but please put
> it after the concise list.
> 
> I'll start what I want, and please fill in as needed. I believe this is
> all 10 you mentioned.
> * Fence handling broken: BUG #

We have patches from me, and Paulo is signed up to do the review+igt
testcase on our review board.

> * IOMMU Broken: BUG #

No bug report thus far. I can create one if people want, but that's more
work than firing up my damn ivb, enabling dmar again and fixing it. Imo
the short description above should be good enough to understand the bug.

> * "Reset issue": Bug #

https://bugs.freedesktop.org/show_bug.cgi?id=74100

Mika is working on this afaik.

> * Secure dispatch: Failing testcase: 

There's a Jira with full details: VIZ-3490

> * Bug: https://bugs.freedesktop.org/show_bug.cgi?id=73383

There's also been reports from Ville and iirc Chris also had pending
issues, at least compared to dinq. Note that plain igt doesn't seem to be
sufficient to provoke all these cases :(

> * Documentation

The above description is imo sufficient as a task description. There's
also jira for this with more details: VIZ-3468

> Then there is fuzzy stuff that you "want" which need more clarification
> on exactly what will satisfy you.
> * Lifetime rules: No clear requirement from you.

Whomever tracks down the recursion bugs will likely have to sort his out.
Essentially I want ppgtt_release gone, that entire function is just a
giant hack.

> * Context/ring init differences: What do you want?

http://lists.freedesktop.org/archives/intel-gfx/2013-November/036482.html

I really want this to be address before we start wreaking more havoc in
this area, i.e. the patch series currently pending internally.

> * Aliasing PPGTT real address treatment: What do you want?

i915_gem_context_free in i195_gem_context.c has a comment:

	/* We refcount even the aliasing PPGTT to keep the code symmetric */

That comment is a lie and _not_ refcounting the aliasing ppgtt would allow
us to ditch a bunch of hard-to-understand (at least for me) cornercase in
the lifetime rules. At least that was the case when I've reviewed the
ppgtt branch 3 months ago. Given that we have issues around the lifetimes
of these suckers it can't help to have as much clarity as possible.

One really important thing you've dropped is fixing up the ppgtt binding
rules for aliasing ppgtt. That's a regression compared to 3.14 and will
render the cmd parser void with aliasing ppgtt. We need this asap both in
time for the 3.15 merge and to finally unblock Brad Volkin's command
parser.

I've signed myself up for this and started working on it.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch



More information about the Intel-gfx mailing list