[Intel-gfx] [PATCH] drm/i915: Disable full ppgtt by default

Daniel Vetter daniel at ffwll.ch
Tue Mar 18 11:19:25 CET 2014


On Thu, Mar 06, 2014 at 09:30:01PM +0100, Daniel Vetter wrote:
> On Thu, Mar 06, 2014 at 10:17:12AM -0800, Ben Widawsky wrote:
> > On Thu, Mar 06, 2014 at 12:14:21PM +0100, Daniel Vetter wrote:
> > > There are too many oustanding issues:
> > > 
> > > - Fence handling in the current code is broken. There's a patch series
> > >   from me, but it's blocked on and extended review (which includes
> > >   writing the testcases).
> > > 
> > > - IOMMU mapping handling is broken, we need to properly refcount it -
> > >   currently it gets destroyed when the first vma is unbound, so way
> > >   too early.
> > > 
> > > - There's a pending reset issue on snb. Since Mika's reset work and
> > >   full ppgtt have been pulled in in separate branches and ended up
> > >   intermittingly breaking each another it's unclear who's the exact
> > >   culprit here.
> > > 
> > > - We still have persistent evidince of crazy recursion bugs through
> > >   vma_unbind and ppgtt_relase, e.g.
> > > 
> > >   https://bugs.freedesktop.org/show_bug.cgi?id=73383
> > > 
> > >   This issue (and a few others meanwhile resolved) have blocked our
> > >   performance measuring/tuning group since 3 months.
> > > 
> > > - Secure batch dispatching is broken. This is blocking Brad Volkin's
> > >   command checker work since 3 months.
> > > 
> > > All these issues are confirmed to only happen when full ppgtt is
> > > enabled, falling back to aliasing ppgtt resolves them. But even
> > > aliasing ppgtt itself still has a regression:
> > > 
> > > - We currently unconditionally bind objects into the aliasing ppgtt,
> > >   which means all priviledged objects like ringbuffers are visible to
> > >   unpriviledged access again. On top of that this also breaks the
> > >   command checker for aliasing ppgtt, since it can't hide the
> > >   validated batch any more.
> > > 
> > > Furthermore topic/full-ppgtt has never been reviewed:
> > > 
> > > - Lifetime rules around vma unbinding/release are unclear, resulting
> > >   into this awesome hack called ppgtt_release. Which seems to take the
> > >   blame for most of the recursion fallout.
> > > 
> > > - Context/ring init works different on gpu reset than anywhere else.
> > >   Such differeneces have in the past always lead to really hard to
> > >   track down bugs.
> > > 
> > > - Aliasing ppgtt is treated in a bunch of places as a real address
> > >   space, but it isn't - the real address space is always the global
> > >   gtt in that case. This results in a bit a mess between contexts and
> > >   ppgtt object, further complication the context/ppgtt/vma lifetime
> > >   rules.
> > > 
> > > - We don't have any docs describing the overall concepts introduced
> > >   with full ppgtt. A short, concise overview describing vmas and some
> > >   of the strange bits around them (like the unbound vmas used by
> > >   execbuf, or the new binding rules) really is needed.
> > > 
> > > Note that a lot of the post topic/full-ppgtt merge fallout has already
> > > been addressed, this entire list here of 10 issues really only contains
> > > the still outstanding issues.
> > > 
> > > Finally the 3.15 merge window is approaching and I think we need to
> > > use the remaining time to ensure that our fallback option of using
> > > aliasing ppgtt is in solid shape. Hence I think it's time to throw the
> > > switch. While at it demote the helper from static inline status
> > > because really.
> > > 
> > > Cc: Ben Widawsky <ben at bwidawsk.net>
> > > Cc: Dave Airlie <airlied at gmail.com>
> > > Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> > 
> > [snip]
> > 
> > I want a concise list in the commit message so it's obvious as we fix
> > things if we've achieved the goal or not. If you want to have nice prose
> > describing the reason and/or your feelings, that's fine, but please put
> > it after the concise list.
> > 
> > I'll start what I want, and please fill in as needed. I believe this is
> > all 10 you mentioned.
> > * Fence handling broken: BUG #
> 
> We have patches from me, and Paulo is signed up to do the review+igt
> testcase on our review board.
> 
> > * IOMMU Broken: BUG #
> 
> No bug report thus far. I can create one if people want, but that's more
> work than firing up my damn ivb, enabling dmar again and fixing it. Imo
> the short description above should be good enough to understand the bug.
> 
> > * "Reset issue": Bug #
> 
> https://bugs.freedesktop.org/show_bug.cgi?id=74100
> 
> Mika is working on this afaik.
> 
> > * Secure dispatch: Failing testcase: 
> 
> There's a Jira with full details: VIZ-3490
> 
> > * Bug: https://bugs.freedesktop.org/show_bug.cgi?id=73383
> 
> There's also been reports from Ville and iirc Chris also had pending
> issues, at least compared to dinq. Note that plain igt doesn't seem to be
> sufficient to provoke all these cases :(
> 
> > * Documentation
> 
> The above description is imo sufficient as a task description. There's
> also jira for this with more details: VIZ-3468
> 
> > Then there is fuzzy stuff that you "want" which need more clarification
> > on exactly what will satisfy you.
> > * Lifetime rules: No clear requirement from you.
> 
> Whomever tracks down the recursion bugs will likely have to sort his out.
> Essentially I want ppgtt_release gone, that entire function is just a
> giant hack.
> 
> > * Context/ring init differences: What do you want?
> 
> http://lists.freedesktop.org/archives/intel-gfx/2013-November/036482.html
> 
> I really want this to be address before we start wreaking more havoc in
> this area, i.e. the patch series currently pending internally.
> 
> > * Aliasing PPGTT real address treatment: What do you want?
> 
> i915_gem_context_free in i195_gem_context.c has a comment:
> 
> 	/* We refcount even the aliasing PPGTT to keep the code symmetric */
> 
> That comment is a lie and _not_ refcounting the aliasing ppgtt would allow
> us to ditch a bunch of hard-to-understand (at least for me) cornercase in
> the lifetime rules. At least that was the case when I've reviewed the
> ppgtt branch 3 months ago. Given that we have issues around the lifetimes
> of these suckers it can't help to have as much clarity as possible.
> 
> One really important thing you've dropped is fixing up the ppgtt binding
> rules for aliasing ppgtt. That's a regression compared to 3.14 and will
> render the cmd parser void with aliasing ppgtt. We need this asap both in
> time for the 3.15 merge and to finally unblock Brad Volkin's command
> parser.
> 
> I've signed myself up for this and started working on it.

One more taks is required here which seems to have completely fallen
through the cracks:

- Rebase the drm_mm top-to-bottom allocator patches onto latest drm-next
  and resubmit the i915 parts. I didn't merge this back in Dec due to
  conflicts with other ongoing drm_mm work, but that has all settled now.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch



More information about the Intel-gfx mailing list