[Intel-gfx] [PATCH] drm/i915: Disable full ppgtt by default
Daniel Vetter
daniel at ffwll.ch
Tue Mar 18 11:19:25 CET 2014
On Thu, Mar 06, 2014 at 09:30:01PM +0100, Daniel Vetter wrote:
> On Thu, Mar 06, 2014 at 10:17:12AM -0800, Ben Widawsky wrote:
> > On Thu, Mar 06, 2014 at 12:14:21PM +0100, Daniel Vetter wrote:
> > > There are too many oustanding issues:
> > >
> > > - Fence handling in the current code is broken. There's a patch series
> > > from me, but it's blocked on and extended review (which includes
> > > writing the testcases).
> > >
> > > - IOMMU mapping handling is broken, we need to properly refcount it -
> > > currently it gets destroyed when the first vma is unbound, so way
> > > too early.
> > >
> > > - There's a pending reset issue on snb. Since Mika's reset work and
> > > full ppgtt have been pulled in in separate branches and ended up
> > > intermittingly breaking each another it's unclear who's the exact
> > > culprit here.
> > >
> > > - We still have persistent evidince of crazy recursion bugs through
> > > vma_unbind and ppgtt_relase, e.g.
> > >
> > > https://bugs.freedesktop.org/show_bug.cgi?id=73383
> > >
> > > This issue (and a few others meanwhile resolved) have blocked our
> > > performance measuring/tuning group since 3 months.
> > >
> > > - Secure batch dispatching is broken. This is blocking Brad Volkin's
> > > command checker work since 3 months.
> > >
> > > All these issues are confirmed to only happen when full ppgtt is
> > > enabled, falling back to aliasing ppgtt resolves them. But even
> > > aliasing ppgtt itself still has a regression:
> > >
> > > - We currently unconditionally bind objects into the aliasing ppgtt,
> > > which means all priviledged objects like ringbuffers are visible to
> > > unpriviledged access again. On top of that this also breaks the
> > > command checker for aliasing ppgtt, since it can't hide the
> > > validated batch any more.
> > >
> > > Furthermore topic/full-ppgtt has never been reviewed:
> > >
> > > - Lifetime rules around vma unbinding/release are unclear, resulting
> > > into this awesome hack called ppgtt_release. Which seems to take the
> > > blame for most of the recursion fallout.
> > >
> > > - Context/ring init works different on gpu reset than anywhere else.
> > > Such differeneces have in the past always lead to really hard to
> > > track down bugs.
> > >
> > > - Aliasing ppgtt is treated in a bunch of places as a real address
> > > space, but it isn't - the real address space is always the global
> > > gtt in that case. This results in a bit a mess between contexts and
> > > ppgtt object, further complication the context/ppgtt/vma lifetime
> > > rules.
> > >
> > > - We don't have any docs describing the overall concepts introduced
> > > with full ppgtt. A short, concise overview describing vmas and some
> > > of the strange bits around them (like the unbound vmas used by
> > > execbuf, or the new binding rules) really is needed.
> > >
> > > Note that a lot of the post topic/full-ppgtt merge fallout has already
> > > been addressed, this entire list here of 10 issues really only contains
> > > the still outstanding issues.
> > >
> > > Finally the 3.15 merge window is approaching and I think we need to
> > > use the remaining time to ensure that our fallback option of using
> > > aliasing ppgtt is in solid shape. Hence I think it's time to throw the
> > > switch. While at it demote the helper from static inline status
> > > because really.
> > >
> > > Cc: Ben Widawsky <ben at bwidawsk.net>
> > > Cc: Dave Airlie <airlied at gmail.com>
> > > Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> >
> > [snip]
> >
> > I want a concise list in the commit message so it's obvious as we fix
> > things if we've achieved the goal or not. If you want to have nice prose
> > describing the reason and/or your feelings, that's fine, but please put
> > it after the concise list.
> >
> > I'll start what I want, and please fill in as needed. I believe this is
> > all 10 you mentioned.
> > * Fence handling broken: BUG #
>
> We have patches from me, and Paulo is signed up to do the review+igt
> testcase on our review board.
>
> > * IOMMU Broken: BUG #
>
> No bug report thus far. I can create one if people want, but that's more
> work than firing up my damn ivb, enabling dmar again and fixing it. Imo
> the short description above should be good enough to understand the bug.
>
> > * "Reset issue": Bug #
>
> https://bugs.freedesktop.org/show_bug.cgi?id=74100
>
> Mika is working on this afaik.
>
> > * Secure dispatch: Failing testcase:
>
> There's a Jira with full details: VIZ-3490
>
> > * Bug: https://bugs.freedesktop.org/show_bug.cgi?id=73383
>
> There's also been reports from Ville and iirc Chris also had pending
> issues, at least compared to dinq. Note that plain igt doesn't seem to be
> sufficient to provoke all these cases :(
>
> > * Documentation
>
> The above description is imo sufficient as a task description. There's
> also jira for this with more details: VIZ-3468
>
> > Then there is fuzzy stuff that you "want" which need more clarification
> > on exactly what will satisfy you.
> > * Lifetime rules: No clear requirement from you.
>
> Whomever tracks down the recursion bugs will likely have to sort his out.
> Essentially I want ppgtt_release gone, that entire function is just a
> giant hack.
>
> > * Context/ring init differences: What do you want?
>
> http://lists.freedesktop.org/archives/intel-gfx/2013-November/036482.html
>
> I really want this to be address before we start wreaking more havoc in
> this area, i.e. the patch series currently pending internally.
>
> > * Aliasing PPGTT real address treatment: What do you want?
>
> i915_gem_context_free in i195_gem_context.c has a comment:
>
> /* We refcount even the aliasing PPGTT to keep the code symmetric */
>
> That comment is a lie and _not_ refcounting the aliasing ppgtt would allow
> us to ditch a bunch of hard-to-understand (at least for me) cornercase in
> the lifetime rules. At least that was the case when I've reviewed the
> ppgtt branch 3 months ago. Given that we have issues around the lifetimes
> of these suckers it can't help to have as much clarity as possible.
>
> One really important thing you've dropped is fixing up the ppgtt binding
> rules for aliasing ppgtt. That's a regression compared to 3.14 and will
> render the cmd parser void with aliasing ppgtt. We need this asap both in
> time for the 3.15 merge and to finally unblock Brad Volkin's command
> parser.
>
> I've signed myself up for this and started working on it.
One more taks is required here which seems to have completely fallen
through the cracks:
- Rebase the drm_mm top-to-bottom allocator patches onto latest drm-next
and resubmit the i915 parts. I didn't merge this back in Dec due to
conflicts with other ongoing drm_mm work, but that has all settled now.
Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
More information about the Intel-gfx
mailing list