[Intel-gfx] [PATCH] drm/i915: Disable full ppgtt by default

Ben Widawsky ben at bwidawsk.net
Thu Mar 6 19:17:12 CET 2014


On Thu, Mar 06, 2014 at 12:14:21PM +0100, Daniel Vetter wrote:
> There are too many outstanding issues:
> 
> - Fence handling in the current code is broken. There's a patch series
>   from me, but it's blocked on an extended review (which includes
>   writing the testcases).
> 
> - IOMMU mapping handling is broken, we need to properly refcount it -
>   currently it gets destroyed when the first vma is unbound, so way
>   too early.
> 
> - There's a pending reset issue on snb. Since Mika's reset work and
>   full ppgtt have been pulled in through separate branches and ended up
>   intermittently breaking each other, it's unclear who's the exact
>   culprit here.
> 
> - We still have persistent evidence of crazy recursion bugs through
>   vma_unbind and ppgtt_release, e.g.
> 
>   https://bugs.freedesktop.org/show_bug.cgi?id=73383
> 
>   This issue (and a few others meanwhile resolved) have blocked our
>   performance measuring/tuning group for 3 months.
> 
> - Secure batch dispatching is broken. This is blocking Brad Volkin's
>   command checker work for 3 months.
> 
> All these issues are confirmed to only happen when full ppgtt is
> enabled, falling back to aliasing ppgtt resolves them. But even
> aliasing ppgtt itself still has a regression:
> 
> - We currently unconditionally bind objects into the aliasing ppgtt,
>   which means all privileged objects like ringbuffers are visible to
>   unprivileged access again. On top of that this also breaks the
>   command checker for aliasing ppgtt, since it can't hide the
>   validated batch any more.
> 
> Furthermore topic/full-ppgtt has never been reviewed:
> 
> - Lifetime rules around vma unbinding/release are unclear, resulting
>   in this awesome hack called ppgtt_release, which seems to take the
>   blame for most of the recursion fallout.
> 
> - Context/ring init works differently on gpu reset than anywhere else.
>   Such differences have in the past always led to really hard to
>   track down bugs.
> 
> - Aliasing ppgtt is treated in a bunch of places as a real address
>   space, but it isn't - the real address space is always the global
>   gtt in that case. This results in a bit of a mess between contexts and
>   ppgtt objects, further complicating the context/ppgtt/vma lifetime
>   rules.
> 
> - We don't have any docs describing the overall concepts introduced
>   with full ppgtt. A short, concise overview describing vmas and some
>   of the strange bits around them (like the unbound vmas used by
>   execbuf, or the new binding rules) really is needed.
> 
> Note that a lot of the post topic/full-ppgtt merge fallout has already
> been addressed, this entire list here of 10 issues really only contains
> the still outstanding issues.
> 
> Finally the 3.15 merge window is approaching and I think we need to
> use the remaining time to ensure that our fallback option of using
> aliasing ppgtt is in solid shape. Hence I think it's time to throw the
> switch. While at it demote the helper from static inline status
> because really.
> 
> Cc: Ben Widawsky <ben at bwidawsk.net>
> Cc: Dave Airlie <airlied at gmail.com>
> Signed-off-by: Daniel Vetter <daniel.vetter at ffwll.ch>

[snip]

I want a concise list in the commit message so it's obvious as we fix
things if we've achieved the goal or not. If you want to have nice prose
describing the reason and/or your feelings, that's fine, but please put
it after the concise list.

I'll start with what I want, and please fill in as needed. I believe this is
all 10 you mentioned.
* Fence handling broken: BUG #
* IOMMU Broken: BUG #
* "Reset issue": Bug #
* Secure dispatch: Failing testcase: 
* Bug: https://bugs.freedesktop.org/show_bug.cgi?id=73383
* Documentation

Then there is fuzzy stuff that you "want" which needs more clarification
on exactly what will satisfy you.
* Lifetime rules: No clear requirement from you.
* Context/ring init differences: What do you want?
* Aliasing PPGTT real address treatment: What do you want?

In my opinion, the last 3 are things you've imposed because of your
style as maintainer, whereas the first 7 are real issues that any sane
person would require before turning it on.

Anyway, if you make the concise list like I want, at the top of the
commit, and you fill in the missing details, this is:
Acked-by: Ben Widawsky <ben at bwidawsk.net>

-- 
Ben Widawsky, Intel Open Source Technology Center


