[Intel-gfx] [PATCH 57/66] drm/i915: Disallow pin with full ppgtt

Daniel Vetter daniel at ffwll.ch
Sun Jun 30 13:06:47 CEST 2013


On Sat, Jun 29, 2013 at 11:56:53PM -0700, Ben Widawsky wrote:
> On Sat, Jun 29, 2013 at 04:34:07PM +0200, Daniel Vetter wrote:
> > On Sat, Jun 29, 2013 at 8:44 AM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> > > On Fri, Jun 28, 2013 at 10:43:30PM -0700, Ben Widawsky wrote:
> > >> On Fri, Jun 28, 2013 at 09:55:27AM +0100, Chris Wilson wrote:
> > >> > On Thu, Jun 27, 2013 at 04:30:58PM -0700, Ben Widawsky wrote:
> > >> > > Pin doesn't fit with PPGTT since the interface doesn't allow for the
> > >> > > context for which we want to pin.
> > >> >
> > >> > Nak. Pin still retains its semantics with the gtt and only applies to the
> > >> > gtt.
> > >>
> > >>
> > >> Here is the error I have on pin. I was trying to debug it previously but
> > >> got sidetracked. I thought some combo of EXEC_GTT flag and hacks would
> > >> make it work, but I never finished. Maybe you know offhand what I've
> > >> messed up, and the right way to fix it?
> > >>
> > >> gem_pin: gem_pin.c:84: exec: Assertion `gem_exec[0].offset == offset' failed.
> > >
> > > Ok, that is a condition that no longer holds with full ppgtt. Now
> > > fortunately, userspace that might depend upon that is limited to DRI1
> > > era machines (at least in the userspace I know about) and we can just
> > > update the test to understand that pinning and exec are two different
> > > address spaces.
> > >
> > > How do you handle EXEC_OBJECT_NEEDS_GTT? As that may be an acceptable
> > > w/a. Or just skip that portion of the test if PRAM_HAS_FULL_PPGTT. Soft
> > > pinning should be tested separately (so that it isn't confused with
> > > pinning to the ggtt), but that is also a viable solution to this portion
> > > of the test.
> > 
> > NEEDS_GTT is only valid where we alias the global GTT and PPGTT (i.e.
> > snb). So I think we should reject it on other platforms (or silently
> > ignore it if userspace uses it already).
> 
> What do you recommend as the resolution to the failing gem_pin then?

IIRC that was a test I've asked for since Chris wanted to use pin/unpin to
work around the i830/i845 cs tlb bug in a better way, just to exercise the
basics. I think the right option would be to reject pin on gen6+ since
those platforms are kms-only and not one of the few kms platforms where we
(ab)use pinning. Also all the earlier platforms don't have any (useful) hw
ppgtt implementation. We don't need pinning on gen4/5 iirc, but still
allowing it there increases test coverage with igt/gem_pin - now that we
have a test we better make use of it.

> > Now since we've had a few funny bugs in this area already which proved
> > to be rather hard to track down I think it's time to implement the
> > relevant igt tests. The (currently only internal) i-g-t wiki has some
> > information about what exactly blows up and what I think should be
> > tested. I think it'd be a good requirement to block the real ppgtt
> > enabling (not all the vma prep patches) until that test is ready.
> > 
> 
> I'll take a look on Monday and try to start on it. I think if the
> test case is reasonable, then blocking the merge is fair, but I have to
> see what's in store before I decide whether or not to argue.
> 
> > On that topic: Do we have other gaps in our testing, or is the current
> > igt coverage sufficient for this massive refactoring? Ben, has
> > anything blown up while you've developed these patches which was not
> > caught by i-g-t?
> 
> Other than the one bug I haven't yet tracked down which I mentioned in
> 0000-cover-letter (I've never hit it in many IGT runs), I had one
> reproducible bug which was really hard to resolve. It passed all of the
> IGT tests, and caused some weird display corruption in UXA (it was the
> screenshot I posted, if you recall). It ran fine in SNA for a while, and
> then would hang. That wasn't even full PPGTT, it was just after the
> refactor. The root cause was some screwed up unbind logic where I ended
> up with a bogus unbind offset. In retrospect, I'm not really sure why
> the issue wasn't either more severe, or less, and also why IGT didn't hit
> it. It was the kind of bug which I don't feel is worth testing.

Hm, I'm intrigued about this one, but also confused. Can you please
elaborate on what exactly blows up? "unbind offset" is a new concept in my
ears ...

> Since address spaces are per fd, any test which opens multiple fds,
> and executes testable batches is a good test. I'm not really sure how
> many of them we have offhand, but we have at least a few. I think piglit
> on a composited desktop is one of the best focus tests we can run.

Yeah, switching between contexts and ppgtt is probably best done with real
workloads. Worth checking though might be a very basic test with
- 2 fds (so 2 address spaces)
- a bunch of equally-sized objects
- using them in inverse order on the two fds in the first batch (or any
  other trick to make sure that the two address spaces have completely
  different bindings)
- blitting a bit of data around, alternating between the two fds
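
To illustrate only the "inverse order" trick from the list above, here is a
toy model, not i915 or libdrm code. It assumes each address space hands out
offsets first-come-first-served like a bump allocator, which the real
allocator need not do. All names (toy_address_space, toy_bind, ...) are
hypothetical and exist only for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_OBJECTS 4      /* even on purpose: with an odd count the middle
                            * object gets the same offset in both spaces */
#define OBJECT_SIZE 4096u  /* all objects are equally sized */

/* Hypothetical stand-in for one per-fd address space (assumption: a
 * simple bump allocator, not the real drm_mm behaviour). */
struct toy_address_space {
	uint64_t next_free;
	uint64_t offset[NUM_OBJECTS];
};

/* Place object 'obj' at the next free offset of 'vm'. */
static void toy_bind(struct toy_address_space *vm, int obj)
{
	vm->offset[obj] = vm->next_free;
	vm->next_free += OBJECT_SIZE;
}

/* Use the objects in ascending order on vm_a and descending on vm_b. */
static void toy_bind_inverse(struct toy_address_space *vm_a,
			     struct toy_address_space *vm_b)
{
	for (int i = 0; i < NUM_OBJECTS; i++) {
		toy_bind(vm_a, i);
		toy_bind(vm_b, NUM_OBJECTS - 1 - i);
	}
}

/* Count objects that ended up at the same offset in both spaces. */
static int toy_count_shared_offsets(const struct toy_address_space *vm_a,
				    const struct toy_address_space *vm_b)
{
	int shared = 0;

	for (int i = 0; i < NUM_OBJECTS; i++)
		if (vm_a->offset[i] == vm_b->offset[i])
			shared++;
	return shared;
}
```

Under these assumptions, a zero result from toy_count_shared_offsets() means
no object sits at the same offset in both spaces. A real test would have to
check the offsets the kernel reports back rather than rely on this model.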

Just to have a basic check for ppgtt switching. This test can probably
be derived quickly from one of the existing "copy stuff around" tests.
drmtest helpers should also be useful. Even libdrm should keep on working
if we set up two libdrm bufmgr instances.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch