[Intel-gfx] I've got the RC6 bug
eric at anholt.net
Wed Jan 18 09:51:30 PST 2012
On Wed, 18 Jan 2012 11:17:52 +0000, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> On Wed, 18 Jan 2012 01:24:26 +0100, Daniel Vetter <daniel at ffwll.ch> wrote:
> > On Wed, Jan 18, 2012 at 01:16:02AM +0100, CC wrote:
> > > I attached the error state.
> > Nice one, your gpu seems to have simply disappeared. And the ringbuffer
> > contains a rather peculiar cmd sequence. Putting Chris (maybe he
> > recognizes the pattern) and Ben (he's got a patch in the works to dump a
> > debug register that might be interesting here) on cc. It's too late atm
> > for me to think about this some more.
> Not simply disappeared, someone clobbered it with an extremely large
> hammer. The GPU was killed by a stray write to address 0 which took out
> the render ring buffer and its hws page. So my first thought is a
> missing relocation, and i965g springs to mind.
At one point there was a bug in Mesa that wrote to 0:

Author: Eric Anholt <eric at anholt.net>
Date:   Fri Jun 17 18:20:36 2011 -0700

    i965/gen6: Use an BO instead of writing to address 0 for PIPE_CONTROL W/A.

    This was spectacularly unsafe. On my system, address 0 happens to be
    the hardware status page for the render ring, and the first quadword
    of that happens to contain nothing we ever look at, but I sure didn't
    look forward to having to debug some day when, for example, the kernel
    happened to bind the ringbuffer before binding the hwsp.
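
As a rough illustration of how a missing relocation turns into a write to
address 0, here is a toy model. It is not Mesa or kernel code, and every
name in it (relocate, execute, run, scratch_bo, the offsets) is made up for
the sketch. The batch leaves its address slot as 0 until a relocation pass
patches in the bound offset of the target BO; if the relocation entry is
missing, the write lands on whatever is bound at 0, which in this model is
the hardware status page.

```python
# Toy model of a GPU address space; NOT actual i915 or Mesa code.
# All names and offsets below are hypothetical.
HWSP_OFFSET = 0        # assumption: status page bound at address 0, as in the commit message
SCRATCH_OFFSET = 4096  # assumption: offset where the scratch BO happens to be bound


def relocate(batch, relocs, offsets):
    """Patch each recorded batch slot with the bound offset of its target BO."""
    patched = list(batch)
    for slot, bo_name in relocs:
        patched[slot] = offsets[bo_name]
    return patched


def execute(batch, memory):
    """Interpret the batch as consecutive (write_address, value) pairs."""
    for i in range(0, len(batch), 2):
        memory[batch[i]] = batch[i + 1]


def run(relocs):
    """Build a one-command batch, apply the given relocations, and execute it."""
    memory = {HWSP_OFFSET: "hwsp", SCRATCH_OFFSET: "scratch"}
    batch = [0, "pipe_control_write"]  # address slot stays 0 until relocated
    execute(relocate(batch, relocs, {"scratch_bo": SCRATCH_OFFSET}), memory)
    return memory
```

Calling run([(0, "scratch_bo")]) sends the write to the scratch BO and leaves
the status page alone, while run([]) models the missing relocation and
overwrites the entry at offset 0.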