[Intel-gfx] Command parser breaks the 3D driver.

Daniel Vetter daniel at ffwll.ch
Wed Mar 26 07:13:37 CET 2014

On Wed, Mar 26, 2014 at 6:12 AM, Kenneth Graunke <kenneth at whitecape.org> wrote:
> The version of the command parser which landed in drm-intel-nightly (and
> is now enabled by default) completely breaks the 3D driver.  Running any
> program - glxgears, KDE, GNOME, whatever - results in:
>     intel_do_flush_locked failed: Invalid argument
> and then Mesa aborts the program.
>
> When Mesa initializes, it tries to submit several small batches to see
> if it can write various registers.  For example:
>     MI_LOAD_REGISTER_IMM | (3 - 2)
>         OACONTROL
>         0x31337000 (expected value)
>     <various pipe controls>
>     MI_STORE_REGISTER_MEM | (3 - 2)
>         OACONTROL
>         <address>
>     MI_LOAD_REGISTER_IMM | (3 - 2)
>         OACONTROL
>         0
>
> We then map the buffer to see what the value is.  If it's our expected
> value, we know we can write that register, and enable features.  If not,
> we disable the functionality and never write that register again.
>
> This works because the hardware validator implicitly converts privileged
> commands (like MI_LOAD_REGISTER_IMM) to MI_NOOP, but otherwise accepts
> and processes the batch.  This is well-documented behavior, and we've
> been relying on it since May 2013.
>
> In contrast, the software validator returns -EINVAL and skips executing
> the batch.  It rejects this particular batch since OACONTROL is not in
> the kernel's register whitelist.
>
> I'm not sure I'm quite comfortable with the software validator
> implementing different behavior than the hardware validator.  Then
> again, it's probably better behavior...
>
> Also, I'm surprised to see that the software validator is always enabled
> on Haswell.  The hardware validator actually works on Haswell, and the
> majority of our batches don't need to run privileged commands, so it
> seems like we're just burning CPU pointlessly.  I thought the plan was
> to have userspace add an execbuf flag to explicitly request software
> scanning when it emits privileged commands, and (on Haswell) use the
> hardware scanner normally.

The software validator is currently always on, and returns a hard
-EINVAL, precisely so that we can catch bugs like this one ;-) I'd say
everything is working as planned. Once we ship this we can reconsider
that choice, but if the performance overhead is fairly benign I
wouldn't mind running it all the time.
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
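
To make the behavioural difference discussed in this thread concrete, here
is a minimal toy model in Python. It is only a sketch under stated
assumptions, not the kernel's or Mesa's actual code: the function names,
the "READBACK" step (standing in for whatever stores the register value
into the buffer), and the `allowed` set are all hypothetical. It shows why
a write-then-read-back probe quietly reports "not writable" under the
hardware validator, but fails outright when the software validator rejects
the batch.

```python
import errno

# Hypothetical command tags for this toy model only.
LOAD_REGISTER_IMM = "MI_LOAD_REGISTER_IMM"
READBACK = "READBACK"  # assumed stand-in for storing the register to memory
NOOP = "MI_NOOP"

EXPECTED = 0x31337000  # probe value quoted in the thread


def probe_batch(reg):
    """Build the probe: write a magic value, read it back, reset to 0."""
    return [
        (LOAD_REGISTER_IMM, reg, EXPECTED),
        (READBACK, reg, "result"),
        (LOAD_REGISTER_IMM, reg, 0),
    ]


def execute(batch, regs, mem):
    """Run a batch against toy register and memory dictionaries."""
    for op, reg, arg in batch:
        if op == LOAD_REGISTER_IMM:
            regs[reg] = arg
        elif op == READBACK:
            mem[arg] = regs.get(reg, 0)
        # NOOP: nothing to do


def hardware_validator_submit(batch, allowed, regs, mem):
    """Toy hardware behaviour: disallowed register writes become
    MI_NOOP, and the rest of the batch still executes."""
    filtered = [
        cmd if cmd[0] != LOAD_REGISTER_IMM or cmd[1] in allowed
        else (NOOP, None, None)
        for cmd in batch
    ]
    execute(filtered, regs, mem)
    return 0


def software_validator_submit(batch, allowed, regs, mem):
    """Toy software behaviour: one disallowed register write rejects
    the whole batch with -EINVAL, and nothing executes."""
    for op, reg, _ in batch:
        if op == LOAD_REGISTER_IMM and reg not in allowed:
            return -errno.EINVAL
    execute(batch, regs, mem)
    return 0


def can_write_register(submit, reg, allowed):
    """Return True if the probe value survives the round trip.
    Raises OSError if the submission itself is rejected."""
    regs, mem = {}, {"result": 0}
    ret = submit(probe_batch(reg), allowed, regs, mem)
    if ret != 0:
        raise OSError(-ret, "batch submission failed")
    return mem["result"] == EXPECTED
```

With an empty `allowed` set, `can_write_register(hardware_validator_submit,
"OACONTROL", set())` returns False and the caller can simply disable the
feature, while the same probe through `software_validator_submit` raises
an OSError carrying EINVAL, which corresponds to the failure mode reported
above.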
