[Mesa-dev] [PATCH] i965: Unroll SIMD16 DDY_FINE on Sandybridge.
Matt Turner
mattst88 at gmail.com
Sun Apr 24 19:55:24 UTC 2016
On Tue, Mar 29, 2016 at 5:00 PM, Matt Turner <mattst88 at gmail.com> wrote:
> On Tue, Mar 29, 2016 at 1:32 AM, Kenneth Graunke <kenneth at whitecape.org> wrote:
>> I'm not sure why this is necessary, but it fixes 10 dEQP-GLES3 subtests
>> from dEQP-GLES3.functional.shaders.derivate.dfdy.texture.float_nicest.*.
>
> Okay, so we have to suspect that this is working around some related
> problem, and that it isn't simply that SIMD16 align16 instructions
> don't work on SNB. If that were the case, we would have seen at least
> some of the piglit dfdy tests failing.
>
> If I had to guess, it could be that we're not accounting for
> even-register alignment, which I think applies to Sandybridge.
>
> I do notice that the add(16) instruction generated by the deqp tests
> you cite writes to an odd register, and the piglit tests write to an
> even register.
>
> If that is indeed the problem, it should apply to Gen5 (where we
> currently emit a SIMD16 align16 instruction) as well.
>
> Maybe we can hack up the deqp test to run on Gen5? If we could show
> that this code can't work in all circumstances on Gen5, we can
> simplify that condition greatly.

I tested on Sandybridge, and the tests pass sporadically. Using
INTEL_DEBUG=no16 allows them to pass always, which lends credence to
my hypothesis.

The simulator does not complain about sourcing an odd-aligned register
in an align16 add(16) instruction.

I also hacked up i965 to expose GLES3 on G45 and ran the test there.
The vec{2,3,4}_highp tests fail regardless of INTEL_DEBUG=no16.
Strangely, all of the other tests pass, including the float_highp case.
The assembly generated for the passing mediump and failing highp cases
is the same. No idea what's going on.

Curro has my Ironlake; otherwise I'd test there. I suppose it's not a
problem there because the register allocator even-aligns all registers
used in SIMD16 operations.

Short of adding a register class on Sandybridge just for this case, I
think your patch is the right thing to do. With a comment added to the
block above stating that empirically Sandybridge cannot access
odd-aligned registers in compressed align16 instructions, this patch
gets my
Reviewed-by: Matt Turner <mattst88 at gmail.com>
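
For illustration only, here is a rough sketch of the kind of check and comment I have in mind. It is not the actual i965 code: the struct, its fields, and the function name are all made up for this example, and it only models the decision to split a SIMD16 align16 DDY_FINE into two SIMD8 halves on Gen6.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified instruction description (not Mesa's types). */
struct sketch_inst {
   int gen;        /* hardware generation; 6 is Sandybridge */
   int exec_size;  /* number of channels: 8 or 16 */
   bool align16;   /* true if the instruction uses align16 access mode */
};

/* Returns how many SIMD8-wide instructions to emit for a fine DDY:
 * 2 if the operation must be unrolled, 1 otherwise.
 */
static int
ddy_fine_num_insts(const struct sketch_inst *inst)
{
   assert(inst->exec_size == 8 || inst->exec_size == 16);

   /* Empirically, Sandybridge cannot access odd-aligned registers in
    * compressed (SIMD16) align16 instructions, so unroll into two
    * SIMD8 halves there.
    */
   if (inst->gen == 6 && inst->exec_size == 16 && inst->align16)
      return 2;

   return 1;
}
```

A real fix would live wherever the generator already decides whether to unroll, and the condition may well need to cover other generations too; the sketch only shows where the comment belongs.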