[Bug 89058] [SKL]Render error in some games (etqw-demo, nexuiz, portal)

Wed Mar 25 08:38:32 PDT 2015

https://bugs.freedesktop.org/show_bug.cgi?id=89058

--- Comment #27 from Neil Roberts <neil at linux.intel.com> ---
I think I have a better understanding of what's going on. The original ARBvp
program has three constant array loads with a non-constant index like this:

MOV        _R0, _joints[_A0.x+0];
MOV        _R1, _joints[_A0.x+1];
MOV        _R2, _joints[_A0.x+2];

These get converted to VS_OPCODE_PULL_CONSTANT_LOAD_GEN7 instructions, something
like the ones below. v17, v20 and v23 are 2-register-wide virtual registers: the
first register of each is reserved for the message header and the second is
loaded with the index by some prior instructions.

VS_OPCODE_PULL_CONSTANT_LOAD_GEN7 v18, v17
VS_OPCODE_PULL_CONSTANT_LOAD_GEN7 v21, v20
VS_OPCODE_PULL_CONSTANT_LOAD_GEN7 v24, v23

The register allocator allocates the virtual registers as below. It reuses the
source register from the second load as the destination in the third. It also
uses g11 for both the source and the dest in the first load. Neither of these
should cause a problem.

VS_OPCODE_PULL_CONSTANT_LOAD_GEN7 g11, g11
VS_OPCODE_PULL_CONSTANT_LOAD_GEN7 g12, g13
VS_OPCODE_PULL_CONSTANT_LOAD_GEN7 g13, g15

The instruction scheduler now kicks in and reorders the last two instructions.
As far as it is concerned this is safe: the initialisation code before the
loads only writes to the top halves of the source register pairs (g12, g14 and
g16) and not to the lower halves, so writing to those destination registers
doesn't look like it causes a collision. The problem is that the generator
actually sneaks in a write to the lower half of each source register pair in
order to set up the message header. So the code now looks like this:

mov(4)          g11<1>UD        g0<4,4,1>UD ; set up the message header
send(8)         g11<1>F         g11<4,4,1>.xD
                            sampler (0, 0, 7, 0) mlen 2 rlen 1
mov(4)          g15<1>UD        g0<4,4,1>UD ; set up the message header
send(8)         g13<1>F         g15<4,4,1>.xD
                            sampler (0, 0, 7, 0) mlen 2 rlen 1
mov(4)          g13<1>UD        g0<4,4,1>UD ; set up the message header
send(8)         g12<1>F         g13<4,4,1>.xD
                            sampler (0, 0, 7, 0) mlen 2 rlen 1

This is a problem because the third move instruction is actually overwriting
the results from the second send instruction. The scheduler had no way of
knowing this was going to happen because there was no dependency set up to let
it know that the PULL_CONSTANT_LOAD instruction writes to one of its sources.
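
As a rough illustration, the missing dependency can be modelled in a few lines
of Python. This is only a toy sketch, not Mesa code: the Inst tuple, the
conflicts() helper and the hidden_writes field are hypothetical names, and it
assumes (going by the description above) that the scheduler only treats the
initialised upper half of each payload pair as a source.

```python
# Toy model of the hazard; every name here is hypothetical, not Mesa's.
from collections import namedtuple

# dst: register the scheduler knows is written
# srcs: registers the scheduler knows are read (assumed: upper half only)
# hidden_writes: registers the generator writes behind the scheduler's back
Inst = namedtuple("Inst", "name dst srcs hidden_writes")

def conflicts(a, b, include_hidden):
    """Return True if a and b must keep their order (RAW, WAR or WAW)."""
    a_writes = {a.dst} | (set(a.hidden_writes) if include_hidden else set())
    b_writes = {b.dst} | (set(b.hidden_writes) if include_hidden else set())
    a_reads, b_reads = set(a.srcs), set(b.srcs)
    return bool(a_writes & b_reads or b_writes & a_reads or a_writes & b_writes)

# Second and third loads, with registers as assigned by the allocator.
load2 = Inst("load g12 <- g13..g14", "g12", ("g14",), ("g13",))
load3 = Inst("load g13 <- g15..g16", "g13", ("g16",), ("g15",))

# What the scheduler sees: no shared registers, so reordering looks safe.
print(conflicts(load2, load3, include_hidden=False))  # False

# What really happens: load2's header MOV writes g13, load3's destination.
print(conflicts(load2, load3, include_hidden=True))   # True
```

With the hidden header write taken into account, the two loads conflict on g13,
which is exactly the register that gets clobbered in the listing above.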

I think this might be a general problem with the way we handle texture
sampling. It would probably also affect normal texture sampling with a header,
such as texelOffset in a fragment shader, and it may just be a coincidence that
it is only hit in these circumstances. However, this is only a hunch because I
still don't really understand the register allocator and the scheduler very
well.

Maybe a good solution would be to add the MOV for the message header outside of
the generator so that the dependencies would be tracked correctly. This might
also allow some better optimisations to take place.
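
A small sketch of why this would help, again as a toy Python model rather than
anything from Mesa. The Inst tuple and must_stay_ordered() are hypothetical
names, and the registers are simply the ones from the listing above. In
practice the register allocator would also see the MOVs and might well pick
different registers, so this only shows that the scheduler would then have the
information it needs.

```python
# Toy model of the proposed fix; all names are hypothetical, not Mesa's.
from collections import namedtuple

Inst = namedtuple("Inst", "name writes reads")

def must_stay_ordered(first, second):
    """True if swapping the two instructions could change the result."""
    raw = set(first.writes) & set(second.reads)   # second reads first's output
    war = set(first.reads) & set(second.writes)   # second overwrites first's input
    waw = set(first.writes) & set(second.writes)  # both write the same register
    return bool(raw or war or waw)

# Header setup emitted as ordinary MOVs that the scheduler can see, so the
# loads now list the full payload pair as sources.
mov2 = Inst("mov header -> g13", writes=("g13",), reads=("g0",))
load2 = Inst("load g12 <- g13..g14", writes=("g12",), reads=("g13", "g14"))
mov3 = Inst("mov header -> g15", writes=("g15",), reads=("g0",))
load3 = Inst("load g13 <- g15..g16", writes=("g13",), reads=("g15", "g16"))

# load3 writes g13, which mov2 also writes and load2 reads, so load3 can no
# longer be hoisted above either of them.
print(must_stay_ordered(mov2, load3))   # True (WAW on g13)
print(must_stay_ordered(load2, load3))  # True (WAR on g13)

# The two header MOVs themselves are independent and can still be reordered.
print(must_stay_ordered(mov2, mov3))    # False
```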
