[Nouveau] Tessellation shaders get MEM_OUT_OF_BOUNDS errors / missing triangles

Ilia Mirkin imirkin at alum.mit.edu
Mon May 18 13:48:48 PDT 2015


Hello,

I've been debugging a few different tessellation shader issues with
nouveau, but let's start small. I see this issue on my GK208 with high
frequency, and I *think* I've seen it once or twice on my GF108, but
it's exceedingly rare, if it does happen. I don't have a GK10x to test
on, unfortunately, but I assume it'll have the same issue as the
GK208.

The issue is this -- a bunch of triangles that should come out of the
tessellator end up black. I also see a GPC0/TPC1/MP trap:
MEM_OUT_OF_BOUNDS error produced by nouveau -- this is output in
response to an interrupt and MP trap generated by the hardware, read
out with nv_rd32(priv, TPC_UNIT(gpc, tpc, 0x648)); (see
gf100_gr_trap_mp). I assume some of the tessellation evaluation
invocations get killed, but I have no proof of this.

I also see this: TRAP ch 5 [0x003facf000 shader_runner[19044]]

I would imagine that's some floating point number ending up in the
register instead of an address, but the value of 0x3facf000 read as
fp32 (1.35107421875) does not seem familiar.
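
For anyone who wants to double-check the fp32 reading, here is a minimal
sketch of the reinterpretation in Python. The helper name u32_as_fp32 is
made up for illustration; it only assumes the low 32 bits of the value
in the TRAP line are the interesting part.

```python
import struct

def u32_as_fp32(bits: int) -> float:
    """Reinterpret the low 32 bits of an integer as an IEEE 754 single."""
    # Pack as a little-endian 32-bit unsigned int, then unpack the same
    # four bytes as a little-endian 32-bit float.
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]

# Value taken from the TRAP line above; the mask drops the leading zeros.
trap_value = 0x003facf000
print(u32_as_fp32(trap_value))  # prints 1.35107421875
```

The same helper can be pointed at any other register value that looks
like it might be a float rather than an address.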

Even when all the triangles show up, I still see the error on the
GK208, so I'm not sure if they're the same issue or not.

Now, here's the fun part -- this is completely non-deterministic.
Sometimes everything shows up on the GK208, other times I see holes,
in varying locations. I'm fairly sure that the actual shader code is
correct... so I'm doing something funny wrong. (And yeah, tons of
missed optimization opportunities in this code, but let's not dwell on
that.)

This is the piglit test:

http://cgit.freedesktop.org/piglit/tree/tests/spec/arb_tessellation_shader/execution/quads.shader_test

It should be noted that other piglit tests don't exhibit this error;
however, they also tend to be simpler. One key difference is that they
don't change the patch size in TCS. I'm including a link to a text
file with the tessellation control and evaluation shaders (decoded
with nvdisasm, which you're hopefully more familiar with), along with
the shader headers that we generate.

FTR, this is how I feed the raw shader opcode bytes into nvdisasm:

perl -ane 'foreach (@F) { print pack "I", hex($_) }' > tt; nvdisasm -b SM35 tt

(for some reason it doesn't want to read from a pipe or even a fd).
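
In case the perl is hard to read, the conversion step can be sketched in
Python as below. This is only an approximation of the one-liner: it
assumes the input is whitespace-separated 32-bit hex words, and the
function name hex_words_to_bytes and the sample words are made up for
illustration (they are not real shader opcodes).

```python
import struct

def hex_words_to_bytes(text: str) -> bytes:
    """Turn whitespace-separated hex words into packed 32-bit integers."""
    out = bytearray()
    for token in text.split():
        # "=I" is a 32-bit unsigned int in native byte order, which is
        # what perl's pack "I" produces on the usual platforms.
        out += struct.pack("=I", int(token, 16))
    return bytes(out)

# Arbitrary sample words, with and without the 0x prefix.
sample = "0x00000001 deadbeef"
print(hex_words_to_bytes(sample).hex())
```

Writing the returned bytes to a file opened in "wb" mode gives something
that can be handed to nvdisasm -b the same way as the tt file above.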

http://people.freedesktop.org/~imirkin/tess_shaders_quads.txt

My suspicion is that we're doing something wrong with the sched codes.
We have an elaborate calculator, but... perhaps not elaborate enough?
You can see it here:

http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp#n2574

The reason I think it's an error in sched codes is the TRAP memory
location that I see -- it could well be some "stale" value left in the
register because the value from S2R or VILD doesn't make it in there
in time before the ALD reads it.

If you should like to try this yourself, you can use
https://github.com/imirkin/mesa/commits/gl4-integration-2 . This
branch is good enough to run Unigine Heaven, but still has a lot of
known shortcomings. (Both at the core and the nouveau levels.)

Any advice or suggestions for debugging this would be greatly
appreciated. And let me know if you'd like me to generate additional
info on this. For example, I can supply a full command trace that can
be piped to demmt, if that's helpful.

Thanks in advance,

  -ilia

