[Nouveau] Tessellation shaders get MEM_OUT_OF_BOUNDS errors / missing triangles

Ilia Mirkin imirkin at alum.mit.edu
Fri May 22 14:10:04 PDT 2015


On Mon, May 18, 2015 at 4:48 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> Hello,
>
> I've been debugging a few different tessellation shader issues with
> nouveau, but let's start small. I see this issue on my GK208 with high
> frequency, and I *think* I've seen it once or twice on my GF108, but
> it's exceedingly rare, if it does happen. I don't have a GK10x to test
> on, unfortunately, but I assume it'll have the same issue as the
> GK208.
>
> The issue is this -- a bunch of triangles that should come out of the
> tessellator end up black. I also see a GPC0/TPC1/MP trap:
> MEM_OUT_OF_BOUNDS error produced by nouveau -- this is output in
> response to an interrupt and MP trap generated by the hardware, read
> out with nv_rd32(priv, TPC_UNIT(gpc, tpc, 0x648)); (see
> gf100_gr_trap_mp). I assume some of the tessellation evaluation
> invocations get killed, but I have no proof of this.
>
> I also see this: TRAP ch 5 [0x003facf000 shader_runner[19044]]
>
> I would imagine that's some floating point number ending up in the
> register instead of an address, but the fp32 value of it
> (1.35107421875) does not seem familiar.
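
The fp32 reading above is easy to double-check. A minimal sketch in Python, where the helper name bits_to_fp32 is made up for illustration and only the low 32 bits of the printed value are reinterpreted:

```python
import struct

def bits_to_fp32(word: int) -> float:
    """Reinterpret the low 32 bits of an integer as an IEEE 754 binary32 value."""
    # Pack as a little-endian 32-bit unsigned int, then unpack the same
    # four bytes as a little-endian 32-bit float.
    return struct.unpack("<f", struct.pack("<I", word & 0xFFFFFFFF))[0]

print(bits_to_fp32(0x3FACF000))  # 1.35107421875
```

This only shows what the bit pattern would mean if it were a float; it says nothing about where the value actually came from.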

Ben pointed out that the 0x3facf000 is a channel address, not a value
from the shader. Oops. So that theory doesn't hold water at all.
Perhaps some buffer isn't big enough? This ends up using 9 output
vertices per patch, with 2 vec4's each. I've tried playing with the
per-warp stack size to no avail, though I didn't *entirely* know what
I was doing either.
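
For a rough sense of scale, here is a back-of-the-envelope sketch of the per-vertex output size for one patch. It assumes 32-bit float components (so a vec4 is 16 bytes) and counts per-vertex outputs only. The function name is hypothetical, and this does not model how the hardware actually sizes any of its buffers:

```python
VEC4_BYTES = 4 * 4  # assumption: four 32-bit float components per vec4

def patch_output_bytes(vertices_per_patch: int, vec4s_per_vertex: int) -> int:
    """Bytes of per-vertex output data for a single patch."""
    return vertices_per_patch * vec4s_per_vertex * VEC4_BYTES

print(patch_output_bytes(9, 2))  # 288
```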

>
> Even when all the triangles show up, I still see the error on the
> GK208, so I'm not sure if they're the same issue or not.
>
> Now, here's the fun part -- this is completely non-deterministic.
> Sometimes everything shows up on the GK208, other times I see holes,
> in varying locations. I'm fairly sure that the actual shader code is
> correct... so I must be doing something subtly wrong. (And yeah,
> tons of missed optimization opportunities in this code, but let's
> not dwell on that.)
>
> This is the piglit test:
>
> http://cgit.freedesktop.org/piglit/tree/tests/spec/arb_tessellation_shader/execution/quads.shader_test
>
> It should be noted that other piglit tests don't exhibit this error,
> however they also tend to be simpler. One key difference is that they
> don't change the patch size in TCS. I'm including a link to a text
> file with the tessellation control and evaluation shaders (decoded
> with nvdisasm which you're hopefully more familiar with), along with
> the shader headers that we generate.
>
> FTR, this is how I feed the raw shader opcode bytes into nvdisasm:
>
> perl -ane 'foreach (@F) { print pack "I", hex($_) }' > tt; nvdisasm -b SM35 tt
>
> (for some reason nvdisasm doesn't want to read from a pipe or even an fd).
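
For those who would rather not use perl, the packing step can be sketched in Python. The function name pack_hex_words is made up for illustration. Like pack "I" in the one-liner, it is assumed to emit each whitespace-separated hex word as a native-endian 32-bit unsigned integer:

```python
import struct

def pack_hex_words(text: str) -> bytes:
    """Pack whitespace-separated hex words as native-endian 32-bit unsigned ints."""
    # int(tok, 16) accepts words with or without a leading "0x".
    return b"".join(struct.pack("=I", int(tok, 16)) for tok in text.split())

data = pack_hex_words("0x00000001 deadbeef")
print(len(data))  # 8
```

Writing the returned bytes to a file such as tt and then running nvdisasm -b SM35 tt should match the command above.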
>
> http://people.freedesktop.org/~imirkin/tess_shaders_quads.txt
>
> My suspicion is that we're doing something wrong with the sched codes.
> We have an elaborate calculator, but... perhaps not elaborate enough?
> You can see it here:
>
> http://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp#n2574
>
> I suspect the sched codes because of the TRAP memory location that I
> see -- it could well be some "stale" value in the register, with the
> value from S2R or VILD not making it in there in time before the ALD
> reads it.
>
> If you would like to try this yourself, you can use
> https://github.com/imirkin/mesa/commits/gl4-integration-2 . This
> branch is good enough to run Unigine Heaven, but still has a lot of
> known shortcomings. (Both at the core and the nouveau levels.)
>
> Any advice or suggestions for debugging this would be greatly
> appreciated. And let me know if you'd like me to generate additional
> info on this. For example I can supply a full command trace that can
> be piped to demmt, if that's helpful.
>
> Thanks in advance,
>
>   -ilia

