[Nouveau] Debugging INVALID_OPCODE / MULTIPLE_WARP_ERRORS ?

Ilia Mirkin imirkin at alum.mit.edu
Fri Dec 18 12:46:24 PST 2015


On Fri, Dec 18, 2015 at 7:55 AM, Hans de Goede <hdegoede at redhat.com> wrote:
> Hi,
>
> On 16-12-15 18:24, Ilia Mirkin wrote:
>>
>> I believe that your problem is this:
>>
>>          /*01a0*/                   LD R8, [R8];
>>             /* 0x8000000000821c85 */
>>
>> That needs to be LD.E (and your ST's need to be ST.E). You're using a
>> 32-bit gmem address, but you need to be using a 64-bit one. I believe
>> the 32-bit ones work on fermi, but afaik not on Kepler.
>
>
> I do not think that is the problem, src/gallium/tests/trivial/compute
> test_input_global() has:
>
> COMP
> DCL SV[0], THREAD_ID
> DCL TEMP[0], LOCAL
> DCL TEMP[1], LOCAL
> IMM[0] UINT32 {8, 0, 0, 0}
>   0: BGNSUB :0
>   1:   UMUL TEMP[0], SV[0], IMM[0]
>   2:   LOAD TEMP[1].xy, RES[32764], TEMP[0]
>   3:   LOAD TEMP[0].x, RES[32767], TEMP[1].yyyy
>   4:   UADD TEMP[1].x, TEMP[0], -TEMP[1]
>   5:   STORE RES[32767].x, TEMP[1].yyyy, TEMP[1]
>   6:   RET
>   7: ENDSUB
>
> Which translates to:
>
> SUB:0 ()
> BB:0 (7 instructions) - df = { }
>  -> BB:1 (cross)
>   0: rdsv u32 $r0 sv[TID:0] (8)
>   1: shl u32 $r2 $r0 0x00000003 (8)
>   2: ld u64 $r0d c0[$r2+0x0] (8)
>   3: ld u32 $r2 g[$r1+0x0] (8)
>   4: add u32 $r0 $r2 neg $r0 (8)
>   5: st u32 # g[$r1+0x0] $r0 (8)
>   6: ret (8)
> BB:1 (0 instructions) - idom = BB:0, df = { }
>
> MAIN:-1 ()
> BB:0 (0 instructions) - df = { }
>
> Which is also using 32-bit loads from global memory
> and that works fine on my GK107 [GeForce GT 740].
>
> I think that for now I'll just focus on translating
> the tests from src/gallium/tests/trivial/compute.c to
> opencl and getting the entire opencl -> llvm -> tgsi ->
> nouveau_compiler -> hardware chain to work that way.
>
> Still would be good to get nbody.c to work though.

Hmmmm, odd. Not sure how 32-bit addresses work there. (Or on Fermi,
tbh.) Probably assumes that the upper 8 bits of the 40-bit VA are 0?
Anyways, another thing I remember is that I couldn't get barriers to
work at all with tessellation shaders (with, iirc, invalid opcode
errors). My solution to the problem was to just discard them, since
that's what the blob seemed to do, and I assumed they knew what they
were doing.
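
To spell out the address guess above: if the hardware really does treat
a 32-bit gmem address as a 40-bit VA whose upper 8 bits are 0, then it
is just a zero-extension, and only buffers mapped below 4 GiB would be
reachable that way. A minimal C sketch of that assumption follows. The
helper names are made up for illustration; nothing here is taken from
nouveau or from hardware documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption from the discussion above: GPU virtual addresses are 40 bits. */
#define VA_BITS 40
#define VA_MASK ((UINT64_C(1) << VA_BITS) - 1)

/*
 * Hypothetical helper: the VA a 32-bit address would map to if the
 * upper 8 bits of the 40-bit VA are implicitly 0 (plain zero-extension).
 */
static uint64_t va_from_addr32(uint32_t addr32)
{
    return (uint64_t)addr32;
}

/*
 * Hypothetical helper: returns 1 if a 40-bit VA could be reached through
 * a 32-bit address under that assumption, i.e. bits 32..39 are all 0.
 */
static int va_fits_addr32(uint64_t va)
{
    return ((va & VA_MASK) >> 32) == 0;
}
```

Under this guess, va_fits_addr32() is 1 for any buffer mapped below
4 GiB and 0 for anything above it, which would explain why a 32-bit
load can work in one test and fault in another depending on where the
buffer ends up.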

Perhaps I was just emitting it wrong. I'd take a careful look at how
the blob emits that BAR.SYNC primitive.

Cheers,

  -ilia

