[Nouveau] Proper gl_SampleMask output

Wed Apr 30 09:02:55 PDT 2014

Hi Ilia.  I'll take a look and see what I can find out.

Thanks,
- Andy

On Wed, Apr 23, 2014 at 05:03:17PM -0700, Ilia Mirkin wrote:
> On Wed, Apr 23, 2014 at 6:22 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> > Hello,
> >
> > I've been trying to add ARB_sample_shading support to nouveau, and am
> > being defeated by the gl_SampleMask tests. Everything else works fine.
> > (And naturally the tests pass with the proprietary driver.) I'm trying
> > to do this for both GT21x, as well as GF100+.
> >
> > In the GT21x case, it seems like the low bit of method 0x1928 needs to
> > be set (as well as the second-to-lowest bit); for GF100+, the low bit
> > of the last dword of the shader header needs to be set.
> >
> > But exactly which register is the output supposed to go into? It looks
> > like with the proprietary driver, r0..r3 get the first color output,
> > and r4 gets the sample mask. However, the way that things are set up
> > with nouveau, r4..r7 get the first color output (and that part works
> > fine). But where should the sample mask go at the end of the fragment
> > program? r0? r8? (I've tried all of those with minimal effect.)
> > Perhaps there's more configuration that I'm missing regarding the
> > sample mask? Also, how does this interact with the frag depth (which
> > also gets implicitly assigned based on color outputs)?
> 
> As a clarification to the r0..r3 vs r4..r7 for first color output,
> I've changed things around to ensure that the first color output ends
> up in r0..r3 in the nouveau shader too. The shader generated by
> nouveau is:
> 
> HDR[00] = 0x00021462
> HDR[04] = 0x00000000
> HDR[08] = 0x00000000
> HDR[0c] = 0x00000000
> HDR[10] = 0x00000000
> HDR[14] = 0xf0000000
> HDR[18] = 0x00000000
> HDR[1c] = 0x00000000
> HDR[20] = 0x00000000
> HDR[24] = 0x00000000
> HDR[28] = 0x00000000
> HDR[2c] = 0x00000000
> HDR[30] = 0x00000000
> HDR[34] = 0x00000000
> HDR[38] = 0x00000000
> HDR[3c] = 0x00000000
> HDR[40] = 0x00000000
> HDR[44] = 0x00000000
> HDR[48] = 0x0000000f
> HDR[4c] = 0x00000001
> shader binary code (0x80 bytes):
> 42e04237 22804280 fff01c00 c07e0070 fff05c00 c07e0074 10009de4 28004000
> 00105c00 30044000 01201c84 14060000 04001c02 10408102 05205c84 14060000
> 720042e7 22e20042 04105c02 10040404 04011c83 68000000 0000dde2 18fe0000
> 00001de2 18000000 0c005de4 28000000 00009de2 18000000 00001de7 80000000
> 
> which, with "nvdisasm -b SM30 -raw", decodes to
> 
>         /*0008*/                IPA.PASS R0, a[0x70], RZ;
>         /*0010*/                IPA.PASS R1, a[0x74], RZ;
>         /*0018*/                MOV R2, c[0x0][0x4];
>         /*0020*/                FFMA R1, R1, c[0x0][0x0], R2;
>         /*0028*/                F2I.S32.F32.TRUNC R0, R0;
>         /*0030*/                IMUL32I.U32.U32 R0, R0, 0x10204081;
>         /*0038*/                F2I.S32.F32.TRUNC R1, R1;
>         /*0048*/                IMUL32I.U32.U32 R1, R1, 0x1010101;
>         /*0050*/                LOP.XOR R4, R0, R1;
>         /*0058*/                MOV32I R3, 0x3f800000;
>         /*0060*/                MOV32I R0, 0x0;
>         /*0068*/                MOV R1, R3;
>         /*0070*/                MOV32I R2, 0x0;
>         /*0078*/                EXIT ;
> 
> While the proprietary-driver-generated shader is: [the dump shows
> quad-word writes, so the right-most dword of each row is the first of
> the four; read each row right-to-left]
> 
> --816-- w 27:0x0430, 0x00000000,0x00000000,0x00000000,0x00001462
> --816-- w 27:0x0440, 0x00000000,0x00000000,0xb0000000,0x00000000
> --816-- w 27:0x0450, 0x00000000,0x00000000,0x00000000,0x00000000
> --816-- w 27:0x0460, 0x00000000,0x00000000,0x00000000,0x00000000
> --816-- w 27:0x0470, 0x00000001,0x0000000f,0x00000000,0x00000000
> --816-- w 27:0x0480, 0xc07e0074,0xfff05c00,0x22324232,0xa0423047
> --816-- w 27:0x0490, 0xc07e0070,0xfff01c00,0x2800403c,0x10009de4
> --816-- w 27:0x04a0, 0x3004803c,0x30105c40,0x2800403c,0x0000dde4
> --816-- w 27:0x04b0, 0x14860000,0x05201c84,0x3006803c,0x20009c40
> --816-- w 27:0x04c0, 0x14860000,0x09205c84,0x22004280,0x42304247
> --816-- w 27:0x04d0, 0x28000000,0xfc001de4,0x10040404,0x04009ca2
> --816-- w 27:0x04e0, 0x18fe0000,0x00005de2,0x10408102,0x0410dca2
> --816-- w 27:0x04f0, 0x28000000,0xfc009de4,0x68000000,0x08311c83
> --816-- w 27:0x0500, 0x28000000,0x0400dde4,0x20000000,0x0002e047
> --816-- w 27:0x0510, 0x4003ffff,0xe0001de7,0x80000000,0x00001de7
> --816-- w 27:0x0520, 0x40000000,0x00001de4,0x40000000,0x00001de4
> --816-- w 27:0x0530, 0x40000000,0x00001de4,0x40000000,0x00001de4
> 
> Which decodes to:
> 
>         /*0008*/                IPA.PASS R1, a[0x74], RZ;
>         /*0010*/                MOV R2, c[0x0][0xf04];
>         /*0018*/                IPA.PASS R0, a[0x70], RZ;
>         /*0020*/                MOV R3, c[0x0][0xf00];
>         /*0028*/                FFMA.FTZ R1, R1, R2, c[0x0][0xf0c];
>         /*0030*/                FFMA.FTZ R2, R0, R3, c[0x0][0xf08];
>         /*0038*/                F2I.FTZ.S32.F32.TRUNC R0, R1;
>         /*0048*/                F2I.FTZ.S32.F32.TRUNC R1, R2;
>         /*0050*/                IMUL32I R2, R0, 0x1010101;
>         /*0058*/                MOV R0, RZ;
>         /*0060*/                IMUL32I R3, R1, 0x10204081;
>         /*0068*/                MOV32I R1, 0x3f800000;
>         /*0070*/                LOP.XOR R4, R3, R2;
>         /*0078*/                MOV R2, RZ;
>         /*0088*/                MOV R3, R1;
>         /*0090*/                EXIT ;
> 
> (Not sure why the nouveau shader only has 1 FMA, but that's the input
> shader we get from Gallium. I highly doubt this is the source of the
> error, since it has nothing to do with sample masks, but my question
> about sample mask output still stands even if it is :) )
> 
> Oh, and for completeness, the input GLSL shader is:
> 
> "#version 130\n"
> "#extension GL_ARB_sample_shading : enable\n"
> "out vec4 out_color;\n"
> "void main()\n"
> "{\n"
>   /* For 128x128 image size, below formula produces a bit
>    * pattern where no two bits of gl_SampleMask[0] are
>    * correlated.
>    */
> "  gl_SampleMask[0] = (int(gl_FragCoord.x) * 0x10204081) ^\n"
> "                     (int(gl_FragCoord.y) * 0x01010101);\n"
> "  out_color = vec4(0.0, 1.0, 0.0, 1.0);\n"
> "}\n";
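For reference when comparing against a rendered image, the mask formula in the shader above can be evaluated on the CPU. This is a rough Python sketch, assuming 32-bit wraparound for the integer multiplies and integer pixel coordinates (the truncated gl_FragCoord values); `sample_mask`, `covered_samples` and the `num_samples` parameter are names invented for this illustration.

```python
MASK32 = 0xFFFFFFFF


def sample_mask(x, y):
    """Value written to gl_SampleMask[0] at pixel (x, y), as unsigned 32-bit."""
    # Python ints do not overflow, so mask to emulate 32-bit wraparound
    # (an assumption about how the shader's int multiplies behave).
    return ((x * 0x10204081) ^ (y * 0x01010101)) & MASK32


def covered_samples(mask, num_samples):
    """Indices of the samples whose bit is set; bit n corresponds to sample n."""
    return [n for n in range(num_samples) if (mask >> n) & 1]


# Example: a few pixels of the first row, for a hypothetical 4-sample target.
for x in range(4):
    m = sample_mask(x, 0)
    print(x, hex(m), covered_samples(m, 4))
```

Only the low `num_samples` bits of the mask select samples, so the expected coverage pattern depends on the sample count of the render target as well as on the pixel position.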
> 
> >
> > Any insight into this would be hugely helpful. In case you feel like
> > taking a look at the actual code, these are my commits:
> > https://github.com/imirkin/mesa/commits/sample_shading . Note that
> > some bits of the sample mask were already there for nvc0 (like setting
> > the shader header bit), thus don't appear in my change.
> >
> > Thanks,
> >
> >   -ilia