[Mesa-dev] [PATCH 2/7] i965: Simplify gl_SampleID setup on Gen8+.
Kenneth Graunke
kenneth at whitecape.org
Tue Apr 19 23:51:09 UTC 2016
On Tuesday, April 19, 2016 11:19:54 AM PDT Matt Turner wrote:
> On Mon, Apr 18, 2016 at 11:48 PM, Kenneth Graunke <kenneth at whitecape.org>
wrote:
> > On Gen7+, the thread payload provides the sample ID - we can read it
> > in two instructions, without any elaborate calculations. We don't even
> > need a state dependency - this will properly produce zero in the
> > non-MSAA case. Unfortunately, we need the state flag anyway, so we
> > may as well continue to use it to produce a single MOV 0 instead of
> > SHR/AND.
> >
> > For some reason, the sample ID field is always zero on Gen7/7.5, so
> > we can't use this yet. However, it works fine on Gen8+. So, land the
> > code and use it where it's working, and leave a TODO for later.
> >
> > Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
> > ---
> > src/mesa/drivers/dri/i965/brw_fs.cpp | 42 ++++++++++++++++++++++++++++++
+-----
> > 1 file changed, 37 insertions(+), 5 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/
i965/brw_fs.cpp
> > index 551294d..41af35e 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -1343,6 +1343,41 @@ fs_visitor::emit_sampleid_setup()
> > * rasterization is disabled, gl_SampleID will always be zero."
> > */
> > abld.MOV(*reg, brw_imm_d(0));
> > + } else if (devinfo->gen >= 8) {
> > + /* Sample ID comes in as 4-bit numbers in g1.0:
> > + *
> > + * 15:12 Slot 3 SampleID (only used in SIMD16)
> > + * 11:8 Slot 2 SampleID (only used in SIMD16)
> > + * 7:4 Slot 1 SampleID
> > + * 3:0 Slot 0 SampleID
> > + *
> > + * Each slot corresponds to four channels, so we want to replicate
each
> > + * half-byte value to 4 channels in a row:
> > + *
> > + * dst+0: .7 .6 .5 .4 .3 .2 .1 .0
> > + * 7:4 7:4 7:4 7:4 3:0 3:0 3:0 3:0
> > + *
> > + * dst+1: .7 .6 .5 .4 .3 .2 .1 .0 (if
SIMD16)
> > + * 15:12 15:12 15:12 15:12 11:8 11:8 11:8 11:8
> > + *
> > + * First, we read g1.0 with a <1,8,0>UB region, causing the first 8
> > + * channels to read the first byte (7:0), and the second group of 8
> > + * channels to read the second byte (15:8). Then, we shift right
by
> > + * a vector immediate of <4, 4, 4, 4, 0, 0, 0, 0>, moving the slot
1 / 3
> > + * values into place. Finally, we AND with 0xf to keep the low
nibble.
> > + *
> > + * shr(16) tmp<1>UW g1.0<1,8,0>UB 0x44440000:V
> > + * and(16) dst<1>UD tmp<8,8,1>UD 0xf:UD
>
> The comment doesn't match the code (and as the comment is, it wouldn't
work).
>
> - tmp is typed W below, but is UW and UD in the comment
> - g1.0 is B below, but UB in the comment
> - 0xf is W below, but UD in the comment
You're right - I changed the code, and neglected to update the comment.
I can update the comment to match the code.
> > + *
> > + * TODO: These payload bits exist on Gen7 too, but they appear to
always
> > + * be zero, so this code fails to work. We should find out
why.
> > + */
> > + fs_reg tmp(VGRF, alloc.allocate(1), BRW_REGISTER_TYPE_W);
> > +
> > + abld.SHR(tmp, fs_reg(stride(retype(brw_vec1_grf(1, 0),
> > + BRW_REGISTER_TYPE_B), 1, 8, 0)),
> > + brw_imm_v(0x44440000));
> > + abld.AND(*reg, tmp, brw_imm_w(0xf));
>
> ANDing with 0xf is only necessary if you're operating on signed types,
> which you don't need to be doing. Change the types to unsigned, and
> replace the AND with a MOV that expands the UW source to UD.
I don't understand. I don't think signedness should matter. I'm using
SHR, not ASR - so there's no sign extension. After that, "tmp" should
contain a bunch of 4-bit values. Reading it as W, the top bit should
not ever be set.
The byte being read contains:
- 7:4 Slot 1 SampleID
- 3:0 Slot 0 SampleID
Shifting right by 0x44440000 means that the first four channels contain
- 7:4 Zero
- 3:0 Slot 1 SampleID
while the second group of four channels remain untouched:
- 7:4 Slot 1 SampleID
- 3:0 Slot 0 SampleID
So for that group, I need to AND with 0xf (4 bits) to mask out the
"Slot 1 SampleID" in bits 7:4, so I just get "Slot 0" as desired.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160419/1b0b6d38/attachment.sig>
More information about the mesa-dev
mailing list