[Mesa-dev] [PATCH 5/8] i965/gs: Fix up gl_PointSize input swizzling for DUAL_INSTANCED gs.

Paul Berry stereotype441 at gmail.com
Tue Oct 22 16:21:18 CEST 2013


On 21 October 2013 18:34, Eric Anholt <eric at anholt.net> wrote:

> Paul Berry <stereotype441 at gmail.com> writes:
>
> > On 18 October 2013 17:04, Eric Anholt <eric at anholt.net> wrote:
> >> Putting these fixups for a couple of weird cases in just MOV and ADD
> >> feels wrong to me, but maybe when I understand better what's going on
> >> it'll seem more natural.
> >>
> >
> > Another possibility I'd be equally happy with would be to put the fixup
> at
> > the top of vec4_generator::generate_vec4_instruction(), before the switch
> > statement.  It would look something like this:
> >
> >    if (dst.width == BRW_WIDTH_4) {
> >       /* This happens in attribute fixups for "dual instanced" geometry
> >        * shaders, since they use attributes that are vec4's.  Since the
> exec
> >        * width is only 4, it's essential that the caller set
> >        * force_writemask_all in order to make sure the instruction is
> > executed
> >        * regardless of which channels are enabled.
> >        */
> >       assert(inst->force_writemask_all);
> >
> >       /* Fix up any <8;8,1> or <0;4,1> source registers to <4;4,1> to
> > satisfy
> >        * the following register region restrictions (from Graphics BSpec:
> >        * 3D-Media-GPGPU Engine > EU Overview > Registers and Register
> > Regions
> >        * > Register Region Restrictions)
> >        *
> >        *     1. ExecSize must be greater than or equal to Width.
> >        *
> >        *     2. If ExecSize = Width and HorzStride != 0, VertStride must
> be
> > set
> >        *        to Width * HorzStride."
> >        */
> >       for (int i = 0; i < 3; i++) {
> >          if (src[i].file == BRW_GENERAL_REGISTER_FILE)
> >             src[i] = stride(src[i], 4, 4, 1);
> >       }
> >    }
> >
> > Does that seem better to you?  I actually think I like it slightly better
> > because by making the assertion more general, I caught another case
> where I
> > think I should be setting force_writemask_all to be on the safe side (the
> > "clear r0.2" instruction in the gs prolog).
>
> I like this better -- it makes more sense to me for the fixup to be
> non-opcode-specific.  Any I think I get the problem now -- our registers
> would make a ton of sense as <4;4,1> in general (that's how I think of
> our GRFs in align16, at least!), except that we can't because then we'd
> guess an execsize of 4.


Agreed.  And it's not just how you think of our GRF's in align16.  <4;4,1>
is what we actually emit to the hardware, once the fixups at the bottom of
brw_set_src{0,1}() take effect.


>  I'm in favor of the kill-guess-execsize plan,
> even if we leave it in place for gen4/5 SF/CLIP threads (which bounce
> execsize all over the place iirc and would be a pain to convert) and
> only convert the new backends.
>

> Another option: How about instead of that assert in brw_eu_emit.c, we
> just smash the vstride to be width * hstride?  We know the vstride
> doesn't matter, because you're only using execsize components, so let's
> just not bother our brw_eu.c callers with this little problem.
>

I'm not really a fan of this--it feels like it's perpetuating the "<8;8,1>
instead of <4;4,1>" lie by introducing even more code in brw_eu_emit.c to
fix things up when we use a bogus <8;8,1> region.  Also, if we do this,
there's no convenient place to put the assert(force_writemask_all), which I
think is really valuable.

I would rather stick with the fixup at the top of
generate_vec4_instruction() for now, and revisit the decision when we get
rid of guess_execution_size().
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20131022/2b41193a/attachment.html>


More information about the mesa-dev mailing list