[Mesa-dev] [PATCH] i965/gs: Avoid DW * DW mul

Ben Widawsky ben at bwidawsk.net
Thu Dec 4 20:29:33 PST 2014


On Thu, Dec 04, 2014 at 07:48:06PM -0800, Ben Widawsky wrote:
> On Thu, Dec 04, 2014 at 05:08:21PM -0800, Matt Turner wrote:
> > On Thu, Dec 4, 2014 at 3:37 PM, Ben Widawsky
> > <benjamin.widawsky at intel.com> wrote:
> > > The GS has an interesting use for mul. It's essentially used as a fancy mov (in
> > > fact, I am not sure why a mov isn't used). The documentation in the function has
> > > a very good explanation from Paul on the mechanics.
> > 
> > What do you mean? The comment says
> > 
> >     * Therefore, we want to multiply DWORDs 0 and 4 of src0 (the x components
> >     * of the register for geometry shader invocations 0 and 1) by the
> >     * immediate value in src1, and store the result in DWORDs 3 and 4 of dst.
> > 
> > I don't know why you'd think a MOV would do?
> > 
> > I do see one of its uses is a MUL by 1u (the one in this patch), so
> > that could be a move, but not in general. And a MOV and a MUL are
> > equal in terms of execution speed, so there's no advantage to having a
> > second opcode.
> > 
> 
> I'm pretty sure it's always 1u. In theory you would want a MUL for the operation
> he is trying to perform, it's just unnecessary with the current code, and I
> can't *really* see a reason to ever make it not 1u.
> 

The pedant got the better of me, even though I mentioned it on IRC. I /thought/
the value always ended up being 1u, but it does not. My bad.

> > > CHV has some quirks with regard to multiplication. While the documentation is
> > > somewhat unclear, I've found that demoting the src1 operand in the GS mul solves
> > > all the problems.
> > 
> > > I'd ask that any potential reviewer ignore the other instances
> > > of mul for now (I have more patches), and simply make sure that what this patch
> > > does is correct.
> > 
> > I'd cut this out of the commit message before you commit.
> > 
> > > This fixes around 2000 piglit tests on BSW.
> > >
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84777 (with many dupes)
> > > Cc: "10.4" <mesa-stable at lists.freedesktop.org>
> > 
> > Are we actually expecting usable BSW support to be in 10.4.x? (The
> > commit message uses both CHV and BSW. Can we pick one?)
> > 
> 
> Why wouldn't we think it's usable? AFAIK, there is very little that wasn't
> already in 10.4. With this patch, I believe in terms of stability we're on par
> with BDW. I suppose we might want to add the depctrl patch (I have another patch
> in my queue that limits it to CHV only, since I actually found the errata now).
> 
> As for the name, mixing the two wasn't intentionali, but, things generally
> aren't great because the code calls it CHV, but the FDO bugs call in BSW. For
> that reason I prefer BSW, but I believe it defies Ken's convention.
> 
> > > Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
> > >
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 5 ++++-
> > >  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 3 ++-
> > >  2 files changed, 6 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> > > index b353539..4f60797 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> > > @@ -534,8 +534,11 @@ vec4_generator::generate_gs_set_write_offset(struct brw_reg dst,
> > >     brw_push_insn_state(p);
> > >     brw_set_default_access_mode(p, BRW_ALIGN_1);
> > >     brw_set_default_mask_control(p, BRW_MASK_DISABLE);
> > > +   assert(src1.file == BRW_IMMEDIATE_VALUE &&
> > > +          src1.type == BRW_REGISTER_TYPE_UD &&
> > > +          src1.dw1.ud <= SHRT_MAX);
> > >     brw_MUL(p, suboffset(stride(dst, 2, 2, 1), 3), stride(src0, 8, 2, 4),
> > > -           src1);
> > > +           retype(src1, BRW_REGISTER_TYPE_UW));
> > 
> > I feel a little weird about this, but the alternative is setting the
> > type in the visitor and that doesn't seem nice either. I guess this is
> > probably the best thing.
> > 
> > One thing, you're comparing with SHRT_MAX but then setting the type to
> > unsigned-word. Shouldn't you compare with USHRT_MAX?
> 
> Yes, thanks.
> 
> > 
> > With the commit message updated and trimmed and s/SHRT_MAX/USHRT_MAX/:
> > 
> > Reviewed-by: Matt Turner <mattst88 at gmail.com>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


More information about the mesa-dev mailing list