[Mesa-stable] [Mesa-dev] [PATCH] i965/gs: Avoid DW * DW mul

Matt Turner mattst88 at gmail.com
Thu Dec 4 17:08:21 PST 2014


On Thu, Dec 4, 2014 at 3:37 PM, Ben Widawsky
<benjamin.widawsky at intel.com> wrote:
> The GS has an interesting use for mul. It's essentially used as a fancy mov (in
> fact, I am not sure why a mov isn't used). The documentation in the function has
> a very good explanation from Paul on the mechanics.

What do you mean? The comment says

    * Therefore, we want to multiply DWORDs 0 and 4 of src0 (the x components
    * of the register for geometry shader invocations 0 and 1) by the
    * immediate value in src1, and store the result in DWORDs 3 and 4 of dst.

I don't know why you'd think a MOV would do?

I do see one of its uses is a MUL by 1u (the one in this patch), so
that could be a move, but not in general. And a MOV and a MUL are
equal in terms of execution speed, so there's no advantage to having a
second opcode.

> CHV has some quirks with regard to multiplication. While the documentation is
> somewhat unclear, I've found that demoting the src1 operand in the GS mul solves
> all the problems.

> I'd ask that any potential reviewer ignore the other instances
> of mul for now (I have more patches), and simply make sure that what this patch
> does is correct.

I'd cut this out of the commit message before you commit.

> This fixes around 2000 piglit tests on BSW.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84777 (with many dupes)
> Cc: "10.4" <mesa-stable at lists.freedesktop.org>

Are we actually expecting usable BSW support to be in 10.4.x? (The
commit message uses both CHV and BSW. Can we pick one?)

> Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
>
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp  | 5 ++++-
>  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 3 ++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> index b353539..4f60797 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> @@ -534,8 +534,11 @@ vec4_generator::generate_gs_set_write_offset(struct brw_reg dst,
>     brw_push_insn_state(p);
>     brw_set_default_access_mode(p, BRW_ALIGN_1);
>     brw_set_default_mask_control(p, BRW_MASK_DISABLE);
> +   assert(src1.file == BRW_IMMEDIATE_VALUE &&
> +          src1.type == BRW_REGISTER_TYPE_UD &&
> +          src1.dw1.ud <= SHRT_MAX);
>     brw_MUL(p, suboffset(stride(dst, 2, 2, 1), 3), stride(src0, 8, 2, 4),
> -           src1);
> +           retype(src1, BRW_REGISTER_TYPE_UW));

I feel a little weird about this, but the alternative is setting the
type in the visitor and that doesn't seem nice either. I guess this is
probably the best thing.

One thing, you're comparing with SHRT_MAX but then setting the type to
unsigned-word. Shouldn't you compare with USHRT_MAX?

With the commit message updated and trimmed and s/SHRT_MAX/USHRT_MAX/:

Reviewed-by: Matt Turner <mattst88 at gmail.com>


More information about the mesa-stable mailing list