[Mesa-stable] [Mesa-dev] [PATCH] i965/gs: Avoid DW * DW mul
Matt Turner
mattst88 at gmail.com
Thu Dec 4 17:08:21 PST 2014
On Thu, Dec 4, 2014 at 3:37 PM, Ben Widawsky
<benjamin.widawsky at intel.com> wrote:
> The GS has an interesting use for mul. It's essentially used as a fancy mov (in
> fact, I am not sure why a mov isn't used). The documentation in the function has
> a very good explanation from Paul on the mechanics.
What do you mean? The comment says
* Therefore, we want to multiply DWORDs 0 and 4 of src0 (the x components
* of the register for geometry shader invocations 0 and 1) by the
* immediate value in src1, and store the result in DWORDs 3 and 4 of dst.
I don't know why you'd think a MOV would do?
I do see one of its uses is a MUL by 1u (the one in this patch), so
that could be a move, but not in general. And a MOV and a MUL are
equal in terms of execution speed, so there's no advantage to having a
second opcode.
> CHV has some quirks with regard to multiplication. While the documentation is
> somewhat unclear, I've found that demoting the src1 operand in the GS mul solves
> all the problems.
> I'd ask that any potential reviewer ignore the other instances
> of mul for now (I have more patches), and simply make sure that what this patch
> does is correct.
I'd cut this out of the commit message before you commit.
> This fixes around 2000 piglit tests on BSW.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84777 (with many dupes)
> Cc: "10.4" <mesa-stable at lists.freedesktop.org>
Are we actually expecting usable BSW support to be in 10.4.x? (The
commit message uses both CHV and BSW. Can we pick one?)
> Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
>
> ---
> src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 5 ++++-
> src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 3 ++-
> 2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> index b353539..4f60797 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> @@ -534,8 +534,11 @@ vec4_generator::generate_gs_set_write_offset(struct brw_reg dst,
> brw_push_insn_state(p);
> brw_set_default_access_mode(p, BRW_ALIGN_1);
> brw_set_default_mask_control(p, BRW_MASK_DISABLE);
> + assert(src1.file == BRW_IMMEDIATE_VALUE &&
> + src1.type == BRW_REGISTER_TYPE_UD &&
> + src1.dw1.ud <= SHRT_MAX);
> brw_MUL(p, suboffset(stride(dst, 2, 2, 1), 3), stride(src0, 8, 2, 4),
> - src1);
> + retype(src1, BRW_REGISTER_TYPE_UW));
I feel a little weird about this, but the alternative is setting the
type in the visitor and that doesn't seem nice either. I guess this is
probably the best thing.
One thing, you're comparing with SHRT_MAX but then setting the type to
unsigned-word. Shouldn't you compare with USHRT_MAX?
With the commit message updated and trimmed and s/SHRT_MAX/USHRT_MAX/:
Reviewed-by: Matt Turner <mattst88 at gmail.com>
More information about the mesa-stable
mailing list