[Mesa-dev] [PATCH] radv: emit fmuladd instead of fma to llvm.

Wed Oct 4 14:57:47 UTC 2017

If the ffma has the exact flag set, then we still need to use the
llvm.fma intrinsic. Those come from fma() calls with the precise or
invariant qualifiers in GLSL, where you have to consistently either
fuse everything or fuse nothing. llvm.fmuladd doesn't guarantee that,
since it leaves the decision to fuse up to the code generator.

On Tue, Oct 3, 2017 at 10:10 PM, Dave Airlie <airlied at gmail.com> wrote:
> From: Dave Airlie <airlied at redhat.com>
>
> For Vulkan SPIR-V the spec states
> fma() Inherited from OpFMul followed by OpFAdd.
>
> Matt says the backend will do the right thing depending on the
> hardware being compiled for, if you use the fmuladd intrinsic.
>
> Using the Mad Max pts test, on high settings at 4K:
> CHP: 55->60
> HGDD: 46->50
> LM: 55->60
> No change on Stronghold.
>
> Thanks to Feral for spending the time to track this down.
>
> Signed-off-by: Dave Airlie <airlied at redhat.com>
> ---
>  src/amd/common/ac_nir_to_llvm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index d7b6259..11ba487 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -1707,7 +1707,7 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
>                                                       result);
>                 break;
>         case nir_op_ffma:
> -               result = emit_intrin_3f_param(&ctx->ac, "llvm.fma",
> +               result = emit_intrin_3f_param(&ctx->ac, "llvm.fmuladd",
>                                               ac_to_float_type(&ctx->ac, def_type), src[0], src[1], src[2]);
>                 break;
>         case nir_op_ibitfield_extract:
> --
> 2.9.4