[Mesa-dev] [PATCH] radv: emit fmuladd instead of fma to llvm.
Ilia Mirkin imirkin at alum.mit.edu
Wed Oct 4 17:35:26 UTC 2017
Ah OK. So llvm.fmuladd is more like llvm.fmadontcare. Wrong assumption
on my part.
On Wed, Oct 4, 2017 at 1:00 PM, Connor Abbott <cwabbott0 at gmail.com> wrote:
> No. From the LLVM langref:
> The ‘llvm.fmuladd.*’ intrinsic functions represent multiply-add
> expressions that can be fused if the code generator determines that
> (a) the target instruction set has support for a fused operation, and
> (b) that the fused operation is more efficient than the equivalent,
> separate pair of mul and add instructions.
> The (b) part is especially important -- it says that LLVM can pick and
> choose which fmuladd intrinsics to turn into FMA instructions, or
> unfused MULADD instructions, or just a sequence of mul+add. For
> example, if many instructions call fmuladd with the first two
> arguments the same, it can break it up into a mul followed by a bunch
> of adds. That wouldn't be ok under the GLSL precise semantics
> (assuming the target would've used FMA otherwise, which I think some
> GCN cards will do).
> Also, and maybe more importantly, if an app developer explicitly asks
> for fma() with a precise modifier, it's probably not a great idea to
> then give them an unfused mul+add -- it's legal, thanks to GLSL's
> weasel-wording, but probably not what you really want, on HW which
> actually does have an FMA instruction :)
> On Wed, Oct 4, 2017 at 11:25 AM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>> Wouldn't this guarantee that nothing is fused (and thus fine)?
>> Presumably fmuladd always does mul+add either as 1 or 2 instructions?
>> On Wed, Oct 4, 2017 at 10:57 AM, Connor Abbott <cwabbott0 at gmail.com> wrote:
>>> If the fma has the exact flag, then we need to use the llvm.fma
>>> intrinsic. These come from fma() calls with the precise or invariant
>>> qualifiers in GLSL, where you basically have to fuse everything or
>>> fuse nothing consistently, and llvm.fmuladd doesn't guarantee that.
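
The rule suggested above could be sketched roughly as follows. This is not the actual Mesa code: the struct and the function name are hypothetical stand-ins, and only the idea of keying the choice off the exact flag comes from the thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the one NIR ALU field that matters here. */
struct alu_instr_sketch {
    bool exact; /* set when the op came from precise/invariant code */
};

/* Pick the LLVM intrinsic name to emit for an ffma. */
const char *ffma_intrinsic_name(const struct alu_instr_sketch *instr)
{
    /* Exact ops must be fused consistently, so they keep llvm.fma;
     * everything else may let the backend decide via llvm.fmuladd. */
    return instr->exact ? "llvm.fma" : "llvm.fmuladd";
}
```

The returned string would then be passed wherever the patch below hard-codes the intrinsic name.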
>>> On Tue, Oct 3, 2017 at 10:10 PM, Dave Airlie <airlied at gmail.com> wrote:
>>>> From: Dave Airlie <airlied at redhat.com>
>>>> For Vulkan SPIR-V the spec states:
>>>>   fma(): Inherited from OpFMul followed by OpFAdd.
>>>> Matt says the backend will do the right thing depending on the
>>>> hardware being compiled for, if you use the fmuladd intrinsic.
>>>> Using the Mad Max PTS test, on high settings at 4K:
>>>> CHP: 55->60
>>>> HGDD: 46->50
>>>> LM: 55->60
>>>> No change on Stronghold.
>>>> Thanks to Feral for spending the time to track this down.
>>>> Signed-off-by: Dave Airlie <airlied at redhat.com>
>>>> src/amd/common/ac_nir_to_llvm.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
>>>> index d7b6259..11ba487 100644
>>>> --- a/src/amd/common/ac_nir_to_llvm.c
>>>> +++ b/src/amd/common/ac_nir_to_llvm.c
>>>> @@ -1707,7 +1707,7 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
>>>> case nir_op_ffma:
>>>> - result = emit_intrin_3f_param(&ctx->ac, "llvm.fma",
>>>> + result = emit_intrin_3f_param(&ctx->ac, "llvm.fmuladd",
>>>> ac_to_float_type(&ctx->ac, def_type), src[0], src[1], src[2]);
>>>> case nir_op_ibitfield_extract:
>>>> mesa-dev mailing list
>>>> mesa-dev at lists.freedesktop.org