[Mesa-dev] [PATCH 1/2] gallium: add TGSI_OPCODE_FMA

Mon Mar 2 12:27:21 PST 2015

On Mon, Mar 2, 2015 at 9:16 PM, Roland Scheidegger <sroland at vmware.com> wrote:
> On 02.03.2015 at 20:50, Ilia Mirkin wrote:
>> On Mon, Mar 2, 2015 at 2:44 PM, Marek Olšák <maraeo at gmail.com>
>> wrote:
>>> On Mon, Mar 2, 2015 at 5:09 PM, Ilia Mirkin <imirkin at alum.mit.edu>
>>> wrote:
>>>> Like Roland mentioned, you need to add DFMA, and the relevant
>>>> cases to glsl_to_tgsi_visitor::get_opcode so that it can be
>>>> selected.
>>>
>>> I plan to add DFMA too, but it's really low priority for me right
>>> now.
>>
>> OK, well, without DFMA, I'm fairly sure that this will break fp64 if
>> you add support for the opcode in softpipe (or I add it in nvc0).
> I guess if double fmas do show up when translating to tgsi, you could
> still always translate them to dmad.
>
>
>>> I can wait with this patch until it becomes high priority if
>>> needed. Or feel free to take over.
>>>
>>> I'll just change the definition of FMA to:
>>>
>>> "Perform a * b + c with no intermediate rounding step." (same as
>>> LLVM's FMA)
>>
>> FWIW ARB_shader_precision allows fma() to do whatever it wants.
> That doesn't look particularly useful to me, though it is indeed what
> the spec says: only with the precise keyword is there even a difference
> from a*b+c, and even then correct rounding is still not required (and
> actually I guess I read even that case wrong; "considered a single
> operation" probably does not imply there's no intermediate rounding).
> So maybe we shouldn't say that no intermediate rounding happens after
> all; we don't really have any means to specify precision requirements
> for operations in tgsi in general. It is worth noting that d3d10/11
> does not feature any fma opcode for single floats, only for doubles
> (and only with extended double support), where the "correct" result is
> indeed required, so quite possibly not all gpus can really do the
> correct version (or maybe can do it only with a performance hit).
>
> So the original cryptic wording may be ok too - we probably need to
> revisit the stuff wrt specifying precision at some point in general.

Or we can keep the strict and simple description for TGSI and allow
OpenGL to use FMA if it's supported, and MAD otherwise.

All DX11 Radeons support FMA.
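
To make the "no intermediate rounding" distinction concrete, here is a
minimal sketch in Python (not Mesa or TGSI code). The helper names mad()
and fma_exact() are made up for illustration, and the fused result is
emulated with exact rationals rather than a hardware instruction. It works
on doubles and finite inputs only.

```python
from fractions import Fraction


def mad(a, b, c):
    """Unfused multiply-add: the product a*b is rounded to a double
    before c is added, so two roundings happen in total."""
    return a * b + c


def fma_exact(a, b, c):
    """Emulated fused multiply-add for finite doubles (hypothetical helper).

    Fraction(float) converts exactly, so a*b+c is computed without any
    rounding; float() then rounds once, to nearest-even.
    """
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Inputs chosen so that the exact product is 1 - 2**-54, which lies
# halfway between two doubles and rounds up to 1.0 when rounded alone.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

if __name__ == "__main__":
    print(mad(a, b, c))        # 0.0: the 2**-54 term was lost in the product
    print(fma_exact(a, b, c))  # -2**-54: kept by the single final rounding
```

With these inputs the unfused form returns exactly 0.0 while the fused form
returns -2**-54, so a MAD fallback is observably different from a strict
FMA, which is why the wording of the opcode description matters.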

Marek