[Mesa-dev] [PATCH 1/2] gallium: add TGSI_OPCODE_FMA

Roland Scheidegger sroland at vmware.com
Mon Mar 2 12:57:45 PST 2015


On 02.03.2015 at 21:27, Marek Olšák wrote:
> On Mon, Mar 2, 2015 at 9:16 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>> On 02.03.2015 at 20:50, Ilia Mirkin wrote:
>>> On Mon, Mar 2, 2015 at 2:44 PM, Marek Olšák <maraeo at gmail.com>
>>> wrote:
>>>> On Mon, Mar 2, 2015 at 5:09 PM, Ilia Mirkin <imirkin at alum.mit.edu>
>>>> wrote:
>>>>> Like Roland mentioned, you need to add DFMA, and the relevant
>>>>> cases to glsl_to_tgsi_visitor::get_opcode so that it can be
>>>>> selected.
>>>>
>>>> I plan to add DFMA too, but it's really low priority for me right
>>>> now.
>>>
>>> OK, well, without DFMA, I'm fairly sure that this will break fp64 if
>>> you add support for the opcode in softpipe (or I add it in nvc0).
>> I guess if double fmas show up when translating to tgsi, you could
>> still always translate them to dmad.
>>
>>
>>>> I can wait with this patch until it becomes high priority if
>>>> needed. Or feel free to take over.
>>>>
>>>> I'll just change the definition of FMA to:
>>>>
>>>> "Perform a * b + c with no intermediate rounding step." (same as
>>>> LLVM's FMA)
>>>
>>> FWIW ARB_shader_precision allows fma() to do whatever it wants.
>> That doesn't look particularly useful to me, though is indeed what the
>> spec is saying - only with precise keyword is there even a difference to
>> a*b+c and even then correct rounding is still not required (and even in
>> this case actually I guess I read that wrong, "considered a single
>> operation" probably does not imply there's no intermediate rounding).
>> So maybe we shouldn't say that no intermediate rounding is happening
>> then after all, we don't really have any means to specify precision
>> requirements for operations in tgsi in general. It is worth noting that
>> d3d10/11 does not feature any fma opcode for single floats, only doubles
>> (but only with extended double support) where indeed "correct" result is
>> required, so quite possibly not all gpus can really do the correct
>> version (or maybe can do it only with a performance hit).
>>
>> So the original cryptic wording may be ok too - probably need to revisit
>> the stuff wrt specifying precision at some point in general.
> 
> Or we can keep the strict and simple description for TGSI and allow
> OpenGL to use FMA if it's supported, and MAD otherwise.
> 
> All DX11 Radeons support FMA.

If I remember correctly, among the pre-SI parts only those supporting
doubles do. I guess this means, though, that they really do a fully
IEEE 754-2008 compliant fma.

In any case, if all gpus supporting fma really do the opencl compliant
version, then sticking to the simple but strict definition looks good to
me. Maybe we'll need to revisit some things when deciding how things
like precise qualifiers need to be handled (as I'd guess this should
translate to some optimization options in the backends).

Roland


