[Mesa-dev] [RFC 00/11] GL_ARB_gpu_shader_fp64

Fri Mar 3 19:41:47 UTC 2017

On Fri, Mar 3, 2017 at 2:16 PM, Jason Ekstrand <jason at jlekstrand.net> wrote:
> Hey Elie!
>
> On Fri, Mar 3, 2017 at 8:22 AM, Elie Tournier <tournier.elie at gmail.com>
> wrote:
>>
>> From: Elie Tournier <elie.tournier at collabora.com>
>>
>> This series is based on Ian's work about GL_ARB_gpu_shader_int64 [1].
>> The goal is to expose GL_ARB_gpu_shader_fp64 to OpenGL 3.0 GPUs.
>>
>> Each function can be independently tested using shader_runner from piglit.
>> The piglit files are stored on github [2].
>>
>> [1]
>> https://lists.freedesktop.org/archives/mesa-dev/2016-November/136718.html
>> [2] https://github.com/Hopetech/libSoftFloat
>
>
> Glad to see this finally turning into code.
>
> Before we get too far into things, I'd like to talk about the approach a
> bit.  First off, if we (Intel) are going to use this on any hardware, we
> would really like it to be in NIR.  The reason for this is that NIR has a
> much more powerful algebraic optimizer than GLSL IR and we would like to
> have as few fp64 instructions as possible before we start lowering them to
> piles of integer math.  I believe Ian's plan for this was that someone would
> write a nir_builder back-end for the stand-alone compiler.  Unfortunately,
> he sort-of left that as "an exercise to the reader" and no code exists to my
> knowledge.  If we're going to write things in GLSL, we really need that NIR
> back-end.

I'm not sure what the impetus was for developing a softfloat library
(though I'm a big fan), but the current situation is that it will largely
just be useful for AMD Evergreen/Northern Islands chips, which consume
TGSI produced from GLSL. (Aside: [1].) As such, I'm not sure a push
towards NIR is warranted -- it would make the path to the intended
target more convoluted.

I do agree with the larger point - the lowering should be done as late
as possible in order to enable algebraic-style optimizations. (This is
also why I've argued that optimizing in the frontend is too early - it
should all just be done in the backend, as additional calculations
can easily make their way into the flow. I realize that's impractical
for i965, though, as the backend is not SSA, and some opts are
necessary in GLSL in order to perform the necessary validation.)
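
To make the "lower as late as possible" point concrete, here is a toy
sketch. It is not Mesa, GLSL IR or NIR code; the node type and both
functions are made up for illustration. An algebraic pass that folds
x * 1.0 before lowering leaves fewer fp64 operations to expand into
integer math:

```c
#include <stddef.h>

/* Toy fp64 expression node; every name here is hypothetical. */
enum op { OP_CONST, OP_VAR, OP_MUL, OP_ADD };

struct expr {
    enum op op;
    double value;            /* used by OP_CONST only */
    struct expr *lhs, *rhs;  /* used by OP_MUL and OP_ADD only */
};

/* Algebraic pass: rewrite (x * 1.0) and (1.0 * x) to x. */
struct expr *simplify(struct expr *e)
{
    if (e->op != OP_MUL && e->op != OP_ADD)
        return e;

    e->lhs = simplify(e->lhs);
    e->rhs = simplify(e->rhs);

    if (e->op == OP_MUL) {
        if (e->rhs->op == OP_CONST && e->rhs->value == 1.0)
            return e->lhs;
        if (e->lhs->op == OP_CONST && e->lhs->value == 1.0)
            return e->rhs;
    }
    return e;
}

/* Number of fp64 operations a later lowering pass would have to
 * expand into softfloat integer sequences. */
int count_fp64_ops(const struct expr *e)
{
    if (e->op != OP_MUL && e->op != OP_ADD)
        return 0;
    return 1 + count_fp64_ops(e->lhs) + count_fp64_ops(e->rhs);
}
```

For (a * 1.0) + b the count goes from two fp64 ops to one when the
pass runs first. Had the multiply already been lowered, the same
pattern would be buried in a pile of integer instructions and much
harder to match.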

Cheers,

  -ilia

[1] There's also an effort currently underway to implement properly
accurate fp64 rcp/rsq/sqrt for Fermi and newer chips, but that will
likely end up as library functions in codegen, especially because it
will make use of nvidia-specific shader opcodes. I guess this may be
useful for the NVIDIA G200 chip to be able to expose
ARB_gpu_shader_fp64 (as it only supports addition and multiplication
natively), but I doubt there's a lot of demand for that.
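
For illustration only, one textbook way to get a reciprocal out of
nothing but multiply and add is Newton-Raphson refinement. The sketch
below is not the codegen work mentioned above and not code from this
series; rcp_newton and its single-precision seed are assumptions, and
it only handles finite, nonzero inputs whose magnitude fits in float
range:

```c
/* Hypothetical helper: approximate 1/x using only fp64 multiply and
 * add/subtract after an assumed single-precision seed. */
double rcp_newton(double x)
{
    /* Seed: single-precision reciprocal, good to roughly 24 bits. */
    double y = (double)(1.0f / (float)x);

    /* Each step y = y * (2 - x*y) roughly doubles the number of
     * correct bits, so two steps cover the 53-bit fp64 mantissa. */
    for (int i = 0; i < 2; i++)
        y = y * (2.0 - x * y);

    return y;
}
```

The result lands very close to 1/x but is not guaranteed to be
correctly rounded; getting the last bit right is where the extra
effort goes.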