[Mesa-dev] [PATCH 4/5] gallium: Add PIPE_SHADER_CAP_DOUBLES

Wed Jun 18 19:08:22 PDT 2014

Am 19.06.2014 03:14, schrieb Dave Airlie:
> On 18 June 2014 23:50, Roland Scheidegger <sroland at vmware.com> wrote:
>> Am 18.06.2014 01:54, schrieb Dave Airlie:
>>> On 18 June 2014 05:08, Roland Scheidegger <sroland at vmware.com> wrote:
>>>> This looks ok to me though since tgsi currently doesn't have any double
>>>> opcodes (well the docs have them...) it doesn't really apply to most
>>>> drivers (at least I assume you don't want to add support for it for tgsi).
>>>
>>> I've mostly forward ported the old gallium double code, and have
>>> written most of ARB_gpu_shader_fp64 on top.
>>>
>>> Though the question I did want to ask Tom is whether he is just going to
>>> expose hw that has doubles, or whether
>>> he plans on emulating doubles.
>>>
>>> For a lot of GLSL 4.0 GPUs from AMD, fglrx emulates doubles using
>>> massive magic shaders. I'm unsure
>>> if we should have a lowering pass above/below the TGSI line for these
>>> types of situations and what that
>>> would mean for this CAP.
>>
>> Oh that's interesting. I always thought drivers didn't emulate that, and
>> if an app wants doubles but the device doesn't provide them, it needs to do
>> that itself. For which chips does fglrx do that?
> 
> Quite a lot of the evergreen family. Only CAYMAN and CYPRESS seem
> to have native FP64 support in the hw according to the version of the
> AMD shader compiler I'm using; all other VLIW4/5 chips seem to emulate
> fp64, I assume so they could advertise GL 4.0/SM5. They also expose
> the fp64 extension on rv670, rv790, rv770 and rv740 gpus.
Oh, I was mistakenly thinking it's optional (that is, an extension) for GL
4.0 (it is optional for SM5). That indeed explains why they'd emulate it...

> 
>> If you'd want to emulate this, the other question is if you can do it at
>> the tgsi level, or if this was exploiting some hw specific bits (well of
>> course you could still do it at tgsi level, but if the hw has some bits
>> to make this easier, then this isn't efficient). In any case I guess
>> this could be decided later.
> 
> Yeah, I'm not sure where the best place to lower it would be. Doing it at
> the GLSL level might be more generic, but I'm not really sure what algorithm
> fglrx uses to do it, and so whether it takes advantage of other hw features to help.
> 
> Dave
> 

I guess if the hw really doesn't have anything to help with it, a generic
translation is probably fine (though of course with vliw maybe you could
do better in the driver even if the hw doesn't have anything special for
it).
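
To make concrete what a generic, hw-independent lowering has to work with,
here is a minimal sketch in C. It is only an illustration, not what fglrx or
any Mesa pass actually does: it assumes IEEE 754 binary64 doubles and treats
each double as a pair of 32-bit words, so that trivial operations such as
negate and abs become plain integer ops on the high word. The names
(struct dpair, dpair_neg, dpair_abs, ...) are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical representation: one 64-bit double carried as two 32-bit
 * words, as a 32-bit-channel IR would have to do. Assumes IEEE 754
 * binary64, so the sign bit is bit 31 of the high word. */
struct dpair {
    uint32_t lo; /* bits 0..31 of the double  */
    uint32_t hi; /* bits 32..63 of the double */
};

/* Split a double into its two 32-bit halves (bit pattern preserved). */
static struct dpair dpair_from_double(double d)
{
    uint64_t bits;
    struct dpair p;

    memcpy(&bits, &d, sizeof bits);
    p.lo = (uint32_t)bits;
    p.hi = (uint32_t)(bits >> 32);
    return p;
}

/* Reassemble the double from its two 32-bit halves. */
static double dpair_to_double(struct dpair p)
{
    uint64_t bits = ((uint64_t)p.hi << 32) | p.lo;
    double d;

    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Negate: flip the sign bit, which lives in the high word. */
static struct dpair dpair_neg(struct dpair p)
{
    p.hi ^= 0x80000000u;
    return p;
}

/* Absolute value: clear the sign bit in the high word. */
static struct dpair dpair_abs(struct dpair p)
{
    p.hi &= 0x7fffffffu;
    return p;
}
```

Add, mul and friends are of course where the "massive" part comes in, since
they need exponent alignment, multi-word mantissa arithmetic and rounding on
top of this representation.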

Roland