[Nouveau] [Mesa-dev] [RFC 0/9] Add precise/invariant semantics to TGSI

Tue Jun 13 00:01:43 UTC 2017

Am 13.06.2017 um 01:57 schrieb Roland Scheidegger:
> This looks like the right idea to me too. It may sound a bit weird to do
> that per instruction, but d3d11 does that as well. (Some d3d versions
> just have a global flag basically forbidding or allowing any such fast
> math optimizations in the assembly, but I'm not actually sure everybody
> honors that without tessellation...)
> 
> For 1/9:
> Reviewed-by: Roland Scheidegger <sroland at vmware.com>

I forgot to mention: could you add some bits to the gallium docs
(source/tgsi.rst) for this? Not sure where, maybe under Modifiers or some
such.

Roland

> 
> 2/9 has a typo in the commit short log ("Instrutions").
> 
> FWIW surely on nv50 you could keep a single mad instruction for umad
> (sad maybe too?). (I'm actually wondering if the hw really can't do
> unfused float multiply+add as a single instruction but I know next to
> nothing about nvidia hw...)
> 
> Roland
> 
> Am 12.06.2017 um 12:42 schrieb Nicolai Hähnle:
>> On 11.06.2017 20:42, Karol Herbst wrote:
>>> Running Tomb Raider on Nouveau I found some flicker caused by ignoring
>>> precise modifiers on variables inside Nouveau.
>>>
>>> This series adds precise/invariant handling to TGSI, which can then be
>>> used by drivers to disable certain unsafe optimisations which may
>>> otherwise alter calculations that depend on having the same result
>>> across shaders.
>>
>> It's kind of amazing that we got this far without doing this. On the
>> radeonsi side, it's probably related to how conservative LLVM is.
>>
>> But this series is a good idea, since it might allow us to become more
>> aggressive with optimizations in radeonsi as well.
>>
>>
>>> This series fixes this bug in Tomb Raider and one CTS test for 4.4 and 4.5.
>>>
>>> Note on Patch 3: I really dislike how I tell glsl_to_tgsi_visitor to
>>> apply the precise flag on instructions emitted in
>>> ir_assignment->rhs->accept(); but I found no other easy way to handle
>>> this. Maybe one of you has a better idea?
>>
>> Sent a suggestion, as well as comments on patches 4 & 5. Patches 1 & 2:
>>
>> Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
>>
>>
>>>
>>> Karol Herbst (9):
>>>    tgsi: add precise flag to tgsi_instruction
>>>    tgsi/dump: print _PRECISE modifier on Instrutions
>>>    st/glsl_to_tgsi: handle precise modifier
>>>    tgsi: populate precise
>>>    tgsi/text: parse _PRECISE modifier
>>>    nv50/ir: add precise field to Instruction
>>>    nv50/ir/tgsi: handle precise for most ALU instructions
>>>    nv50/ir: disable mul+add to mad for precise instructions
>>>    nv50/ir/tgsi: split mad to mul+add
>>>
>>>   src/gallium/auxiliary/tgsi/tgsi_build.c            |  4 +
>>>   src/gallium/auxiliary/tgsi/tgsi_dump.c             |  4 +
>>>   src/gallium/auxiliary/tgsi/tgsi_text.c             | 15 +++-
>>>   src/gallium/auxiliary/tgsi/tgsi_ureg.c             | 14 +++-
>>>   src/gallium/auxiliary/tgsi/tgsi_ureg.h             | 20 ++++-
>>>   src/gallium/auxiliary/util/u_simple_shaders.c      |  2 +-
>>>   src/gallium/drivers/nouveau/codegen/nv50_ir.h      |  1 +
>>>   .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 16 ++++
>>>   .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   |  6 +-
>>>   src/gallium/include/pipe/p_shader_tokens.h         |  3 +-
>>>   src/gallium/state_trackers/nine/nine_shader.c      |  6 +-
>>>   src/mesa/state_tracker/st_atifs_to_tgsi.c          | 38 ++++-----
>>>   src/mesa/state_tracker/st_glsl_to_tgsi.cpp         | 92 +++++++++++++++++-----
>>>   src/mesa/state_tracker/st_mesa_to_tgsi.c           |  8 +-
>>>   src/mesa/state_tracker/st_pbo.c                    |  2 +-
>>>   15 files changed, 172 insertions(+), 59 deletions(-)
>>>
>>
>>
>