[Mesa-dev] [RFC 0/9] Add precise/invariant semantics to TGSI

Mon Jun 12 23:57:31 UTC 2017

This looks like the right idea to me too. It may sound a bit weird to do
that per instruction, but d3d11 does that as well. (Some d3d versions
just have a global flag basically forbidding or allowing any such fast
math optimizations in the assembly, but I'm not actually sure everybody
honors that without tesselation...)

For 1/9:
Reviewed-by: Roland Scheidegger <sroland at vmware.com>

2/9 has a typo in the commit short log ("Instrutions").

FWIW surely on nv50 you could keep a single mad instruction for umad
(sad maybe too?). (I'm actually wondering if the hw really can't do
unfused float multiply+add as a single instruction but I know next to
nothing about nvidia hw...)

Roland

Am 12.06.2017 um 12:42 schrieb Nicolai Hähnle:
> On 11.06.2017 20:42, Karol Herbst wrote:
>> Running Tomb Raider on Nouveau I found some flicker caused by ignoring
>> precise
>> modifiers on variables inside Nouveau.
>>
>> This series add precise/invariant handling to TGSI, which can be then
>> used by
>> drivers to disable certain unsafe optimisations which may otherwise alter
>> calculations, which depend on having the same result across shaders.
> 
> It's kind of amazing that we got this far without doing this. On the
> radeonsi side, it's probably related to how conservative LLVM is.
> 
> But this series is a good idea, since it might allow us to become more
> aggressive with optimizations in radeonsi as well.
> 
> 
>> This series fixes this bug in Tomb Raider and one CTS test for 4.4 and
>> 4.5
>>
>> Note on Patch 3: I really dislike how I tell glsl_to_tgsi_visitor to
>> apply the
>> precise flag on instruction emited in ir_assignment->rhs->accept();
>> but I found
>> no other easy way to handle this. Maybe somebody of you has a better
>> idea?
> 
> Sent a suggestion, as well as comments on patches 4 & 5. Patches 1 & 2:
> 
> Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>
> 
> 
>>
>> Karol Herbst (9):
>>    tgsi: add precise flag to tgsi_instruction
>>    tgsi/dump: print _PRECISE modifier on Instrutions
>>    st/glsl_to_tgsi: handle precise modifier
>>    tgsi: populate precise
>>    tgsi/text: parse _PRECISE modifier
>>    nv50/ir: add precise field to Instruction
>>    nv50/ir/tgsi: handle precise for most ALU instructions
>>    nv50/ir: disable mul+add to mad for precise instructions
>>    nv50/ir/tgsi: split mad to mul+add
>>
>>   src/gallium/auxiliary/tgsi/tgsi_build.c            |  4 +
>>   src/gallium/auxiliary/tgsi/tgsi_dump.c             |  4 +
>>   src/gallium/auxiliary/tgsi/tgsi_text.c             | 15 +++-
>>   src/gallium/auxiliary/tgsi/tgsi_ureg.c             | 14 +++-
>>   src/gallium/auxiliary/tgsi/tgsi_ureg.h             | 20 ++++-
>>   src/gallium/auxiliary/util/u_simple_shaders.c      |  2 +-
>>   src/gallium/drivers/nouveau/codegen/nv50_ir.h      |  1 +
>>   .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 16 ++++
>>   .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   |  6 +-
>>   src/gallium/include/pipe/p_shader_tokens.h         |  3 +-
>>   src/gallium/state_trackers/nine/nine_shader.c      |  6 +-
>>   src/mesa/state_tracker/st_atifs_to_tgsi.c          | 38 ++++-----
>>   src/mesa/state_tracker/st_glsl_to_tgsi.cpp         | 92
>> +++++++++++++++++-----
>>   src/mesa/state_tracker/st_mesa_to_tgsi.c           |  8 +-
>>   src/mesa/state_tracker/st_pbo.c                    |  2 +-
>>   15 files changed, 172 insertions(+), 59 deletions(-)
>>
> 
>