[Mesa-dev] [PATCH v2 9/9] nv50/ir/tgsi: split mad to mul+add

Roland Scheidegger sroland at vmware.com
Tue Jun 13 00:17:33 UTC 2017


I actually think this should be handled differently.

IMHO MAD means the operation may be either fused or unfused.
This is the "traditional" definition of MAD. OpenCL, for instance,
follows it too, although this isn't mentioned in the gallium docs (it
probably should be).
(OpenCL says: "Whether or how the product of a * b is rounded and how
supernormal or subnormal intermediate products are handled is not
defined. mad is intended to be used where speed is preferred over
accuracy.")
I think doing something different here in gallium can only lead to
madness long term. GLSL doesn't have mad in the first place, and as far
as I can tell D3D10 is OK with fused/unfused mad too (the docs state:
"Fused operations (such as mad, dp3) produce results that are no less
accurate than the worst possible serial ordering of evaluation of the
unfused expansion of the operation.")
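
To make the fused/unfused difference concrete, here is a minimal sketch in
plain C++ (not Mesa code; the function names are made up for illustration).
It assumes IEEE-754 single precision, round-to-nearest-even, and no
-ffast-math.

```cpp
#include <cmath>

// Unfused: the product is rounded to float before the add.
// The volatile store keeps the compiler from contracting this into an FMA.
float unfused_mad(float a, float b, float c)
{
    volatile float product = a * b;
    return product + c;
}

// Fused: std::fma rounds only once, after computing the exact a * b + c.
float fused_mad(float a, float b, float c)
{
    return std::fma(a, b, c);
}
```

With a = b = 1 + 2^-12 and c = -(1 + 2^-11), the exact product is
1 + 2^-11 + 2^-24. The unfused version rounds that to 1 + 2^-11 and
returns 0, while the fused version returns 2^-24.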

This means that a mul+add must never be fused into a mad anywhere if
precise is specified. You should therefore never have to worry about
whether a mad is executed fused or unfused in the driver - it's enough
if you just don't fuse mul+add in the driver itself (if you can't do
unfused mad).
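
The rule could be sketched roughly like this. This is a hypothetical toy IR,
not nv50_ir; Op, Instr and fuseMulAdd are invented for illustration, and the
pass assumes the mul result has no other users.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy IR, just enough to show the rule.
enum class Op { Mul, Add, Mad };

struct Instr {
    Op op;
    int dst;       // destination value id
    int src[3];    // source value ids; src[2] is only used by Mad
    bool precise;  // mirrors the TGSI "precise" flag
};

// Fuse "t = a * b; d = t + c" into "d = mad(a, b, c)" only when neither
// instruction is precise. Assumes t has no other users (not checked here).
std::vector<Instr> fuseMulAdd(const std::vector<Instr> &in)
{
    std::vector<Instr> out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const Instr &mul = in[i];
        if (mul.op == Op::Mul && i + 1 < in.size()) {
            const Instr &add = in[i + 1];
            const bool feeds = add.op == Op::Add && add.src[0] == mul.dst;
            if (feeds && !mul.precise && !add.precise) {
                out.push_back({Op::Mad, add.dst,
                               {mul.src[0], mul.src[1], add.src[1]}, false});
                ++i;  // the add was consumed by the fusion
                continue;
            }
        }
        out.push_back(mul);
    }
    return out;
}
```

A precise mul+add pair passes through unchanged, so an incoming mad never
needs to be split just to honour precise.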

Roland


On 12.06.2017 at 20:19, Karol Herbst wrote:
> fixes
> KHR-GL44.gpu_shader5.precise_qualifier
> KHR-GL45.gpu_shader5.precise_qualifier
> 
> Signed-off-by: Karol Herbst <karolherbst at gmail.com>
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> index c633185893..cd45e82426 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
> @@ -3184,6 +3184,20 @@ Converter::handleInstruction(const struct tgsi_full_instruction *insn)
>        break;
>     case TGSI_OPCODE_MAD:
>     case TGSI_OPCODE_UMAD:
> +      FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
> +         val0 = getSSA();
> +         src0 = fetchSrc(0, c);
> +         src1 = fetchSrc(1, c);
> +         src2 = fetchSrc(2, c);
> +         geni = mkOp2(OP_MUL, dstTy, val0, src0, src1);
> +         if (dstTy == TYPE_F32)
> +            geni->dnz = info->io.mul_zero_wins;
> +         geni->precise = insn->Instruction.Precise;
> +
> +         geni = mkOp2(OP_ADD, dstTy, dst0[c], val0, src2);
> +         geni->precise = insn->Instruction.Precise;
> +      }
> +      break;
>     case TGSI_OPCODE_SAD:
>     case TGSI_OPCODE_FMA:
>        FOR_EACH_DST_ENABLED_CHANNEL(0, c, tgsi) {
> 