[Mesa-dev] [PATCH 2/2] R600: Add a pattern for fma (fused multiply add)
Tom Stellard
tom at stellard.net
Fri Feb 8 06:48:56 PST 2013
On Fri, Feb 08, 2013 at 12:12:26PM +0100, Michel Dänzer wrote:
> On Don, 2013-02-07 at 17:23 -0500, Tom Stellard wrote:
> > From: Tom Stellard <thomas.stellard at amd.com>
> >
> > NOTE: This is a candidate for the Mesa stable branches
> > ---
> > lib/Target/R600/R600Instructions.td | 7 +++++++
> > 1 files changed, 7 insertions(+), 0 deletions(-)
> >
> > diff --git a/lib/Target/R600/R600Instructions.td b/lib/Target/R600/R600Instructions.td
> > index d307ed2..07ee6f0 100644
> > --- a/lib/Target/R600/R600Instructions.td
> > +++ b/lib/Target/R600/R600Instructions.td
> > @@ -1109,6 +1109,11 @@ class TGSI_LIT_Z_Common <InstR600 mul_lit, InstR600 log_clamped, InstR600 exp_ie
> > (exp_ieee (mul_lit (log_clamped (MAX R600_Reg32:$src_y, (f32 ZERO))), R600_Reg32:$src_w, R600_Reg32:$src_x))
> > >;
> >
> > +class FMAPat <InstR600 muladd> : Pat<
> > + (fma R600_Reg32:$src0, R600_Reg32:$src1, R600_Reg32:$src2),
> > + (muladd R600_Reg32:$src0, R600_Reg32:$src1, R600_Reg32:$src2)
> > +>;
> > +
> > //===----------------------------------------------------------------------===//
> > // R600 / R700 Instructions
> > //===----------------------------------------------------------------------===//
> > @@ -1167,6 +1172,7 @@ let Predicates = [isR600] in {
> > let Word1{31} = 1; // BARRIER
> > }
> > defm : SteamOutputExportPattern<R600_ExportBuf, 0x20, 0x21, 0x22, 0x23>;
> > + def : FMAPat <MULADD_r600>;
> > }
> >
> > // Helper pattern for normalizing inputs to triginomic instructions for R700+
> > @@ -1320,6 +1326,7 @@ let hasSideEffects = 1 in {
> > let Word1{31} = 1; // BARRIER
> > }
> > defm : SteamOutputExportPattern<EG_ExportBuf, 0x40, 0x41, 0x42, 0x43>;
> > + def : FMAPat <MULADD_eg>;
> >
> > //===----------------------------------------------------------------------===//
> > // Memory read/write instructions
>
> Do the MULADD instructions match the FMA semantics, i.e. they don't
> round/truncate the intermediate product but only the final sum?
>
They are different: MULADD does not have fused semantics. The hardware
does have a separate FMA instruction, which I hadn't noticed before, so
I'll use that instead.
-Tom
>
> Patch 1 looks nice and is
>
> Reviewed-by: Michel Dänzer <michel.daenzer at amd.com>
>
>
> --
> Earthling Michel Dänzer | http://www.amd.com
> Libre software enthusiast | Debian, X and DRI developer