[Mesa-dev] [PATCH 2/2] R600: Add a pattern for fma (fused multiply add)

Fri Feb 8 10:27:02 PST 2013

On 08.02.2013 12:12, Michel Dänzer wrote:
> On Thu, 2013-02-07 at 17:23 -0500, Tom Stellard wrote:
>> From: Tom Stellard <thomas.stellard at amd.com>
>>
>> NOTE: This is a candidate for the Mesa stable branches
>> ---
>>  lib/Target/R600/R600Instructions.td |    7 +++++++
>>  1 files changed, 7 insertions(+), 0 deletions(-)
>>
>> diff --git a/lib/Target/R600/R600Instructions.td b/lib/Target/R600/R600Instructions.td
>> index d307ed2..07ee6f0 100644
>> --- a/lib/Target/R600/R600Instructions.td
>> +++ b/lib/Target/R600/R600Instructions.td
>> @@ -1109,6 +1109,11 @@ class TGSI_LIT_Z_Common <InstR600 mul_lit, InstR600 log_clamped, InstR600 exp_ie
>>    (exp_ieee (mul_lit (log_clamped (MAX R600_Reg32:$src_y, (f32 ZERO))), R600_Reg32:$src_w, R600_Reg32:$src_x))
>>  >;
>>  
>> +class FMAPat <InstR600 muladd> : Pat<
>> +  (fma R600_Reg32:$src0, R600_Reg32:$src1, R600_Reg32:$src2),
>> +  (muladd R600_Reg32:$src0, R600_Reg32:$src1, R600_Reg32:$src2)
>> +>;
>> +
>>  //===----------------------------------------------------------------------===//
>>  // R600 / R700 Instructions
>>  //===----------------------------------------------------------------------===//
>> @@ -1167,6 +1172,7 @@ let Predicates = [isR600] in {
>>      let Word1{31} = 1; // BARRIER
>>    }
>>    defm : SteamOutputExportPattern<R600_ExportBuf, 0x20, 0x21, 0x22, 0x23>;
>> +  def : FMAPat <MULADD_r600>;
>>  }
>>  
>>  // Helper pattern for normalizing inputs to triginomic instructions for R700+
>> @@ -1320,6 +1326,7 @@ let hasSideEffects = 1 in {
>>      let Word1{31} = 1; // BARRIER
>>    }
>>    defm : SteamOutputExportPattern<EG_ExportBuf, 0x40, 0x41, 0x42, 0x43>;
>> +  def : FMAPat <MULADD_eg>;
>>  
>>  //===----------------------------------------------------------------------===//
>>  // Memory read/write instructions
> 
> Do the MULADD instructions match the FMA semantics, i.e. they don't
> round/truncate the intermediate product but only the final sum?

In LLVM, muladd means "don't care" as far as intermediate rounding is
concerned. This is similar to OpenCL: if you want no intermediate
rounding you need to use fma, if you need intermediate rounding you use
separate mul/add, and if you don't care you use muladd. So turning
mul/add into muladd may or may not be OK, but for muladd itself you
should use whatever is fastest (unless you suspect someone turned a
mul/add into a muladd in a not quite rules-compliant fashion, in which
case using mul/add would be safer). Turning fma into mul/add, however,
is definitely not OK.
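
To make the rounding difference concrete, here is a minimal sketch in
Python (not R600 code, and using doubles rather than f32). The helper
names mul_add_unfused and fma_exact are hypothetical; the fused result
is emulated with exact rational arithmetic and a single final rounding,
which is an assumption about how to model fma, not how the hardware
implements it.

```python
from fractions import Fraction


def mul_add_unfused(a: float, b: float, c: float) -> float:
    """Separate mul and add: the product is rounded before the add."""
    product = a * b          # first rounding happens here
    return product + c       # second rounding happens here


def fma_exact(a: float, b: float, c: float) -> float:
    """Emulated fused multiply-add: compute a*b+c exactly, round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)      # the only rounding step


# Inputs chosen so that the exact product 1 - 2**-60 is not representable
# as a double and rounds to 1.0 when computed on its own.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

print("mul/add:", mul_add_unfused(a, b, c))
print("fma    :", fma_exact(a, b, c))
```

With these inputs the unfused version loses the low-order bits of the
product entirely, while the fused version keeps them. A muladd that
"doesn't care" is allowed to return either answer, which is why it must
not be used to implement fma unless the instruction really is fused.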

Roland

> 
> Patch 1 looks nice and is
> 
> Reviewed-by: Michel Dänzer <michel.daenzer at amd.com>
> 
>