[Mesa-dev] [PATCH] radeon/llvm: Use llvm.AMDIL.exp intrinsic again for now

Thu Nov 19 18:22:31 PST 2015

On 20.11.2015 02:11, Nicolai Hähnle wrote:
> On 19.11.2015 17:37, Tom Stellard wrote:
>> On Thu, Nov 19, 2015 at 11:17:12AM +0100, Nicolai Hähnle wrote:
>>> On 19.11.2015 03:55, Tom Stellard wrote:
>>>> On Thu, Nov 19, 2015 at 11:31:55AM +0900, Michel Dänzer wrote:
>>>>> From: Michel Dänzer <michel.daenzer at amd.com>
>>>>>
>>>>> llvm.exp2.f32 doesn't work in some cases yet.
>>>>>
>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92709
>>>>> Signed-off-by: Michel Dänzer <michel.daenzer at amd.com>
>>>>> ---
>>>>>
>>>>> Once the problem is fixed in the LLVM AMDGPU backend, we can re-enable
>>>>> llvm.exp2.f32 for the fixed LLVM version.
>>>>>
>>>>>   src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 2 +-
>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
>>>>> b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
>>>>> index ac99e73..c94f109 100644
>>>>> --- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
>>>>> +++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
>>>>> @@ -1539,7 +1539,7 @@ void radeon_llvm_context_init(struct
>>>>> radeon_llvm_context * ctx)
>>>>>       bld_base->op_actions[TGSI_OPCODE_ENDIF].emit = endif_emit;
>>>>>       bld_base->op_actions[TGSI_OPCODE_ENDLOOP].emit = endloop_emit;
>>>>>       bld_base->op_actions[TGSI_OPCODE_EX2].emit =
>>>>> build_tgsi_intrinsic_nomem;
>>>>> -    bld_base->op_actions[TGSI_OPCODE_EX2].intr_name =
>>>>> "llvm.exp2.f32";
>>>>> +    bld_base->op_actions[TGSI_OPCODE_EX2].intr_name =
>>>>> "llvm.AMDIL.exp.";
>>>>
>>>> Do we want a native instruction here, or do we want IEEE precise exp2?
>>>> If it's the former then we shouldn't be using llvm.exp2.f32 anyway.
>>>>
>>>> I know that we need to use llvm.AMDIL.exp. for older LLVM, but for
>>>> newer
>>>> LLVM, I would really like to start doing intrinsics the correct
>>>> way.  In
>>>> this case, it means adding an llvm.amdgcn.exp.f32 intrinsic to
>>>> include/llvm/IR/IntrinsicsAMDGPU.td.  In the section with the amdgcn
>>>> TargetPrefix.
>>>
>>> Doesn't AMDGPU currently emit native instructions for the various
>>> math intrinsics anyway? If your plan is to change that and we add
>>> our own intrinsics, what's the plan to still benefit from existing
>>> LLVM optimization passes?
>>>
>>
>> AMDGPU does emit native instruction for the math intrinsics, but this is
>> wrong in most cases, because it assumes that a less precise
>> calculation is
>> acceptable when this is not always the case.
> 
> Okay, that makes sense.
> 
>> This a good point about optimizations.  I think the main benefits we
>> get from
>> using the LLVM intrinsic are constant folding and also speculative
>> execution.
>>
>> An alternate solution would be to use the llvm intrinsic, but then
>> communicate
>> to the backend via a function attribute that it is OK to use the less
>> precise
>> version of exp.
> 
> I like that alternate solution.
> 
>>> FWIW, the original bug that Michel addresses here is caused by LLVM
>>> libcall optimizations replacing the exp2 _intrinsic_ by a ldexp
>>> _libcall_, which AMDGPU codegen cannot deal with. I suggested to
>>> address this with http://reviews.llvm.org/D14327. Could you take a
>>> look at that?
>>>
>>
>> Please take a look at the comment I added to this review.
> 
> Done :)

Meanwhile, can I get an R-b for this patch to fix the immediate
regression in Mesa Git until LLVM is fixed (and for released versions of
LLVM)?

-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer