[Mesa-dev] [PATCH] radeon/llvm: Use llvm.AMDIL.exp intrinsic again for now

Fri Nov 20 04:06:27 PST 2015

On 20.11.2015 03:22, Michel Dänzer wrote:
> On 20.11.2015 02:11, Nicolai Hähnle wrote:
>> On 19.11.2015 17:37, Tom Stellard wrote:
>>> On Thu, Nov 19, 2015 at 11:17:12AM +0100, Nicolai Hähnle wrote:
>>>> On 19.11.2015 03:55, Tom Stellard wrote:
>>>>> On Thu, Nov 19, 2015 at 11:31:55AM +0900, Michel Dänzer wrote:
>>>>>> From: Michel Dänzer <michel.daenzer at amd.com>
>>>>>>
>>>>>> llvm.exp2.f32 doesn't work in some cases yet.
>>>>>>
>>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92709
>>>>>> Signed-off-by: Michel Dänzer <michel.daenzer at amd.com>
>>>>>> ---
>>>>>>
>>>>>> Once the problem is fixed in the LLVM AMDGPU backend, we can re-enable
>>>>>> llvm.exp2.f32 for the fixed LLVM version.
>>>>>>
>>>>>>    src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 2 +-
>>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
>>>>>> b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
>>>>>> index ac99e73..c94f109 100644
>>>>>> --- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
>>>>>> +++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
>>>>>> @@ -1539,7 +1539,7 @@ void radeon_llvm_context_init(struct
>>>>>> radeon_llvm_context * ctx)
>>>>>>        bld_base->op_actions[TGSI_OPCODE_ENDIF].emit = endif_emit;
>>>>>>        bld_base->op_actions[TGSI_OPCODE_ENDLOOP].emit = endloop_emit;
>>>>>>        bld_base->op_actions[TGSI_OPCODE_EX2].emit =
>>>>>> build_tgsi_intrinsic_nomem;
>>>>>> -    bld_base->op_actions[TGSI_OPCODE_EX2].intr_name =
>>>>>> "llvm.exp2.f32";
>>>>>> +    bld_base->op_actions[TGSI_OPCODE_EX2].intr_name =
>>>>>> "llvm.AMDIL.exp.";
>>>>>
>>>>> Do we want a native instruction here, or do we want IEEE precise exp2?
>>>>> If it's the former then we shouldn't be using llvm.exp2.f32 anyway.
>>>>>
>>>>> I know that we need to use llvm.AMDIL.exp. for older LLVM, but for
>>>>> newer
>>>>> LLVM, I would really like to start doing intrinsics the correct
>>>>> way.  In
>>>>> this case, it means adding an llvm.amdgcn.exp.f32 intrinsic to
>>>>> include/llvm/IR/IntrinsicsAMDGPU.td.  In the section with the amdgcn
>>>>> TargetPrefix.
>>>>
>>>> Doesn't AMDGPU currently emit native instructions for the various
>>>> math intrinsics anyway? If your plan is to change that and we add
>>>> our own intrinsics, what's the plan to still benefit from existing
>>>> LLVM optimization passes?
>>>>
>>>
>>> AMDGPU does emit native instruction for the math intrinsics, but this is
>>> wrong in most cases, because it assumes that a less precise
>>> calculation is
>>> acceptable when this is not always the case.
>>
>> Okay, that makes sense.
>>
>>> This a good point about optimizations.  I think the main benefits we
>>> get from
>>> using the LLVM intrinsic are constant folding and also speculative
>>> execution.
>>>
>>> An alternate solution would be to use the llvm intrinsic, but then
>>> communicate
>>> to the backend via a function attribute that it is OK to use the less
>>> precise
>>> version of exp.
>>
>> I like that alternate solution.
>>
>>>> FWIW, the original bug that Michel addresses here is caused by LLVM
>>>> libcall optimizations replacing the exp2 _intrinsic_ by a ldexp
>>>> _libcall_, which AMDGPU codegen cannot deal with. I suggested to
>>>> address this with http://reviews.llvm.org/D14327. Could you take a
>>>> look at that?
>>>>
>>>
>>> Please take a look at the comment I added to this review.
>>
>> Done :)
>
> Meanwhile, can I get an R-b for this patch to fix the immediate
> regression in Mesa Git until LLVM is fixed (and for released versions of
> LLVM)?

Right, sorry about that:

Reviewed-by: Nicolai Hähnle <nicolai.haehnle at amd.com>

>
>