[Mesa-dev] [PATCH] radeon/llvm: Use llvm.AMDIL.exp intrinsic again for now

Thu Nov 19 09:11:45 PST 2015

On 19.11.2015 17:37, Tom Stellard wrote:
> On Thu, Nov 19, 2015 at 11:17:12AM +0100, Nicolai Hähnle wrote:
>> On 19.11.2015 03:55, Tom Stellard wrote:
>>> On Thu, Nov 19, 2015 at 11:31:55AM +0900, Michel Dänzer wrote:
>>>> From: Michel Dänzer <michel.daenzer at amd.com>
>>>>
>>>> llvm.exp2.f32 doesn't work in some cases yet.
>>>>
>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92709
>>>> Signed-off-by: Michel Dänzer <michel.daenzer at amd.com>
>>>> ---
>>>>
>>>> Once the problem is fixed in the LLVM AMDGPU backend, we can re-enable
>>>> llvm.exp2.f32 for the fixed LLVM version.
>>>>
>>>>   src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
>>>> index ac99e73..c94f109 100644
>>>> --- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
>>>> +++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
>>>> @@ -1539,7 +1539,7 @@ void radeon_llvm_context_init(struct radeon_llvm_context * ctx)
>>>>   	bld_base->op_actions[TGSI_OPCODE_ENDIF].emit = endif_emit;
>>>>   	bld_base->op_actions[TGSI_OPCODE_ENDLOOP].emit = endloop_emit;
>>>>   	bld_base->op_actions[TGSI_OPCODE_EX2].emit = build_tgsi_intrinsic_nomem;
>>>> -	bld_base->op_actions[TGSI_OPCODE_EX2].intr_name = "llvm.exp2.f32";
>>>> +	bld_base->op_actions[TGSI_OPCODE_EX2].intr_name = "llvm.AMDIL.exp.";
>>>
>>> Do we want a native instruction here, or do we want IEEE precise exp2?
>>> If it's the former then we shouldn't be using llvm.exp2.f32 anyway.
>>>
>>> I know that we need to use llvm.AMDIL.exp. for older LLVM, but for newer
>>> LLVM, I would really like to start doing intrinsics the correct way.  In
>>> this case, it means adding an llvm.amdgcn.exp.f32 intrinsic to
>>> include/llvm/IR/IntrinsicsAMDGPU.td.  In the section with the amdgcn
>>> TargetPrefix.
>>
>> Doesn't AMDGPU currently emit native instructions for the various
>> math intrinsics anyway? If your plan is to change that and we add
>> our own intrinsics, what's the plan to still benefit from existing
>> LLVM optimization passes?
>>
>
> AMDGPU does emit native instruction for the math intrinsics, but this is
> wrong in most cases, because it assumes that a less precise calculation is
> acceptable when this is not always the case.

Okay, that makes sense.

> This a good point about optimizations.  I think the main benefits we get from
> using the LLVM intrinsic are constant folding and also speculative execution.
>
> An alternate solution would be to use the llvm intrinsic, but then communicate
> to the backend via a function attribute that it is OK to use the less precise
> version of exp.

I like that alternate solution.

>> FWIW, the original bug that Michel addresses here is caused by LLVM
>> libcall optimizations replacing the exp2 _intrinsic_ by a ldexp
>> _libcall_, which AMDGPU codegen cannot deal with. I suggested to
>> address this with http://reviews.llvm.org/D14327. Could you take a
>> look at that?
>>
>
> Please take a look at the comment I added to this review.

Done :)

Cheers,
Nicolai

>
> -Tom
>
>> Cheers,
>> Nicolai