[Mesa-dev] [PATCH] radeon/llvm: Use llvm.AMDIL.exp intrinsic again for now

Thu Nov 19 02:17:12 PST 2015

On 19.11.2015 03:55, Tom Stellard wrote:
> On Thu, Nov 19, 2015 at 11:31:55AM +0900, Michel Dänzer wrote:
>> From: Michel Dänzer <michel.daenzer at amd.com>
>>
>> llvm.exp2.f32 doesn't work in some cases yet.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92709
>> Signed-off-by: Michel Dänzer <michel.daenzer at amd.com>
>> ---
>>
>> Once the problem is fixed in the LLVM AMDGPU backend, we can re-enable
>> llvm.exp2.f32 for the fixed LLVM version.
>>
>>   src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
>> index ac99e73..c94f109 100644
>> --- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
>> +++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
>> @@ -1539,7 +1539,7 @@ void radeon_llvm_context_init(struct radeon_llvm_context * ctx)
>>   	bld_base->op_actions[TGSI_OPCODE_ENDIF].emit = endif_emit;
>>   	bld_base->op_actions[TGSI_OPCODE_ENDLOOP].emit = endloop_emit;
>>   	bld_base->op_actions[TGSI_OPCODE_EX2].emit = build_tgsi_intrinsic_nomem;
>> -	bld_base->op_actions[TGSI_OPCODE_EX2].intr_name = "llvm.exp2.f32";
>> +	bld_base->op_actions[TGSI_OPCODE_EX2].intr_name = "llvm.AMDIL.exp.";
>
> Do we want a native instruction here, or do we want IEEE precise exp2?
> If it's the former then we shouldn't be using llvm.exp2.f32 anyway.
>
> I know that we need to use llvm.AMDIL.exp. for older LLVM, but for newer
> LLVM, I would really like to start doing intrinsics the correct way.  In
> this case, it means adding an llvm.amdgcn.exp.f32 intrinsic to
> include/llvm/IR/IntrinsicsAMDGPU.td.  In the section with the amdgcn
> TargetPrefix.

Doesn't AMDGPU currently emit native instructions for the various math 
intrinsics anyway? If your plan is to change that and we add our own 
intrinsics, what's the plan to still benefit from existing LLVM 
optimization passes?

FWIW, the original bug that Michel addresses here is caused by LLVM 
libcall optimizations replacing the exp2 _intrinsic_ by a ldexp 
_libcall_, which AMDGPU codegen cannot deal with. I suggested to address 
this with http://reviews.llvm.org/D14327. Could you take a look at that?

Cheers,
Nicolai