[Mesa-dev] [PATCH 2/2] radeonsi: Allow dumping LLVM IR before optimization passes

Fri Feb 5 14:17:00 UTC 2016

On Fri, Feb 5, 2016 at 2:55 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
> On 04.02.2016 13:52, Tom Stellard wrote:
>>
>> On Thu, Feb 04, 2016 at 09:15:26AM +0100, Nicolai Hähnle wrote:
>>>
>>> From: Nicolai Hähnle <nicolai.haehnle at amd.com>
>>>
>>> Set R600_DEBUG=preoptir to dump the LLVM IR before optimization passes,
>>> to allow diagnosing problems caused by optimization passes.
>>>
>>> Note that in order to compile the resulting IR with llc, you will first
>>> have to run at least the mem2reg pass, e.g.
>>>
>>> opt -mem2reg -S < shader.ll | llc -march=amdgcn -mcpu=bonaire
>>>
>>> Signed-off-by: Michel Dänzer <michel.daenzer at amd.com> (original patch)
>>> Signed-off-by: Nicolai Hähnle <nicolai.haehnle at amd.com> (w/ debug flag)
>>> ---
>>> Having the option is a good idea, but I prefer to have a separate debug
>>> flag for it so that when you try to analyze bugs in codegen (which in
>>> my experience happens more often) you don't have to worry about
>>> replicating the exact same sequence of optimizations manually via the
>>> command line to reproduce the problem there.
>>>
>>>   src/gallium/drivers/radeon/r600_pipe_common.c |  1 +
>>>   src/gallium/drivers/radeon/r600_pipe_common.h |  1 +
>>>   src/gallium/drivers/radeonsi/si_shader.c      | 16 ++++++++++++++--
>>>   3 files changed, 16 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c
>>> index c827dbd..a1432ed 100644
>>> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
>>> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
>>> @@ -393,6 +393,7 @@ static const struct debug_named_value common_debug_options[] = {
>>>         { "noir", DBG_NO_IR, "Don't print the LLVM IR"},
>>>         { "notgsi", DBG_NO_TGSI, "Don't print the TGSI"},
>>>         { "noasm", DBG_NO_ASM, "Don't print disassembled shaders"},
>>> +       { "preoptir", DBG_PREOPT_IR, "Print the LLVM IR before initial optimizations" },
>>>
>>>         /* features */
>>>         { "nodma", DBG_NO_ASYNC_DMA, "Disable asynchronous DMA" },
>>> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h
>>> index c7e4c44..4e36631 100644
>>> --- a/src/gallium/drivers/radeon/r600_pipe_common.h
>>> +++ b/src/gallium/drivers/radeon/r600_pipe_common.h
>>> @@ -71,6 +71,7 @@
>>>   #define DBG_NO_IR             (1 << 12)
>>>   #define DBG_NO_TGSI           (1 << 13)
>>>   #define DBG_NO_ASM            (1 << 14)
>>> +#define DBG_PREOPT_IR          (1 << 15)
>>>   /* Bits 21-31 are reserved for the r600g driver. */
>>>   /* features */
>>>   #define DBG_NO_ASYNC_DMA      (1llu << 32)
>>> diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
>>> index 8b524cf..d9ed6b2 100644
>>> --- a/src/gallium/drivers/radeonsi/si_shader.c
>>> +++ b/src/gallium/drivers/radeonsi/si_shader.c
>>> @@ -4092,7 +4092,7 @@ int si_compile_llvm(struct si_screen *sscreen,
>>>         if (r600_can_dump_shader(&sscreen->b, processor)) {
>>>                 fprintf(stderr, "radeonsi: Compiling shader %d\n", count);
>>>
>>> -               if (!(sscreen->b.debug_flags & DBG_NO_IR))
>>> +               if (!(sscreen->b.debug_flags & (DBG_NO_IR | DBG_PREOPT_IR)))
>>>                         LLVMDumpModule(mod);
>>>         }
>>>
>>> @@ -4178,6 +4178,12 @@ static int si_generate_gs_copy_shader(struct si_screen *sscreen,
>>>         si_llvm_export_vs(bld_base, outputs, gsinfo->num_outputs);
>>>
>>>         LLVMBuildRetVoid(bld_base->base.gallivm->builder);
>>> +
>>> +       /* Dump LLVM IR before any optimization passes */
>>> +       if (sscreen->b.debug_flags & DBG_PREOPT_IR &&
>>> +           r600_can_dump_shader(&sscreen->b, TGSI_PROCESSOR_GEOMETRY))
>>> +               LLVMDumpModule(bld_base->base.gallivm->module);
>>> +
>>>         radeon_llvm_finalize_module(&si_shader_ctx->radeon_bld);
>>>
>>>         if (dump)
>>> @@ -4385,9 +4391,15 @@ int si_shader_create(struct si_screen *sscreen, LLVMTargetMachineRef tm,
>>>         }
>>>
>>>         LLVMBuildRetVoid(bld_base->base.gallivm->builder);
>>> +       mod = bld_base->base.gallivm->module;
>>> +
>>> +       /* Dump LLVM IR before any optimization passes */
>>> +       if (sscreen->b.debug_flags & DBG_PREOPT_IR &&
>>> +           r600_can_dump_shader(&sscreen->b, si_shader_ctx.type))
>>> +               LLVMDumpModule(mod);
>>> +
>>
>>
>> Is there any reason not to add the dump in radeon_llvm_finalize_module()
>> after PromoteMem2Reg has run?  This would make the output readable by llc
>> and then you would only need to add the dump call in one place.
>
>
> In addition to Michel's observation, that's not really possible anyway
> because all the passes are run at once from LLVMRunFunctionPassManager; the
> functions called before it just set things up.
>
> I did consider doing the dump from radeon_llvm_finalize_module, but the
> function doesn't have (and probably shouldn't have) the information needed
> to make the decision whether to dump or not, so IMO it's cleaner this way.

Alternatively, you can add a radeonsi wrapper function around
radeon_llvm_finalize_module which also does the dumping.

Marek
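
The wrapper idea above could look roughly like the sketch below. It is only an
illustration of the pattern, not code from the patch: the mock_* types and
functions are hypothetical stand-ins for the screen, the LLVM module,
LLVMDumpModule() and radeon_llvm_finalize_module(), and the wrapper name
si_finalize_module_with_dump is made up. Only the DBG_PREOPT_IR bit value is
taken from the patch. The point is that the dump decision stays on the
radeonsi side while the shared finalize function remains unaware of debug
flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Same bit as in the patch; everything else below is hypothetical
 * scaffolding so the sketch builds without Mesa or LLVM headers. */
#define DBG_PREOPT_IR (1u << 15)

struct mock_screen {
	uint64_t debug_flags;
	bool can_dump;        /* stands in for r600_can_dump_shader() */
};

struct mock_ctx {
	const char *ir_text;  /* stands in for the LLVM module */
	int preopt_dumps;     /* dumps taken before finalization */
	bool finalized;       /* set once the passes have been run */
};

/* Stand-in for LLVMDumpModule(): prints the IR to stderr. */
static void mock_dump_module(struct mock_ctx *ctx)
{
	fprintf(stderr, "%s\n", ctx->ir_text);
	if (!ctx->finalized)
		ctx->preopt_dumps++;
}

/* Stand-in for radeon_llvm_finalize_module(): it has no knowledge of
 * debug flags, which is the property the thread wants to preserve. */
static void mock_finalize_module(struct mock_ctx *ctx)
{
	ctx->finalized = true;
}

/* Hypothetical radeonsi wrapper: dump first if requested, then finalize.
 * Both call sites in si_shader.c could call this instead of repeating
 * the flag check. */
static void si_finalize_module_with_dump(const struct mock_screen *sscreen,
					 struct mock_ctx *ctx)
{
	assert(sscreen && ctx);

	if ((sscreen->debug_flags & DBG_PREOPT_IR) && sscreen->can_dump)
		mock_dump_module(ctx);

	mock_finalize_module(ctx);
}
```

With such a wrapper, the dump call lives in one place, as Tom asked, without
moving the flag check into the shared radeon code.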