[Mesa-dev] [PATCH] radeonsi: implement legacy RCP behavior to fix Unreal engine demos

Marek Olšák maraeo at gmail.com
Thu Dec 4 09:52:24 PST 2014


Returning FLT_MAX instead of 0 also works, which is similar to another
hw instruction: V_RCP_CLAMP_F32.
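
For illustration, here is a minimal C sketch of the three reciprocal behaviors under discussion. The function names are hypothetical and appear neither in Mesa nor in the patch. rcp_legacy follows the rule stated in the patch comment. rcp_clamp is only an assumption about what clamping to FLT_MAX would look like, not a description of the actual hardware instruction. div_lowered_legacy shows why a lowered division (a * rcp(b)) inherits the legacy result.

```c
#include <float.h>

/* Plain IEEE 754 reciprocal: 1/+0 is +inf, 1/-0 is -inf. */
static float rcp_ieee(float x)
{
	return 1.0f / x;
}

/* Rule from the patch comment: (x != +-0) ? 1/x : 0.
 * Both +0 and -0 compare equal to 0.0f, so both return 0. */
static float rcp_legacy(float x)
{
	return x != 0.0f ? 1.0f / x : 0.0f;
}

/* Assumed clamp variant: an infinite result is clamped to +-FLT_MAX.
 * This is a sketch, not the documented V_RCP_CLAMP_F32 semantics. */
static float rcp_clamp(float x)
{
	float r = 1.0f / x;

	if (r > FLT_MAX)
		return FLT_MAX;
	if (r < -FLT_MAX)
		return -FLT_MAX;
	return r; /* finite values and NaN pass through unchanged */
}

/* A division lowered to a multiply by the legacy reciprocal. */
static float div_lowered_legacy(float a, float b)
{
	return a * rcp_legacy(b);
}
```

With these definitions, div_lowered_legacy(3.0f, 0.0f) evaluates to 0 where an IEEE division would give +inf, which is the effect on all lowered divs that the quoted review below objects to.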

The Unreal engine is a pretty big target with a lot of apps out there.
I'm afraid a driconf option isn't feasible.

Marek

On Thu, Dec 4, 2014 at 5:39 PM, Roland Scheidegger <sroland at vmware.com> wrote:
> Hmm, I have to say I'm not really convinced by that solution. Because all
> divs are lowered, this will screw up the results of all divs. (If the rcp
> came from a legacy arb_fp rcp, that would be different, and quite
> possibly some apps depend on it; problems like that are very common
> for d3d9 apps too.) But these days there is an expectation that things conform
> to IEEE 754 rules, and making divs by zero return 0 ain't so hot.
> Maybe it's the div lowering itself which causes this, in which case it
> should probably be disabled? That might be a good idea anyway (if the driver
> supports native div), since rcp usually isn't accurate.
> It's difficult to tell, though, without seeing the GLSL and TGSI shaders. But if
> the app really was expecting zero out of a div by zero (which I
> doubt), I'd certainly classify that as an app bug, and any
> workaround should only be enabled by some driconf option.
>
> Roland
>
>
>
> On 04.12.2014 at 13:34, Marek Olšák wrote:
>> From: Marek Olšák <marek.olsak at amd.com>
>>
>> Discussion: https://bugs.freedesktop.org/show_bug.cgi?id=83510#c8
>> ---
>>  src/gallium/drivers/radeonsi/si_shader.c | 27 +++++++++++++++++++++++++++
>>  1 file changed, 27 insertions(+)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
>> index 973bac2..e0799c9 100644
>> --- a/src/gallium/drivers/radeonsi/si_shader.c
>> +++ b/src/gallium/drivers/radeonsi/si_shader.c
>> @@ -2744,6 +2744,32 @@ static int si_generate_gs_copy_shader(struct si_screen *sscreen,
>>       return r;
>>  }
>>
>> +/**
>> + * This emulates V_RCP_LEGACY_F32, which has the following rule for division
>> + * by zero: 1 / 0 = 0
>> + *
>> + * V_RCP_F32(x) = 1 / x
>> + * V_RCP_LEGACY_F32(x) = (x != +-0) ? V_RCP_F32(x) : 0.
>> + */
>> +static void si_llvm_emit_rcp_legacy(const struct lp_build_tgsi_action * action,
>> +                                 struct lp_build_tgsi_context * bld_base,
>> +                                 struct lp_build_emit_data * emit_data)
>> +{
>> +     LLVMValueRef cmp =
>> +             lp_build_cmp(&bld_base->base,
>> +                          PIPE_FUNC_NOTEQUAL,
>> +                          emit_data->args[0],
>> +                          bld_base->base.zero);
>> +
>> +     LLVMValueRef div =
>> +             lp_build_emit_llvm_binary(bld_base, TGSI_OPCODE_DIV,
>> +                                       bld_base->base.one,
>> +                                       emit_data->args[0]);
>> +
>> +     emit_data->output[emit_data->chan] =
>> +             lp_build_select(&bld_base->base, cmp, div, bld_base->base.zero);
>> +}
>> +
>>  int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
>>  {
>>       struct si_shader_selector *sel = shader->selector;
>> @@ -2798,6 +2824,7 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
>>               bld_base->op_actions[TGSI_OPCODE_MIN].emit = build_tgsi_intrinsic_nomem;
>>               bld_base->op_actions[TGSI_OPCODE_MIN].intr_name = "llvm.minnum.f32";
>>       }
>> +     bld_base->op_actions[TGSI_OPCODE_RCP].emit = si_llvm_emit_rcp_legacy;
>>
>>       si_shader_ctx.radeon_bld.load_system_value = declare_system_value;
>>       si_shader_ctx.tokens = sel->tokens;
>>
>