[Mesa-dev] [PATCH] radeonsi: implement legacy RCP behavior to fix Unreal engine demos
Marek Olšák
maraeo at gmail.com
Thu Dec 4 09:52:24 PST 2014
Returning FLT_MAX instead of 0 also works, which is similar to another
hw instruction: V_RCP_CLAMP_F32.
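For reference, here is a minimal C sketch of the three behaviours being discussed. The function names are hypothetical. It only models the semantics described in this thread (the rule from the patch comment and the FLT_MAX variant), not the actual hardware instructions or the driver code, and ignoring the sign of zero in the FLT_MAX variant is an assumption:

```c
#include <float.h>

/* Hypothetical helper names; plain C model, not driver or hardware code. */

/* IEEE 754 reciprocal: 1 / +-0 yields +-infinity. */
static float rcp_ieee(float x)
{
    return 1.0f / x;
}

/* Legacy rule from the patch comment: (x != +-0) ? 1 / x : 0. */
static float rcp_legacy(float x)
{
    return (x != 0.0f) ? 1.0f / x : 0.0f;
}

/* Variant mentioned above: return FLT_MAX for a zero input.
 * Assumption: the sign of the zero is ignored. */
static float rcp_flt_max(float x)
{
    return (x != 0.0f) ? 1.0f / x : FLT_MAX;
}
```

Since -0.0f compares equal to 0.0f in C, both zeros take the special path in the last two functions.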
The Unreal engine is a pretty big target with a lot of apps out there.
I'm afraid a driconf option isn't feasible.
Marek
On Thu, Dec 4, 2014 at 5:39 PM, Roland Scheidegger <sroland at vmware.com> wrote:
> Hmm, I have to say I'm not really convinced by that solution. Because all
> divs are lowered, this will screw up the results of all divs (if the rcp
> came from a legacy arb_fp rcp, that would be different, and quite
> possibly some apps depend on it; problems like that are very common
> for d3d9 apps too). But these days there is an expectation that things
> conform to IEEE 754 rules, and making divs by zero return 0 ain't so hot.
> Maybe it's the div lowering itself that causes this, in which case it
> should probably be disabled? That might be a good idea anyway (if the driver
> supports native div), since rcp usually isn't accurate.
> It's difficult to tell, though, without seeing the GLSL and TGSI shaders. But
> if the app really was expecting zero out of a div by zero (and I have
> doubts about that), I'd certainly classify that as an app bug, and any
> workaround should only be enabled by some driconf option.
>
> Roland
>
>
>
> On 04.12.2014 at 13:34, Marek Olšák wrote:
>> From: Marek Olšák <marek.olsak at amd.com>
>>
>> Discussion: https://bugs.freedesktop.org/show_bug.cgi?id=83510#c8
>> ---
>> src/gallium/drivers/radeonsi/si_shader.c | 27 +++++++++++++++++++++++++++
>> 1 file changed, 27 insertions(+)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
>> index 973bac2..e0799c9 100644
>> --- a/src/gallium/drivers/radeonsi/si_shader.c
>> +++ b/src/gallium/drivers/radeonsi/si_shader.c
>> @@ -2744,6 +2744,32 @@ static int si_generate_gs_copy_shader(struct si_screen *sscreen,
>> return r;
>> }
>>
>> +/**
>> + * This emulates V_RCP_LEGACY_F32, which has the following rule for division
>> + * by zero: 1 / 0 = 0
>> + *
>> + * V_RCP_F32(x) = 1 / x
>> + * V_RCP_LEGACY_F32(x) = (x != +-0) ? V_RCP_F32(x) : 0.
>> + */
>> +static void si_llvm_emit_rcp_legacy(const struct lp_build_tgsi_action * action,
>> + struct lp_build_tgsi_context * bld_base,
>> + struct lp_build_emit_data * emit_data)
>> +{
>> + LLVMValueRef cmp =
>> + lp_build_cmp(&bld_base->base,
>> + PIPE_FUNC_NOTEQUAL,
>> + emit_data->args[0],
>> + bld_base->base.zero);
>> +
>> + LLVMValueRef div =
>> + lp_build_emit_llvm_binary(bld_base, TGSI_OPCODE_DIV,
>> + bld_base->base.one,
>> + emit_data->args[0]);
>> +
>> + emit_data->output[emit_data->chan] =
>> + lp_build_select(&bld_base->base, cmp, div, bld_base->base.zero);
>> +}
>> +
>> int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
>> {
>> struct si_shader_selector *sel = shader->selector;
>> @@ -2798,6 +2824,7 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
>> bld_base->op_actions[TGSI_OPCODE_MIN].emit = build_tgsi_intrinsic_nomem;
>> bld_base->op_actions[TGSI_OPCODE_MIN].intr_name = "llvm.minnum.f32";
>> }
>> + bld_base->op_actions[TGSI_OPCODE_RCP].emit = si_llvm_emit_rcp_legacy;
>>
>> si_shader_ctx.radeon_bld.load_system_value = declare_system_value;
>> si_shader_ctx.tokens = sel->tokens;
>>
>