[Mesa-dev] [PATCH] radeonsi: implement legacy RCP behavior to fix Unreal engine demos

Jose Fonseca jfonseca at vmware.com
Thu Dec 4 10:53:44 PST 2014


I'm also concerned that this sort of ad-hoc reinterpretation of opcode 
semantics will come back to bite us later, as different state trackers might 
want different semantics.

I think we might need to redefine the TGSI_OPCODE_RCP opcode or introduce a 
TGSI_OPCODE_RCP_LEGACY opcode.
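
To make the distinction concrete, here is a minimal C sketch of the
reciprocal semantics mentioned in this thread. The function names are
hypothetical and are not Mesa or TGSI API. The legacy rule is the one quoted
in the patch comment; the sign handling of the clamped variant is my
assumption, since the thread only says it returns FLT_MAX instead of 0. The
sketch also assumes IEEE 754 float arithmetic (no -ffast-math).

```c
#include <float.h>
#include <math.h>

/* IEEE 754 style reciprocal: 1/+0 is +inf, 1/-0 is -inf. */
static float rcp_ieee(float x)
{
    return 1.0f / x;
}

/* Legacy rule from the patch comment: (x != +-0) ? 1/x : 0. */
static float rcp_legacy(float x)
{
    return (x != 0.0f) ? 1.0f / x : 0.0f;
}

/* Clamped variant: FLT_MAX instead of 0 for a zero input.
 * Assumption: the result takes the sign of the zero. */
static float rcp_clamp(float x)
{
    if (x != 0.0f)
        return 1.0f / x;
    return signbit(x) ? -FLT_MAX : FLT_MAX;
}
```

A state tracker that wants IEEE results and one that wants the legacy rule
cannot both be served by a single opcode whose meaning the driver picks,
which is the case for separate opcodes.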

Also, do we know exactly what shape this division by zero takes in 
the incoming GLSL shader?  For example, could this be a 0/0 caused by 
calling GLSL's normalize() on a zero-length vector, and should this 
special RCP be used exclusively in the lowering of that built-in function?

Jose


On 04/12/14 17:52, Marek Olšák wrote:
> Returning FLT_MAX instead of 0 also works, which is similar to another
> hw instruction: V_RCP_CLAMP_F32.
>
> The Unreal engine is a pretty big target with a lot of apps out there.
> I'm afraid a driconf option isn't feasible.
>
> Marek
>
> On Thu, Dec 4, 2014 at 5:39 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>> Hmm, I have to say I'm not really convinced by that solution. Because all
>> divs are lowered, this will screw up the results of all divs (if the rcp
>> came from a legacy arb_fp rcp that would be different, and quite
>> possibly some apps depend on it; problems like that are very common
>> for d3d9 apps too). But really there's some expectation that stuff conforms
>> to ieee754 rules these days, and making divs by zero return 0 ain't so hot.
>> Maybe it's the div lowering itself which causes this, in which case it
>> should probably be disabled? Might be a good idea anyway (if the driver
>> supports native div) since rcp usually isn't accurate.
>> Difficult to tell though without seeing the glsl and tgsi shader. But if
>> it was really the app expecting zero out of a div by zero (but I have
>> doubts about that), I'd certainly classify that as an app bug, and any
>> workarounds should only be enabled by some driconf option.
>>
>> Roland
>>
>>
>>
>> Am 04.12.2014 um 13:34 schrieb Marek Olšák:
>>> From: Marek Olšák <marek.olsak at amd.com>
>>>
>>> Discussion: https://bugs.freedesktop.org/show_bug.cgi?id=83510#c8
>>> ---
>>>   src/gallium/drivers/radeonsi/si_shader.c | 27 +++++++++++++++++++++++++++
>>>   1 file changed, 27 insertions(+)
>>>
>>> diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
>>> index 973bac2..e0799c9 100644
>>> --- a/src/gallium/drivers/radeonsi/si_shader.c
>>> +++ b/src/gallium/drivers/radeonsi/si_shader.c
>>> @@ -2744,6 +2744,32 @@ static int si_generate_gs_copy_shader(struct si_screen *sscreen,
>>>        return r;
>>>   }
>>>
>>> +/**
>>> + * This emulates V_RCP_LEGACY_F32, which has the following rule for division
>>> + * by zero: 1 / 0 = 0
>>> + *
>>> + * V_RCP_F32(x) = 1 / x
>>> + * V_RCP_LEGACY_F32(x) = (x != +-0) ? V_RCP_F32(x) : 0.
>>> + */
>>> +static void si_llvm_emit_rcp_legacy(const struct lp_build_tgsi_action * action,
>>> +                                 struct lp_build_tgsi_context * bld_base,
>>> +                                 struct lp_build_emit_data * emit_data)
>>> +{
>>> +     LLVMValueRef cmp =
>>> +             lp_build_cmp(&bld_base->base,
>>> +                          PIPE_FUNC_NOTEQUAL,
>>> +                          emit_data->args[0],
>>> +                          bld_base->base.zero);
>>> +
>>> +     LLVMValueRef div =
>>> +             lp_build_emit_llvm_binary(bld_base, TGSI_OPCODE_DIV,
>>> +                                       bld_base->base.one,
>>> +                                       emit_data->args[0]);
>>> +
>>> +     emit_data->output[emit_data->chan] =
>>> +             lp_build_select(&bld_base->base, cmp, div, bld_base->base.zero);
>>> +}
>>> +
>>>   int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
>>>   {
>>>        struct si_shader_selector *sel = shader->selector;
>>> @@ -2798,6 +2824,7 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
>>>                bld_base->op_actions[TGSI_OPCODE_MIN].emit = build_tgsi_intrinsic_nomem;
>>>                bld_base->op_actions[TGSI_OPCODE_MIN].intr_name = "llvm.minnum.f32";
>>>        }
>>> +     bld_base->op_actions[TGSI_OPCODE_RCP].emit = si_llvm_emit_rcp_legacy;
>>>
>>>        si_shader_ctx.radeon_bld.load_system_value = declare_system_value;
>>>        si_shader_ctx.tokens = sel->tokens;
>>>
>>