[Mesa-dev] [PATCH] radeonsi: implement legacy RCP behavior to fix Unreal engine demos
Jose Fonseca
jfonseca at vmware.com
Thu Dec 4 10:53:44 PST 2014
I'm also concerned that this sort of ad-hoc reinterpretation of opcode
semantics will come back to bite us later, as different state trackers might
want different semantics.
I think we might need to redefine the TGSI_OPCODE_RCP opcode or introduce a
TGSI_OPCODE_RCP_LEGACY opcode.
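
To make the distinction concrete, here is a rough C sketch of the semantics
the two opcodes would have to pin down, plus the clamped variant Marek
mentions below. The function names are made up for illustration and are not
Mesa or TGSI API. The legacy rule is the one from the patch comment
(1 / +-0 = 0); the clamp rule (infinite results become +-FLT_MAX) is only my
assumption about what a V_RCP_CLAMP_F32-like instruction does.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* IEEE 754 style reciprocal: 1/+0 = +inf, 1/-0 = -inf. */
static float rcp_ieee(float x)
{
    return 1.0f / x;
}

/* Legacy rule as stated in the patch comment: 1 / +-0 = 0. */
static float rcp_legacy(float x)
{
    /* -0.0f compares equal to 0.0f, so both zeros take the 0 branch. */
    return (x != 0.0f) ? 1.0f / x : 0.0f;
}

/* ASSUMED clamp rule: results that overflow to +-inf become +-FLT_MAX. */
static float rcp_clamp(float x)
{
    float r = 1.0f / x;

    if (r > FLT_MAX)
        return FLT_MAX;
    if (r < -FLT_MAX)
        return -FLT_MAX;
    return r;
}

/* Hypothetical usage: the three variants only differ for a zero input. */
static void rcp_examples(void)
{
    assert(isinf(rcp_ieee(0.0f)));
    assert(rcp_legacy(0.0f) == 0.0f);
    assert(rcp_clamp(0.0f) == FLT_MAX);

    assert(rcp_ieee(4.0f) == 0.25f);
    assert(rcp_legacy(4.0f) == 0.25f);
    assert(rcp_clamp(4.0f) == 0.25f);
}
```

A state tracker that wants IEEE behaviour and one that wants the legacy rule
cannot both be served by a single opcode whose meaning depends on the driver,
which is the reason for spelling the rule out in the opcode itself.
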
Also, do we know exactly what shape this division by zero takes in
the incoming GLSL shader? For example, could this be a 0/0 caused by
calling GLSL's normalize() on a zero-length vector, and if so, should this
special RCP be used exclusively in the lowering of that built-in function?
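
For reference, the 0/0 case is easy to reproduce on the CPU. This is only a
sketch: it assumes a division x / y gets lowered to x * rcp(y), and that each
component of a zero-length vector ends up as 0 divided by a length of 0. I
have not confirmed that this is what the shader in question does, and
div_lowered and the rcp helpers are hypothetical names, not Mesa code.

```c
#include <assert.h>
#include <math.h>

typedef float (*rcp_fn)(float);

/* IEEE 754 style reciprocal: 1/0 = inf. */
static float rcp_ieee(float x)
{
    return 1.0f / x;
}

/* Legacy rule from the patch comment: 1 / +-0 = 0. */
static float rcp_legacy(float x)
{
    return (x != 0.0f) ? 1.0f / x : 0.0f;
}

/* ASSUMED lowering: x / y is computed as x * rcp(y). */
static float div_lowered(float x, float y, rcp_fn rcp)
{
    return x * rcp(y);
}

/* Hypothetical usage: one component of a zero vector divided by its length. */
static void div_examples(void)
{
    /* 0 * inf is NaN, so the IEEE variant poisons the result. */
    assert(isnan(div_lowered(0.0f, 0.0f, rcp_ieee)));

    /* 0 * 0 is 0, so the legacy variant yields a zero vector. */
    assert(div_lowered(0.0f, 0.0f, rcp_legacy) == 0.0f);

    /* Ordinary divisions are unaffected by the choice of rcp. */
    assert(div_lowered(6.0f, 2.0f, rcp_ieee) == 3.0f);
    assert(div_lowered(6.0f, 2.0f, rcp_legacy) == 3.0f);
}
```

If the app really depends on the second result, then restricting the special
RCP to that one lowering would leave every other division with IEEE
behaviour.
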
Jose
On 04/12/14 17:52, Marek Olšák wrote:
> Returning FLT_MAX instead of 0 also works, which is similar to another
> hw instruction: V_RCP_CLAMP_F32.
>
> The Unreal engine is a pretty big target with a lot of apps out there.
> I'm afraid a driconf option isn't feasible.
>
> Marek
>
> On Thu, Dec 4, 2014 at 5:39 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>> Hmm, I have to say I'm not really convinced by that solution. Because all
>> divs are lowered, this will screw up the results of all divs (if the rcp
>> came from a legacy arb_fp rcp, that would be different, and quite
>> possibly some apps depend on it; problems like that are very common
>> for d3d9 apps too). But really, there's some expectation that stuff conforms
>> to IEEE 754 rules these days, and making divs by zero return 0 ain't so hot.
>> Maybe it's the div lowering itself which causes this, in which case it
>> should probably be disabled? Might be a good idea anyway (if the driver
>> supports native div), since rcp usually isn't accurate.
>> Difficult to tell, though, without seeing the GLSL and TGSI shader. But if
>> it really was the app expecting zero out of a div by zero (and I have
>> doubts about that), I'd certainly classify that as an app bug, and any
>> workaround should only be enabled by some driconf option.
>>
>> Roland
>>
>>
>>
>> On 04.12.2014 at 13:34, Marek Olšák wrote:
>>> From: Marek Olšák <marek.olsak at amd.com>
>>>
>>> Discussion: https://bugs.freedesktop.org/show_bug.cgi?id=83510#c8
>>> ---
>>> src/gallium/drivers/radeonsi/si_shader.c | 27 +++++++++++++++++++++++++++
>>> 1 file changed, 27 insertions(+)
>>>
>>> diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
>>> index 973bac2..e0799c9 100644
>>> --- a/src/gallium/drivers/radeonsi/si_shader.c
>>> +++ b/src/gallium/drivers/radeonsi/si_shader.c
>>> @@ -2744,6 +2744,32 @@ static int si_generate_gs_copy_shader(struct si_screen *sscreen,
>>> return r;
>>> }
>>>
>>> +/**
>>> + * This emulates V_RCP_LEGACY_F32, which has the following rule for division
>>> + * by zero: 1 / 0 = 0
>>> + *
>>> + * V_RCP_F32(x) = 1 / x
>>> + * V_RCP_LEGACY_F32(x) = (x != +-0) ? V_RCP_F32(x) : 0.
>>> + */
>>> +static void si_llvm_emit_rcp_legacy(const struct lp_build_tgsi_action * action,
>>> +				    struct lp_build_tgsi_context * bld_base,
>>> +				    struct lp_build_emit_data * emit_data)
>>> +{
>>> +	LLVMValueRef cmp =
>>> +		lp_build_cmp(&bld_base->base,
>>> +			     PIPE_FUNC_NOTEQUAL,
>>> +			     emit_data->args[0],
>>> +			     bld_base->base.zero);
>>> +
>>> +	LLVMValueRef div =
>>> +		lp_build_emit_llvm_binary(bld_base, TGSI_OPCODE_DIV,
>>> +					  bld_base->base.one,
>>> +					  emit_data->args[0]);
>>> +
>>> +	emit_data->output[emit_data->chan] =
>>> +		lp_build_select(&bld_base->base, cmp, div, bld_base->base.zero);
>>> +}
>>> +
>>> int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
>>> {
>>> struct si_shader_selector *sel = shader->selector;
>>> @@ -2798,6 +2824,7 @@ int si_shader_create(struct si_screen *sscreen, struct si_shader *shader)
>>> bld_base->op_actions[TGSI_OPCODE_MIN].emit = build_tgsi_intrinsic_nomem;
>>> bld_base->op_actions[TGSI_OPCODE_MIN].intr_name = "llvm.minnum.f32";
>>> }
>>> + bld_base->op_actions[TGSI_OPCODE_RCP].emit = si_llvm_emit_rcp_legacy;
>>>
>>> si_shader_ctx.radeon_bld.load_system_value = declare_system_value;
>>> si_shader_ctx.tokens = sel->tokens;
>>>
>>