[Mesa-dev] [PATCH] st/mesa: change SQRT lowering to fix the game Risen

Ilia Mirkin imirkin at alum.mit.edu
Sun Jun 5 15:17:53 UTC 2016


On Mon, May 30, 2016 at 7:19 PM, Marek Olšák <maraeo at gmail.com> wrote:
> From: Marek Olšák <marek.olsak at amd.com>
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94627
> (against nouveau)
> ---
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> index aa443a5..0f5ee02 100644
> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
> @@ -1901,13 +1901,15 @@ glsl_to_tgsi_visitor::visit_expression(ir_expression* ir, st_src_reg *op)
>        if (have_sqrt) {
>           emit_scalar(ir, TGSI_OPCODE_SQRT, result_dst, op[0]);
>        } else {
> -         /* sqrt(x) = x * rsq(x). */
> -         emit_scalar(ir, TGSI_OPCODE_RSQ, result_dst, op[0]);
> -         emit_asm(ir, TGSI_OPCODE_MUL, result_dst, result_src, op[0]);
> -         /* For incoming channels <= 0, set the result to 0. */
> -         op[0].negate = ~op[0].negate;
> -         emit_asm(ir, TGSI_OPCODE_CMP, result_dst,
> -              op[0], result_src, st_src_reg_for_float(0.0));
> +         /* This is the only instruction sequence that makes the game "Risen"
> +          * render correctly. ABS is not required for the game, but since GLSL
> +          * declares negative values as "undefined", allowing us to do whatever
> +          * we want, I choose to use ABS to match DX9 and pre-GLSL RSQ
> +          * behavior.
> +          */
> +         emit_scalar(ir, TGSI_OPCODE_ABS, result_dst, op[0]);
> +         emit_scalar(ir, TGSI_OPCODE_RSQ, result_dst, result_src);
> +         emit_scalar(ir, TGSI_OPCODE_RCP, result_dst, result_src);

I spent a fair amount of time trying to come up with alternatives that
still had the desired behavior for 0 and Infinity, i.e. sqrt(0) = 0 and
sqrt(Inf) = Inf (since RCP is poor in terms of precision). At the end
of the day, it depends on what the hw does with RSQ(0) and RCP(0). For
nv50/nvc0, this sequence did have the desired behavior - perhaps that
holds for all hardware?
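
To make the 0 and Infinity cases concrete, here is a minimal sketch of
the two lowerings on the CPU. It assumes IEEE 754 float arithmetic
(1/0 = +Inf, 1/Inf = 0, Inf * 0 = NaN); real hardware RSQ/RCP may
differ, which is exactly the open question above. The helper names
(rsq, rcp, sqrt_old, sqrt_new, check_edge_cases) are hypothetical and
only model the TGSI opcodes in the patch; they are not Mesa functions.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical IEEE-style stand-ins for the TGSI scalar opcodes.
// Assumption: hardware RSQ/RCP follow these semantics; some may not.
static float rsq(float x) { return 1.0f / std::sqrt(x); }
static float rcp(float x) { return 1.0f / x; }

// Old lowering: sqrt(x) = x * rsq(x), followed by
// CMP dst, -x, result, 0.0  (dst = (-x < 0) ? result : 0.0).
static float sqrt_old(float x) {
    float r = x * rsq(x);
    return (-x < 0.0f) ? r : 0.0f;
}

// New lowering from the patch: ABS, then RSQ, then RCP.
static float sqrt_new(float x) {
    return rcp(rsq(std::fabs(x)));
}

// Edge cases discussed in the thread, under the IEEE assumption above.
static void check_edge_cases() {
    const float inf = std::numeric_limits<float>::infinity();

    // Zero: both sequences give 0 (the old one only thanks to CMP).
    assert(sqrt_old(0.0f) == 0.0f);
    assert(sqrt_new(0.0f) == 0.0f);   // rsq(0) = +Inf, rcp(+Inf) = 0

    // Infinity: old sequence computes Inf * 0 = NaN, new one gives Inf.
    assert(std::isnan(sqrt_old(inf)));
    assert(sqrt_new(inf) == inf);     // rsq(Inf) = 0, rcp(0) = +Inf

    // Negative input: ABS makes the new sequence return sqrt(|x|).
    assert(sqrt_new(-4.0f) == 2.0f);
}
```

Calling check_edge_cases() passes on a host with IEEE floats and
without fast-math style flags. Whether a given GPU matches depends on
its actual RSQ(0) and RCP(0) results.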

Either way, I'm fairly ambivalent about this - nv50/nvc0 now claim to
have SQRT support (and perform the lowering in the backend), and I
doubt much of anything would care on nv30. I'd just as soon leave the
abs out of it, but I don't strongly care.

Acked-by: Ilia Mirkin <imirkin at alum.mit.edu>

>        }
>        break;
>     case ir_unop_rsq:
> --
> 2.7.4
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

