<div dir="ltr"><div>I've sent a nit or two but the first 11 patches are</div><div><br></div><div>Reviewed-by: Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>></div><div><br></div><div>I'd be very happy for you to push them ahead of the rest of the series so they don't end up in a v3 unless someone else requests significant changes.<br></div></div><br><div class="gmail_quote"><div dir="ltr">On Wed, Dec 19, 2018 at 5:51 AM Iago Toral Quiroga <<a href="mailto:itoral@igalia.com">itoral@igalia.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">The 16-bit polynomial execution doesn't meet Khronos precision requirements.<br> Also, the half-float denorm range starts at 2^(-14) and with asin taking input<br> values in the range [0, 1], polynomial approximations can lead to flushing<br> relatively easy.<br> <br> An alternative is to use the atan2 formula to compute asin, which is the<br> reference taken by Khronos to determine precision requirements, but that<br> ends up generating too many additional instructions when compared to the<br> polynomial approximation. Specifically, for the Intel case, doing this<br> adds +41 instructions to the program for each asin/acos call, which looks<br> like an undesirable trade off.<br> <br> So for now we take the easy way out and fallback to using the 32-bit<br> polynomial approximation, which is better (faster) than the 16-bit atan2<br> implementation and gives us better precision that matches Khronos<br> requirements.<br> <br> v2:<br> - Fallback to 32-bit using recursion (Jason).<br> ---<br> src/compiler/spirv/vtn_glsl450.c | 14 ++++++++++++++<br> 1 file changed, 14 insertions(+)<br> <br> diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c<br> index 2540331b6cc..0a641077513 100644<br> --- a/src/compiler/spirv/vtn_glsl450.c<br> +++ b/src/compiler/spirv/vtn_glsl450.c<br> @@ -202,6 +202,20 @@ build_log(nir_builder *b, nir_ssa_def *x)<br> static nir_ssa_def *<br> build_asin(nir_builder *b, nir_ssa_def *x, float p0, float p1)<br> {<br> + if (x->bit_size == 16) {<br> + /* The polynomial approximation isn't precise enough to meet half-float<br> + * precision requirements. Alternatively, we could implement this using<br> + * the formula:<br> + *<br> + * asin(x) = atan2(x, sqrt(1 - x*x))<br> + *<br> + * But that is very expensive, so instead we just do the polynomial<br> + * approximation in 32-bit math and then we convert the result back to<br> + * 16-bit.<br> + */<br> + return nir_f2f16(b, build_asin(b, nir_f2f32(b, x), p0, p1));<br> + }<br> +<br> nir_ssa_def *one = nir_imm_floatN_t(b, 1.0f, x->bit_size);<br> nir_ssa_def *abs_x = nir_fabs(b, x);<br> <br> -- <br> 2.17.1<br> <br> _______________________________________________<br> mesa-dev mailing list<br> <a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freedesktop.org</a><br> <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br> </blockquote></div>