<div dir="ltr"><div>I've sent a nit or two but the first 11 patches are</div><div><br></div><div>Reviewed-by: Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>></div><div><br></div><div>I'd be very happy for you to push them ahead of the rest of the series so they don't end up in a v3 unless someone else requests significant changes.<br></div></div><br><div class="gmail_quote"><div dir="ltr">On Wed, Dec 19, 2018 at 5:51 AM Iago Toral Quiroga <<a href="mailto:itoral@igalia.com">itoral@igalia.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">The 16-bit polynomial execution doesn't meet Khronos precision requirements.<br>
Also, the half-float denorm range starts at 2^(-14) and with asin taking input<br>
values in the range [0, 1], polynomial approximations can lead to flushing<br>
relatively easy.<br>
<br>
An alternative is to use the atan2 formula to compute asin, which is the<br>
reference taken by Khronos to determine precision requirements, but that<br>
ends up generating too many additional instructions when compared to the<br>
polynomial approximation. Specifically, for the Intel case, doing this<br>
adds +41 instructions to the program for each asin/acos call, which looks<br>
like an undesirable trade off.<br>
<br>
So for now we take the easy way out and fallback to using the 32-bit<br>
polynomial approximation, which is better (faster) than the 16-bit atan2<br>
implementation and gives us better precision that matches Khronos<br>
requirements.<br>
<br>
v2:<br>
- Fallback to 32-bit using recursion (Jason).<br>
---<br>
src/compiler/spirv/vtn_glsl450.c | 14 ++++++++++++++<br>
1 file changed, 14 insertions(+)<br>
<br>
diff --git a/src/compiler/spirv/vtn_glsl450.c b/src/compiler/spirv/vtn_glsl450.c<br>
index 2540331b6cc..0a641077513 100644<br>
--- a/src/compiler/spirv/vtn_glsl450.c<br>
+++ b/src/compiler/spirv/vtn_glsl450.c<br>
@@ -202,6 +202,20 @@ build_log(nir_builder *b, nir_ssa_def *x)<br>
static nir_ssa_def *<br>
build_asin(nir_builder *b, nir_ssa_def *x, float p0, float p1)<br>
{<br>
+ if (x->bit_size == 16) {<br>
+ /* The polynomial approximation isn't precise enough to meet half-float<br>
+ * precision requirements. Alternatively, we could implement this using<br>
+ * the formula:<br>
+ *<br>
+ * asin(x) = atan2(x, sqrt(1 - x*x))<br>
+ *<br>
+ * But that is very expensive, so instead we just do the polynomial<br>
+ * approximation in 32-bit math and then we convert the result back to<br>
+ * 16-bit.<br>
+ */<br>
+ return nir_f2f16(b, build_asin(b, nir_f2f32(b, x), p0, p1));<br>
+ }<br>
+<br>
nir_ssa_def *one = nir_imm_floatN_t(b, 1.0f, x->bit_size);<br>
nir_ssa_def *abs_x = nir_fabs(b, x);<br>
<br>
-- <br>
2.17.1<br>
<br>
_______________________________________________<br>
mesa-dev mailing list<br>
<a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freedesktop.org</a><br>
<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br>
</blockquote></div>