<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body>
<p>
<div>
<b><a class="bz_bug_link
bz_status_NEW "
title="NEW - Add FP64 support to the i965 shader backends"
href="https://bugs.freedesktop.org/show_bug.cgi?id=92760#c61">Comment # 61</a>
on <a class="bz_bug_link
bz_status_NEW "
title="NEW - Add FP64 support to the i965 shader backends"
href="https://bugs.freedesktop.org/show_bug.cgi?id=92760">bug 92760</a>
from <span class="vcard"><a class="email" href="mailto:jason@jlekstrand.net" title="Jason Ekstrand <jason@jlekstrand.net>"> <span class="fn">Jason Ekstrand</span></a>
</span></b>
<pre>(In reply to Iago Toral from <a href="show_bug.cgi?id=92760#c59">comment #59</a>)
<span class="quote">> I think the fp64 is now mostly complete, in the last days we have been going
> through the patch set cleaning things up a bit in preparation for review,
> the result is in this branch:
>
> <a href="https://github.com/Igalia/mesa/tree/i965-fp64">https://github.com/Igalia/mesa/tree/i965-fp64</a>
>
> FYI, we have also expanded the fp64 piglit tests to cover things that we
> found weren't covered by existing tests. There is a branch with these here:
>
> <a href="https://github.com/Igalia/piglit/tree/arb_gpu_shader_fp64">https://github.com/Igalia/piglit/tree/arb_gpu_shader_fp64</a>
>
> It still needs a rebase and some clean up work, but we should be sending
> this for review soon too.
>
> As it is now, we pass all the fp64 tests, except for a few that fail because
> of spilling, that could probably take some optimization work to fix. We also
> have no regressions in non-double funcionality in any hardware generation
> that we have available (gen5 to gen9).</span >
Yes, spilling is important but I think we can start landing things before we
get that 100% fixed.
<span class="quote">> There are still a couple of things that we need to confirm so we can squash
> or remove a few patches in the series:
>
> 1. We need to make a decision about d2b. We discussed in the list that f2b
> does not need to handle denorms and neither does NVIDIA. On the other hand,
> NVIDIA does handle d2b with denorms (that is, it returns true for them).
> Since doubles are more about precission than performance this could make
> sense for us too. If we want this behavior I have a patch for this in the
> series, following a similar approach to the one I sento to the list for f2b,
> (I should get rid of the resolve_source_modifiers() for gen8 as Jason
> mentioned for the f2b case):
>
> <a href="https://github.com/Igalia/mesa/commit/">https://github.com/Igalia/mesa/commit/</a>
> df6c1d940be455d0e40d9eee2cb0e8987e4baa16</span >
I think the spec allows for a lot of wiggle room. With doubles, it may be
easier to implement with bitwise operations like you did in that f2b patch.
Apps shouldn't count on any particular behaviour there other than 0.0 -> false
and obviously non-zero -> true. Any app that counts on the behaviour of d2b on
denorms is going to be broken.
<span class="quote">> 2. We need to make a decision regarding the scalarization of the lowerings
> of trunc() and roundEven() in NIR to use if statements instead of bcsel. For
> scalar this seems like a possible win, but for vec4 it probably isn't since
> we lose the ability to do vector instructions. See my previous comment for
> more details.</span >
See above.
<span class="quote">> With those two questions answered we would have everything ready for review,
> the only important thing missing I can think of is spilling. As I mentioned
> above there are a few piglit tests that fail because of this, I think this
> might be a combination of the spilling code not being able to spill
> registers with a size > 1 (at least for vec4) and some needed optimization
> work, but I think it is probably a good idea to start the review while we
> look into this.
>
> With the two questions above pending, the series should be 190+ patches
> long, so it is pretty big. The composition looks more or less like this:
>
> 1) execsize/width fixes: 9 patches (the ones we sent in December to the list)
> 2) NIR: ~60 patches
> 3) i965: ~130 patches
>
> Notice that some of the i965 patches are _required_ after the NIR patches so
> that we don't break things in the driver because NIR introduces sized types
> that drivers need to handle, even if we don't care about fp64 yet.
>
> Also, as soon as we get sized types in NIR we are going to break drivers
> that don't support sized types (freedreno/vc4). I suppose we should ping
> Eric and Rob and point them at our branch so they can have some time to fix
> their drivers accordingly. We did fix some obvious things for them, but they
> should run piglit and probably fix some more stuff.</span >
That should happen sooner rather than later.
<span class="quote">> I suppose we could try to break the series into smaller chunks that make
> sense to review individually. Initially I was thinking that we might review
> the NIR stuff before the i965 bits... maybe that's a reasonable way to start
> with this, including the minimum i965 stuff required to not break non-double
> things.</span >
I think the thing to do there would be to break at least one chunk off.
Namely, the bare minimum NIR and i965 changes required to support the new sized
types. That will all have to be squashed into a single commit along with the
vc4 and freedreno changes. We should get that out ASAP so that Rob and Eric
have a chance to fix their drivers so we aren't waiting on them before we can
start landing the rest of it.
Beyond that, I don't really care how it's split up. The NIR pass for doing
double lowering can be it's own thing. I guess fs and vec4 could be separate.
I'll leave it up to you how you want to do that. It's the same set of patches
at the end of the day.</pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are the QA Contact for the bug.</li>
</ul>
</body>
</html>