<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body>
<p>
<div>
<b><a class="bz_bug_link
bz_status_NEW "
title="NEW - Add FP64 support to the i965 shader backends"
href="https://bugs.freedesktop.org/show_bug.cgi?id=92760#c76">Comment # 76</a>
on <a class="bz_bug_link
bz_status_NEW "
title="NEW - Add FP64 support to the i965 shader backends"
href="https://bugs.freedesktop.org/show_bug.cgi?id=92760">bug 92760</a>
from <span class="vcard"><a class="email" href="mailto:itoral@igalia.com" title="Iago Toral &lt;itoral@igalia.com&gt;"> <span class="fn">Iago Toral</span></a>
</span></b>
<pre>(In reply to Francisco Jerez from <a href="show_bug.cgi?id=92760#c73">comment #73</a>)
<span class="quote">> (In reply to Jason Ekstrand from <a href="show_bug.cgi?id=92760#c72">comment #72</a>)
> > (In reply to Iago Toral from <a href="show_bug.cgi?id=92760#c66">comment #66</a>)
> > > Jason, Connor:
> > >
> > > last week Curro spent some time looking at our fp64 branch and testing some
> > > things and we have been discussing some aspects of the hardware in fp64 that
> > > are not all that well documented (or not even documented at all :)) and that
> > > may have some important implications in the implementation, specifically for
> > > the vec4 backend.
> > >
> > > Opinions?
> >
> > Short version:
> >
> > a) The hardware is busted.
> > b) I think Curro knows what he's talking about. :-)
> >
> > Longer version:
> >
> > I see a couple of options here: One is to just scalarize all double stuff.
> > On Ivy Bridge, I think you would also have to double up instructions, one
> > for each half. It's not a great option from the perspective of performance
> > but is perhaps the easiest to implement.
> >
> > The second option is what Curro's suggesting where you try and use the
> > hardware as much as possible and fall back to nasty things only when you
> > have to. Unfortunately, this is going to cause a lot of pain in the
> > generator because suddenly lots of stuff may become align1 at least on IVB.
> >
> Just a short comment on this point: I don't think IVB will be much worse.
> From the functional point of view it's not that much different from HSW+,
> its primary limitation is that Align16 FP64 instructions can't do more than
> one dvec4 at a time, but that's easily solvable by hooking up the SIMD width
> lowering pass (in addition to the swizzle and writemask lowering pass that
> could be used on later gens), because NibCtrl behaves as expected on FP64
> Align16 instructions even on IVB thankfully.</span>
I think there is still a problem with this: NibCtrl only works with DF
instructions, but we would still need to handle UD access to DF data.
Wouldn't that be broken in this case?</pre>
</div>
</p>
</body>
</html>