[Bug 92760] Add FP64 support to the i965 shader backends

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Mon Mar 14 15:05:51 UTC 2016


--- Comment #64 from Iago Toral <itoral at igalia.com> ---
Since we imagine that reviewing the gen8+ fp64 implementation is going to take
a while, we have already started implementing support for gen7. With a few
fixes to handle 64-bit immediates (unsupported by gen7 hardware) and to correct
some copy-propagation bugs triggered by this implementation, gen7 is
already looking pretty good, with only 92 failures out of ~2900 tests. However,
there is a major issue that we have bumped into: apparently gen7 does not like
writes with a stride > 1, which we need to do all the time in fp64.

The Haswell PRM (vol7, 3D Media GPGPU Engine, Register Region Restrictions),
says the following:

"When destination spans two registers, the source MUST span two registers."

This restriction is not present in the Broadwell PRMs. Unfortunately, it looks like
changing things to obey this restriction does not fix anything. We bumped into
this while implementing support for 64-bit immediates. Our initial
implementation would do something like this (pseudo-code):

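fs_reg setup_imm_df(double v) {
   /* Destination: a double vgrf, accessed as 32-bit unsigned */
   vgrf<double> tmp;
   tmp = retype(tmp, unsigned);

   /* Temporaries holding the low and high 32 bits of the immediate */
   vgrf<double> v_low, v_high;
   v_low = retype(v_low, unsigned);
   v_high = retype(v_high, unsigned);

   mov(v_low, brw_imm_ud(low32(v)));
   mov(v_high, brw_imm_ud(high32(v)));

   /* Interleave: low dwords to even slots, high dwords to odd slots */
   mov(stride(tmp, 2), stride(v_low, 2));
   mov(stride(horiz_offset(tmp, 1), 2), stride(v_high, 2));

   return retype(tmp, double);
}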

That implementation respects the HSW restriction for writes that span 2
registers. However, I found that the second SIMD register (reg_offset 1) of tmp
is never written. Of course, we don't need to do this for immediates: we can
just return a stride 0 region for tmp and that works fine. But this is going to
be a problem everywhere else.
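
For reference, the layout that the two strided moves are expected to produce
can be sketched on the CPU. This is only an illustration, not driver code: the
helper names (split_double, regs_spanned, emulate_setup_imm_df) are
hypothetical, and it assumes SIMD8 execution, 32-byte GRFs and a little-endian
host.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Assumptions for this sketch: SIMD8 execution and 32-byte GRFs, so
// 8 doubles occupy two registers (16 dwords in total).
constexpr int kExecSize = 8;
constexpr int kGrfBytes = 32;

// Split a double into its low and high 32-bit halves.
static void split_double(double v, uint32_t &lo, uint32_t &hi) {
    uint64_t bits;
    std::memcpy(&bits, &v, sizeof bits);
    lo = static_cast<uint32_t>(bits);
    hi = static_cast<uint32_t>(bits >> 32);
}

// Number of GRFs touched by a region of kExecSize elements of type_size
// bytes each, starting byte_offset bytes into a register, with the given
// element stride.
static int regs_spanned(int byte_offset, int type_size, int stride) {
    int last_byte =
        byte_offset + ((kExecSize - 1) * stride + 1) * type_size - 1;
    return last_byte / kGrfBytes - byte_offset / kGrfBytes + 1;
}

// CPU emulation of the two strided MOVs from the pseudo-code: low dwords
// land in the even slots of tmp, high dwords in the odd slots.
static std::array<uint32_t, 2 * kExecSize> emulate_setup_imm_df(double v) {
    uint32_t lo, hi;
    split_double(v, lo, hi);

    std::array<uint32_t, 2 * kExecSize> tmp{};
    for (int ch = 0; ch < kExecSize; ch++) {
        tmp[2 * ch + 0] = lo;  // mov(stride(tmp, 2), ...)
        tmp[2 * ch + 1] = hi;  // mov(stride(horiz_offset(tmp, 1), 2), ...)
    }
    return tmp;
}
```

Under these assumptions, regs_spanned(0, 4, 2) and regs_spanned(4, 4, 2) both
give 2, so destination and source regions each span two registers, and every
slot of the emulated tmp (including dwords 8..15, the second register) ends up
written. That is the result we expected from the hardware but did not get.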

I am not sure what to do about this, since I did not find anything useful in
the PRMs to explain what could be going on, other than that restriction, which
we are already obeying. Do you have any ideas? If not, would it be possible for
someone at Intel to run this through the simulator (I'd provide the aub trace
file) to check if that gives any clues?
