[Bug 92760] Add FP64 support to the i965 shader backends
bugzilla-daemon at freedesktop.org
Thu Jan 7 22:36:10 PST 2016
https://bugs.freedesktop.org/show_bug.cgi?id=92760
--- Comment #27 from Jason Ekstrand <jason at jlekstrand.net> ---
(In reply to Samuel Iglesias from comment #26)
> I found an issue related to having interleaved uniform definitions of 32-bit
> and 64-bit data types in the push constant buffer.
>
> The bug is easy to reproduce when a double is defined just after a 32-bit
> data type. For example, take the following definitions in a GLSL fragment
> shader:
>
> uniform double arg0;
> uniform bool arg1;
> uniform double arg2;
>
> The generated code that copies those push constant values does the following
> in SIMD16:
>
> mov(8) g19<1>DF g2<0,1,0>DF
> mov(8) g23<1>DF g2<0,1,0>DF
> mov(16) g9<1>D g2.2<0,1,0>D
> mov(8) g5<1>DF g2.1<0,1,0>DF
> mov(8) g7<1>DF g2.1<0,1,0>DF
>
> As you can see, the memory access that copies the contents of 'arg2' is
> misaligned: we are copying the 32 bits of arg1 into the copy of arg2 (notice
> that g2.1<0,1,0>DF is at the same offset as g2.2<0,1,0>D).
This issue was anticipated. We came across it in theory if not in practice
this summer while Connor was working on it.
> My proposal is to do a 64-bit alignment when uploading push constant doubles
> and when reading them from the push constant buffer. The 32-bit push
> constants' upload and access would not be changed. So the generated code for
> the same example would be like:
>
> mov(8) g19<1>DF g2<0,1,0>DF
> mov(8) g23<1>DF g2<0,1,0>DF
> mov(16) g9<1>D g2.2<0,1,0>D
> mov(8) g5<1>DF g2.2<0,1,0>DF
> mov(8) g7<1>DF g2.2<0,1,0>DF
>
> This solution has the drawback of adding padding inside the push constant
> buffer when we have a mixture of 32-bit and 64-bit constants, so it is not
> memory efficient; we also have to take the padding into account to avoid
> exceeding the push buffer size limit. The advantage is that it does not add
> new instructions to the generated code.
>
> Do you like the proposed solution? Or do you have another solution in mind?
That seems like what we need to do. Unfortunately, executing it might be a bit
interesting. The uniform packing code we have (assign_constant_locations)
isn't aware of the base data type. However, you do have the type on the
source, so you can probably get it. You may want to take a look at this series
(which still needs review): http://patchwork.freedesktop.org/series/1669/
It addresses some of the same problems you'll need to solve, but for a
different reason.
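
A minimal sketch of the alignment rule discussed above, not the actual
assign_constant_locations code: it packs uniforms into a byte buffer either
tightly or with each value aligned to its own size. The struct
uniform_slot and the functions align_up and pack_uniforms are hypothetical
names used only for illustration.

```c
#include <stddef.h>

/* Hypothetical description of one uniform in the push constant buffer. */
struct uniform_slot {
    const char *name;
    unsigned size;    /* base type size in bytes: 4 for bool/int/float, 8 for double */
    unsigned offset;  /* byte offset, filled in by pack_uniforms() */
};

/* Round value up to the next multiple of align (align must be a power of two). */
static unsigned align_up(unsigned value, unsigned align)
{
    return (value + align - 1) & ~(align - 1);
}

/*
 * Assign a byte offset to each uniform in declaration order.
 * If align_to_size is nonzero, each uniform starts at a multiple of its
 * own size, so doubles land on 8-byte boundaries (this inserts padding).
 * Returns the total number of bytes used.
 */
static unsigned pack_uniforms(struct uniform_slot *slots, size_t count,
                              int align_to_size)
{
    unsigned offset = 0;

    for (size_t i = 0; i < count; i++) {
        if (align_to_size)
            offset = align_up(offset, slots[i].size);
        slots[i].offset = offset;
        offset += slots[i].size;
    }
    return offset;
}
```

For the example in comment #26 (double arg0, bool arg1, double arg2), tight
packing places arg2 at byte 12, which is not a multiple of 8 and so has no
exact DF subregister; this appears to be why the dump shows g2.1<0,1,0>DF
(byte 8) aliasing arg1. With alignment, arg2 moves to byte 16, matching
g2.2<0,1,0>DF in the proposed code, at the cost of 4 bytes of padding.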
> BTW, I expect to have a similar problem when reading doubles from the pull
> constant buffer contents but I have not checked it yet.
No, that shouldn't be a problem. We may need to emit two pulls for a
whole dvec4, but that's about it. There should be no alignment problems.