<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body>
<p>
<div>
<b><a class="bz_bug_link
bz_status_NEW "
title="NEW - Add FP64 support to the i965 shader backends"
href="https://bugs.freedesktop.org/show_bug.cgi?id=92760#c28">Comment # 28</a>
on <a class="bz_bug_link
bz_status_NEW "
title="NEW - Add FP64 support to the i965 shader backends"
href="https://bugs.freedesktop.org/show_bug.cgi?id=92760">bug 92760</a>
from <span class="vcard"><a class="email" href="mailto:siglesias@igalia.com" title="Samuel Iglesias <siglesias@igalia.com>"> <span class="fn">Samuel Iglesias</span></a>
</span></b>
<pre>(In reply to Jason Ekstrand from <a href="show_bug.cgi?id=92760#c27">comment #27</a>)
<span class="quote">> (In reply to Samuel Iglesias from <a href="show_bug.cgi?id=92760#c26">comment #26</a>)
> > I found an issue related to having interleaved uniform definitions of 32-bit
> > and 64-bit data types in the push constant buffer.
> >
> > The bug is easily shown when we have a double defined just after a 32-bit
> > data type. For example, we have following definition in a GLSL fragment
> > shader:
> >
> > uniform double arg0;
> > uniform bool arg1;
> > uniform double arg2;
> >
> > The generated code that copies those push constant values does the following
> > in SIMD16:
> >
> > mov(8) g19<1>DF g2<0,1,0>DF
> > mov(8) g23<1>DF g2<0,1,0>DF
> > mov(16) g9<1>D g2.2<0,1,0>D
> > mov(8) g5<1>DF g2.1<0,1,0>DF
> > mov(8) g7<1>DF g2.1<0,1,0>DF
> >
> > As you see, there is a misalignment in the memory access that copies 'arg2'
> > contents: we are copying the 32 bits of arg1 into the copy of arg2 (notice
> > that g2.1<0,1,0>DF is at the same offset than g2.2<0,1,0>D).
>
> This issue was anticipated. We came across it in theory if not in practice
> this summer while Connor was working on it.
>
> > My proposal is to do a 64-bit alignment when uploading push constant doubles
> > and when reading them from the push constant buffer. The 32-bit push
> > constants' upload and access would not be changed. So the generated code for
> > the same example would be like:
> >
> > mov(8) g19<1>DF g2<0,1,0>DF
> > mov(8) g23<1>DF g2<0,1,0>DF
> > mov(16) g9<1>D g2.2<0,1,0>D
> > mov(8) g5<1>DF g2.2<0,1,0>DF
> > mov(8) g7<1>DF g2.2<0,1,0>DF
> >
> > This solution has the drawback of adding padding inside push constant buffer
> > when we have a mixture of 32 bits and 64-bit data type constants, so it is
> > not memory efficient; plus take it into account to avoid exceeding the push
> > buffer size limitation. The advantage is that it does not add new
> > instructions in the generated code.
> >
> > Do you like the proposed solution? Or do you have other solution in mind?
>
> That seems like what we need to do. Unfortunately, executing it might be a
> bit interesting. The uniform packing code we have
> (assign_constant_locations) isn't aware of the base data type. However, you
> do have the type on the source, so you can probably get it. You may want to
> take a look at this series (which still needs review)
> <a href="http://patchwork.freedesktop.org/series/1669/">http://patchwork.freedesktop.org/series/1669/</a> It addresses some of the same
> problems you'll need to solve but for a different reason.
> </span >
OK, thanks for the tips. I will take a look at that patch series.
<span class="quote">> > BTW, I expect to have a similar problem when reading doubles from the pull
> > constant buffer contents but I have not checked it yet.
>
> No, that shouldn't be a problem. We will need to maybe emit two pulls for a
> whole dvec4, but that's about it. There should be no alignment problems.</span >
OK :-)
Thanks Jason!</pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are the QA Contact for the bug.</li>
</ul>
</body>
</html>