[Mesa-dev] [Bug 100613] Regression in Mesa 17 on s390x (zSystems)

Thu May 4 01:58:31 UTC 2017

https://bugs.freedesktop.org/show_bug.cgi?id=100613

--- Comment #20 from Ray Strode <halfline at gmail.com> ---
Hi,

> - if i leave vector_justify to FALSE, but change attachment 130980 
> to use fetch_width instead of format_desc->block.bits then all the
> 3 component sshort tests and half float tests start working, but then
> r32g32b32_sscaled start failing.  So, the above patch fudges it to make
> everything in draw-vertices and draw-vertices-half-float work.
Just to expand on this...  If we comment out the 3x32bit vector fetch special
case alluded to in comment 10, and instead rely on a 96bit scalar fetch (just as
we rely on a 48bit scalar fetch for r16g16b16), then using fetch_width instead
of format_desc->block.bits in the vec_nr computation works for the 32bit case,
too.

So, Roland, your first intuition appears to be right, the scalar and vector
paths are different.  Probably attachment 131000 is wrong, or I mucked up the
testing, or something. But, of course, we should get the scalar case working
regardless...

For 3 r16g16b16 (well x16y16z16) vertices:

 short v3[] = {
   x1, y1, z1,
   x1, y2, z2,
   x2, y1, z3
 };

a scalar 48bit fetch leaves things like this:

packed[0] = [ pad x1 y1 z1 pad x1 y2 z2 ]
packed[1] = [ pad x2 y1 z3 pad x2 y1 z3 ]

(where pad means zeros from the zero-extend-to-64bits op)

which then get grouped into 32-bit quantities, merged and reordered like so:

dst[0] = [ [pad x1] [pad x1] [pad x2] [pad x2] ]
dst[1] = [ [ y1 z1] [ y2 z2] [ y1 z3] [ y1 z3] ]
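
As a rough illustration of the layout above (not the actual gallivm code), one
64-bit lane of the zero-extended fetch can be modelled with plain integer
arithmetic.  The function names are hypothetical, and the value is treated as a
big-endian register value, so the sketch does not depend on the host's byte
order:

```c
#include <stdint.h>

/* One 48-bit x16y16z16 element after zero-extension to 64 bits:
 * [ pad x y z ], with pad in the most significant 16 bits. */
uint64_t fetch48_zero_extend(uint16_t x, uint16_t y, uint16_t z)
{
   return ((uint64_t)x << 32) | ((uint64_t)y << 16) | (uint64_t)z;
}

/* Most significant 32-bit word of the lane: [ pad x ], which ends up in dst[0]. */
uint32_t high_word(uint64_t packed)
{
   return (uint32_t)(packed >> 32);
}

/* Least significant 32-bit word of the lane: [ y z ], which ends up in dst[1]. */
uint32_t low_word(uint64_t packed)
{
   return (uint32_t)(packed & 0xffffffffu);
}
```

With x = 0x1111, y = 0x2222, z = 0x3333 the high word is 0x00001111 and the low
word is 0x22223333, i.e. [pad x] and [y z].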

In this layout, vec_nr needs to be 0, then 1, then 1 to get the correct values,
which is achieved with

vec_nr = (fetch_width - (chan_desc.shift + chan_desc.size)) / type.width;

Of course, I think we want to unconditionally (on big endian) use: 

vec_nr = (format_desc->block.bits - (chan_desc.shift + chan_desc.size)) /
type.width;

instead, which leaves vec_nr as 0, then 0, then 1.  In order for that to work,
it requires dst[0] to look like:

dst[0] = [ [y1  x1]  [y2  x1]  [y1  x2]  [y1  x2] ]
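
The two vec_nr sequences quoted above can be checked with a small sketch.  The
struct and function names here are hypothetical stand-ins, and the big-endian
channel shifts of 32, 16 and 0 for x, y and z are an assumption on my part; they
are the values that reproduce both sequences:

```c
/* Hypothetical stand-in for the two channel description fields used here. */
struct chan {
   unsigned shift;  /* bit offset of the channel within the block */
   unsigned size;   /* channel width in bits */
};

/* width is either the fetch width (64) or format_desc->block.bits (48);
 * type_width is the width of one vector element (32). */
unsigned vec_nr_for(unsigned width, struct chan c, unsigned type_width)
{
   return (width - (c.shift + c.size)) / type_width;
}

/* ASSUMPTION: big-endian shifts for x, y, z in a 48-bit r16g16b16 block. */
const struct chan r16g16b16_be[3] = { { 32, 16 }, { 16, 16 }, { 0, 16 } };
```

Calling vec_nr_for(64, r16g16b16_be[i], 32) for i = 0, 1, 2 gives 0, 1, 1, while
vec_nr_for(48, r16g16b16_be[i], 32) gives 0, 0, 1.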

We almost get there by using vector_justify = TRUE, which leaves us with:

packed[0] = [ x1  y1  z1 pad  x1  y2  z2 pad ]
packed[1] = [ x2  y1  z3 pad  x2  y1  z3 pad ]

dst[0] = [ [x1  y1] [x1  y2] [x2  y1] [x2  y1] ]

So, when vector_justify is TRUE, we end up with almost what we want.  Just the
x and y coordinates are sitting in each 32 bit word in reverse. (I'm omitting
dst[1] for brevity, but those words need their halves swapped, too).  This is why
adding the u_format.csv big endian entry for the very odd looking swizzle
format yxz1 worked.  It's letting the chan swizzle at the end do the swapping
for us.  I don't think it's a good idea to rely on that though. So I guess in
the cases where we do a scalar fetch we need to justify and swap?
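
To make "justify and swap" concrete, here is a minimal sketch on one 64-bit
lane.  It is only a model of the data movement, not a proposed patch, and both
function names are hypothetical:

```c
#include <stdint.h>

/* One 48-bit element justified to the top of 64 bits: [ x y z pad ]. */
uint64_t fetch48_justified(uint16_t x, uint16_t y, uint16_t z)
{
   return ((uint64_t)x << 48) | ((uint64_t)y << 32) | ((uint64_t)z << 16);
}

/* Swap the two 16-bit halves of a 32-bit word: [ a b ] -> [ b a ]. */
uint32_t swap_halves(uint32_t w)
{
   return (w << 16) | (w >> 16);
}
```

The high word of the justified lane is [x y]; after swap_halves it becomes
[y x], which is the dst[0] layout that the block.bits based vec_nr expects.
The low word [z pad] likewise becomes [pad z].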

I guess it's somewhat analogous to say

short foo[2] = { 0x1111, 0x2222 };
int32_t bar = *(int32_t *) foo;

where you normally would need to swap on big endian if you want to keep 0x1111
first/least significant.
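
A self-contained version of that analogy, as a sketch: it uses memcpy instead
of the pointer cast to sidestep aliasing and alignment concerns, and all three
function names are made up for illustration:

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret two adjacent 16-bit values as one 32-bit integer,
 * in whatever byte order the host uses. */
uint32_t load_pair(uint16_t first, uint16_t second)
{
   const uint16_t pair[2] = { first, second };
   uint32_t v;
   memcpy(&v, pair, sizeof v);
   return v;
}

/* Returns 1 when the lowest-addressed byte of a 16-bit 1 is zero,
 * i.e. on a big-endian host. */
int host_is_big_endian(void)
{
   const uint16_t probe = 1;
   uint8_t first_byte;
   memcpy(&first_byte, &probe, 1);
   return first_byte == 0;
}

/* Same load, but with `first` always in the least significant half.
 * Only the big-endian host needs the half-word swap. */
uint32_t load_pair_first_lsb(uint16_t first, uint16_t second)
{
   uint32_t v = load_pair(first, second);
   return host_is_big_endian() ? (v << 16) | (v >> 16) : v;
}
```

load_pair(0x1111, 0x2222) yields 0x22221111 on a little-endian host and
0x11112222 on a big-endian one; load_pair_first_lsb gives 0x22221111 on both.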
