[Bug 89597] Implement SSBOs in GLSL front-end and i965

Thu Apr 16 09:19:06 PDT 2015

https://bugs.freedesktop.org/show_bug.cgi?id=89597

--- Comment #14 from Jason Ekstrand <jason at jlekstrand.net> ---
(In reply to Iago Toral from comment #13)
> Jason, scattered writes did fix the problem, thanks!
> 
> I noticed an unexpected behavior though, according to the PRM, the scattered
> write message is supposed to write 8 DWords at 8 offsets (for a block size
> of 8), however, for me it only writes 4. It completely ignores offsets
> stored in M1.4:M1.7 and data stored in M2.4:M2.7 of the message payload.

I'm not sure if this is your problem, but something that took me by surprise
about the scattered read/write messages is that they don't do what you might
first expect.  The 8 dwords are written to the 8 different offsets provided. 
This means that, if all 8 offsets are the same, one of those 8 values will end
up there and the other 7 won't get written at all.  If you want to use it (as I
did to spill an entire register), you have to give it 8 different offsets.  I
did this using an add with a vector int immediate:

http://cgit.freedesktop.org/~jekstrand/mesa/tree/src/mesa/drivers/dri/i965/brw_fs.cpp?h=wip/fs-indirects-v0.5#n1740
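
To make the semantics concrete, here is a minimal software model of a dword
scattered write. It is only a sketch, not the driver code behind the link
above: the function names are hypothetical, offsets are treated as dword
indices into a plain array, and the "highest channel wins" result for
colliding offsets is an assumption (the hardware only guarantees that one of
the values lands).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a dword scattered write: channel ch stores data[ch]
// at dword index offsets[ch]. If several channels share an offset, only one
// value survives; this model lets the highest channel win (an assumption).
void scattered_write(std::vector<uint32_t> &mem,
                     const std::array<uint32_t, 8> &offsets,
                     const std::array<uint32_t, 8> &data)
{
    for (std::size_t ch = 0; ch < 8; ch++) {
        assert(offsets[ch] < mem.size());
        mem[offsets[ch]] = data[ch];
    }
}

// Build 8 distinct offsets from a single base, mimicking an ADD of the base
// with the vector immediate <0, 1, 2, 3, 4, 5, 6, 7>.
std::array<uint32_t, 8> consecutive_offsets(uint32_t base)
{
    std::array<uint32_t, 8> offsets{};
    for (uint32_t ch = 0; ch < 8; ch++)
        offsets[ch] = base + ch;
    return offsets;
}
```

Calling scattered_write() with eight identical offsets touches a single
dword, while passing consecutive_offsets(base) stores the whole register
contiguously, which is what a spill needs.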

For SSBOs, however, scattered read/write should be exactly what you want
because you get an offset per SIMD channel and you just have to put the
data there.  The user is responsible for making sure that data from different
fragments or vertices end up in different locations.
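
As a rough sketch of that per-channel case: each enabled channel stores its
own value at its own offset. The execution-mask handling follows the PRM
behaviour quoted from comment #13 (the low 8 bits select which channels are
written); the function and parameter names are made up for illustration, and
offsets are again modelled as dword indices.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a SIMD8 scattered write into an SSBO. Bit ch of
// exec_mask enables channel ch; disabled channels write nothing.
void masked_scattered_write(std::vector<uint32_t> &ssbo,
                            const std::array<uint32_t, 8> &offsets,
                            const std::array<uint32_t, 8> &data,
                            uint8_t exec_mask)
{
    for (std::size_t ch = 0; ch < 8; ch++) {
        if (!(exec_mask & (1u << ch)))
            continue;                      // channel disabled: skip the write
        assert(offsets[ch] < ssbo.size());
        ssbo[offsets[ch]] = data[ch];
    }
}
```

With exec_mask set to 0x0F, for instance, only channels 0..3 are written and
the offsets and data for channels 4..7 are ignored in this model.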

> This issue actually works great for me here because a vector type is at most
> 4 elements so we want to write 4 DWords tops with each message, but I wonder
> why this is happening and if it is safe to assume that it is going to
> write 4 Dwords always. The PRM says that the hardware uses the 8 lower bits
> of the execution mask to select which of the 8 channels are effectively
> written, so I wonder if that could be affecting here or if this issue might
> be related to something else.
> 
> Any thoughts?

I'm confused.  Are you trying to use dword scattered read/write for vec4?  In
SIMD8, you only write one float at a time anyway.  Unless, of course, I'm
massively misunderstanding SSBOs.  For vec4, I think you want the

> This is important because if I can't be sure that only 4 Dwords are going to
> be written then I need to disable the writes from offsets M1.4:M1.7. Ideally
> I would do this by altering the execution mask for the SEND instruction so
> that it only considers the channels we want to write. Is this possible?
> I have not found any examples in the driver where this is done.
> 
> Alternatively, I could replicate the writes from offsets 0..3 into 4..7 (the
> PRM says that the hardware optimizes writes to the same offset so this may
> not be that bad).

If you're trying to use scattered read/write in vec4, then you may be running
into execution mask issues.  I don't know how the execution mask in 4x2 is laid
out but scattered read/write is usually a SIMD8 message.  It can be used in 4x2
mode but you'll have to monkey with the writemask yourself.  I'm not sure how
you do that.  Ken would know.
