[Bug 89597] Implement SSBOs in GLSL front-end and i965

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Thu May 7 08:43:48 PDT 2015


https://bugs.freedesktop.org/show_bug.cgi?id=89597

--- Comment #51 from Francisco Jerez <currojerez at riseup.net> ---
(In reply to Iago Toral from comment #50)
> Francisco, your implementation of brw_untyped_surface_write in brw_eu_emit.c
> fixes the mask to use for the dst, which in turn decides which channels are
> effectively written to memory.
> 
> For vec4 and gen7 (except haswell) it sets the writemask to WRITEMASK_X, but
> I think we want clients of this function to decide the writemask to use. For
> example, if we are writing a vec4, I want to use WRITEMASK_XYZW (even if we
> are using SIMD8 mode) to get all 4 components written directly. Fixing the
> writemask inside this function seems a bit restrictive.
> 

IVB didn't have a SIMD4x2 variant of the untyped surface write
message, so the SIMD8 one has to be used.  The writemask is
required because the dataport will reinterpret the execution mask
sent by the EU as if each bit mapped to a separate scalar
channel, just like is the case for a FS thread, so, if you set
the writemask to XYZW the dataport might end up writing *eight*
separate vec4s to memory, which is almost certainly not what you
want.

> Would you be okay with the addition of a dst parameter to this function? The
> clients (generator or visitor) would be responsible for setting the
> appropriate writemask on that dst depending on the situation.
> 

It shouldn't be necessary.  The untyped surface write message
already stores as much data as you need (2 vec4s) with the
writemask set to one component.  The problem is that the way it
expects vectors to be laid out in the payload is somewhat
annoying: each component of the argument needs to be passed in a
separate register, just like in SIMD8 mode, with the value for
the first and second channels stored in the first and fourth
32-bit elements of each register respectively.

To solve this problem I wrote some helper functions (emit_insert
and emit_extract) while implementing ARB_shader_image_load_store
which might be useful for you as starting point to convert
between the native vector layout of the EU and the vector layout
of a message payload.  See [1], but note that the branch is
somewhat outdated.

> Since we are using WRITEMASK_XYZW for all FS invocations I think in that

The writemask doesn't exist at all in Align1 instructions (which is
what the FS back-end uses).

> case we can have the generator set the dst parameter as 
> brw_writemask(brw_null_reg(), WRITEMASK_XYZW), but for vec4 the  visitor
> would do the same, but injecting the writemask it needs depending on the
> case, then the vec4 generator would just pass that dst to
> brw_untyped_surface_write. I was also thinking about passing the writemask
> as a separate parameter to the generator opcode, but it looks like generator
> opcodes can only take 3 src arguments and we are already using them, so
> passing it as part of the dst seems like the best way to go.

[1]
http://cgit.freedesktop.org/~currojerez/mesa/tree/src/mesa/drivers/dri/i965/brw_ir_surface_builder.h?h=image-load-store#n186

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/intel-3d-bugs/attachments/20150507/5a12c05d/attachment.html>


More information about the intel-3d-bugs mailing list