[Bug 89597] Implement SSBOs in GLSL front-end and i965
bugzilla-daemon at freedesktop.org
Fri Apr 17 00:56:59 PDT 2015
https://bugs.freedesktop.org/show_bug.cgi?id=89597
--- Comment #17 from Iago Toral <itoral at igalia.com> ---
(In reply to Jason Ekstrand from comment #16)
> (In reply to Iago Toral from comment #15)
> > (In reply to Jason Ekstrand from comment #14)
> > > I'm confused. Are you trying to use dword scattered read/write for vec4?
> > > In SIMD8, you only write one float at a time anyway. Unless, of course, I'm
> > > massively misunderstanding SSBO's. For vec4, I think you want the
> >
> > Nope, this is SIMD8/16, haven't tried to use this with vec4. The thing is,
> > imagine that I have a vector type at the IR level with element count > 1.
> > Initially I would loop through the elements and write each one individually
> > by passing offset(value_reg, i) as src to the write message, but then I
> > noticed that I could use the same message to write all the elements in the
> > vector (up to 4) in one go if I provided 4 different offsets to the
> > scattered message and prepared the message payload with the 4 floats to
> > write at each offset. That is, I do something like this in the visitor:
> >
> > /* Prepare scattered write message payload.
> >  * M1.0..M1.3: Dword offsets to be added to the global offset
> >  * M2.0..M2.3: Dword values
> >  */
> > int base_mrf = 1;
> > for (int i = 0; i < ir->val->type->vector_elements; i++) {
> >    int component_mask = 1 << i;
> >    if (ir->write_mask & component_mask) {
> >       fs_reg mrf = fs_reg(MRF, base_mrf + 1, BRW_REGISTER_TYPE_UD);
> >       mrf.subreg_offset += i * type_sz(mrf.type);
> >       emit(MOV(mrf, brw_imm_ud(i)));
> >
> >       mrf = fs_reg(MRF, base_mrf + 2, val_reg.type);
> >       mrf.subreg_offset += i * type_sz(mrf.type);
> >       emit(MOV(mrf, offset(val_reg, i)));
> >    }
> > }
> >
> > /* Set the writemask so we only write to the offsets we want */
> > struct brw_reg brw_dst =
> >    brw_set_writemask(brw_vec8_grf(0, 0), ir->write_mask);
> > fs_reg push_dst = fs_reg(brw_dst);
> > fs_inst *inst =
> >    new(mem_ctx) fs_inst(SHADER_OPCODE_SCATTERED_BUFFER_STORE, 8,
> >                         push_dst, surf_index, offset_reg);
> >
> > This seems to work well, and for vectors I end up only needing one message
> > to write all the channels I need to write. Now that I think about it, the
> > reason I only get 4 channels written at most is probably because
> > ir->write_mask can be 0xf at most. I imagine that in SIMD8 the writemask
> > would have to be 0xff to cover all 8 channels, unlike vec4.
>
> I think you are misunderstanding how these SIMD8/16 write messages work.
> I'll assume 8 in the following discussion but it all applies to 16.
>
> As the shader executes, it executes 8 pixels at a time. Each
> sub-register represents the same symbolic value in GLSL but for a different
> pixel. Suppose I have an SSBO declared as follows:
>
> buffer Block {
>     vec4 s[128];
> };
>
> And suppose I execute the line of code "s[i].xzw = foo;" where foo is some
> vec3. When the SIMD8 shader reaches this line, it stores 24 values in the
> SSBO: 3 per pixel for each of the 8 pixels. If the client doesn't want the
> values to stomp on each other, it is up to the client to ensure that i is
> different for each pixel.
>
> How does this work with the scattered read/write messages? They are
> designed for exactly a case like this. When you get to this statement, you
> will have one register that holds the value of i and three more for foo.
> Each of these registers has 8 sub-registers, one for each SIMD channel (or
> pixel). All you should have to do is build 3 messages, each of which uses
> i + some math for the address part and one component of foo for the payload
> part. Each scattered write writes 8 values, but they are the values from
> the different SIMD channels, not from different components of foo. The
> first one will write all 8 of the s[i].x, the next one all 8 of the s[i].z,
> etc.
>
> Does that make more sense?
It does, thanks for the detailed explanation! I'll revert the implementation to
what I had before then. I suppose what I have now works simply because all
pixels are writing the same value to the same offset...
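
To make the model above concrete, here is a minimal host-side sketch in plain C++. It is not Mesa code and does not use the i965 backend API: ScatteredWrite, build_messages, execute and count_distinct_offsets are hypothetical names invented for illustration. It assumes all 8 channels are enabled (no execution mask) and that vec4 s[128] has an array stride of 4 dwords. It builds one message per enabled component, where each message carries 8 per-channel offsets and 8 per-channel values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

constexpr int kSimdWidth = 8;        // SIMD8: one channel per pixel
constexpr uint32_t kVec4Dwords = 4;  // assumed stride of vec4 s[128], in dwords

using ChannelValues = std::array<float, kSimdWidth>;
using ChannelIndices = std::array<uint32_t, kSimdWidth>;

// Hypothetical model of one scattered dword write: every channel stores
// its own value at its own dword offset.
struct ScatteredWrite {
    std::array<uint32_t, kSimdWidth> dword_offset;
    ChannelValues value;
};

// Model of "s[i].<write_mask> = foo": one message per enabled component.
// index[ch] is the value of i in channel ch; foo[k][ch] is the k-th source
// component in channel ch.
std::vector<ScatteredWrite> build_messages(const ChannelIndices& index,
                                           const std::vector<ChannelValues>& foo,
                                           unsigned write_mask)
{
    std::vector<ScatteredWrite> msgs;
    std::size_t src = 0;  // next source component of foo to consume
    for (unsigned comp = 0; comp < 4; comp++) {
        if (!(write_mask & (1u << comp)))
            continue;
        assert(src < foo.size());
        ScatteredWrite msg{};
        for (int ch = 0; ch < kSimdWidth; ch++) {
            // Address part: "i + some math", computed per channel.
            msg.dword_offset[ch] = index[ch] * kVec4Dwords + comp;
            // Payload part: the same component of foo, one value per channel.
            msg.value[ch] = foo[src][ch];
        }
        msgs.push_back(msg);
        src++;
    }
    return msgs;
}

// Apply the messages to a buffer of dwords standing in for the SSBO.
void execute(const std::vector<ScatteredWrite>& msgs, std::vector<float>& ssbo)
{
    for (const ScatteredWrite& msg : msgs) {
        for (int ch = 0; ch < kSimdWidth; ch++) {
            assert(msg.dword_offset[ch] < ssbo.size());
            ssbo[msg.dword_offset[ch]] = msg.value[ch];
        }
    }
}

// Number of different dwords touched by a set of messages.
std::size_t count_distinct_offsets(const std::vector<ScatteredWrite>& msgs)
{
    std::set<uint32_t> seen;
    for (const ScatteredWrite& msg : msgs)
        seen.insert(msg.dword_offset.begin(), msg.dword_offset.end());
    return seen.size();
}
```

In this sketch, a write mask of 0xd (.xzw) with a different i in every channel yields 3 messages touching 24 different dwords. If every channel holds the same i, the same 3 messages collapse onto 3 dwords, which is the situation where channels overwrite each other.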
> > > If you're trying to use scattered read/write in vec4, then you may be
> > > running into execution mask issues. I don't know how the execution mask in
> > > 4x2 is laid out but scattered read/write is usually a SIMD8 message. It can
> > > be used in 4x2 mode but you'll have to monkey with the writemask yourself.
> > > I'm not sure how you do that. Ken would know.
> >
> > Haven't tried this for vec4 yet, but if I end up needing it there too I'll
> > ask Ken.
>
> For vec4, the oword messages are *probably* what you want, but I'm not sure
> how that plays with packing. That said, I think it's probably best to get
> this working for the FS backend as it's a good deal simpler there. It also
> allows us to enable it on BDW+ before you get it working in vec4.
Sure, will focus on that first. Thanks again Jason!