<html> <head> <base href="https://bugs.freedesktop.org/" /> </head> <body> <div> <a class="bz_bug_link bz_status_NEW " title="NEW - Implement SSBOs in GLSL front-end and i965" href="https://bugs.freedesktop.org/show_bug.cgi?id=89597#c17">Comment # 17</a> on <a class="bz_bug_link bz_status_NEW " title="NEW - Implement SSBOs in GLSL front-end and i965" href="https://bugs.freedesktop.org/show_bug.cgi?id=89597">bug 89597</a> from <a class="email" href="mailto:itoral@igalia.com" title="Iago Toral <itoral@igalia.com>"> Iago Toral</a> <pre>(In reply to Jason Ekstrand from <a href="show_bug.cgi?id=89597#c16">comment #16</a>)
> (In reply to Iago Toral from <a href="show_bug.cgi?id=89597#c15">comment #15</a>)
> > (In reply to Jason Ekstrand from <a href="show_bug.cgi?id=89597#c14">comment #14</a>)
> > > I'm confused. Are you trying to use dword scattered read/write for vec4?
> > > In SIMD8, you only write one float at a time anyway. Unless, of course, I'm
> > > massively misunderstanding SSBO's. For vec4, I think you want the
> >
> > Nope, this is SIMD8/16, haven't tried to use this with vec4. The thing is,
> > imagine that I have a vector type at the IR element with element count > 1.
> > Initially I would loop through the elements and write each one individually
> > by passing offset(value_reg, i) as src to the write message, but then I
> > noticed that I could use the same message to write all the elements in the
> > vector (up to 4) in one go if I provided 4 different offsets to the
> > scattered message and prepared the message payload with the 4 floats to
> > write at each offset. That is, I do something like this in the visitor:
> >
> >    /* Prepare scattered write message payload.
> >     * M1.0..M1.3: Dword offsets to be added to the global offset
> >     * M2.0..M2.3: Dword values
> >     */
> >    int base_mrf = 1;
> >    for (int i = 0; i < ir->val->type->vector_elements; i++) {
> >       int component_mask = 1 << i;
> >       if (ir->write_mask & component_mask) {
> >          fs_reg mrf = fs_reg(MRF, base_mrf + 1, BRW_REGISTER_TYPE_UD);
> >          mrf.subreg_offset += i * type_sz(mrf.type);
> >          emit(MOV(mrf, brw_imm_ud(i)));
> >
> >          mrf = fs_reg(MRF, base_mrf + 2, val_reg.type);
> >          mrf.subreg_offset += i * type_sz(mrf.type);
> >          emit(MOV(mrf, offset(val_reg, i)));
> >       }
> >    }
> >
> >    /* Set the writemask so we only write to the offsets we want */
> >    struct brw_reg brw_dst =
> >       brw_set_writemask(brw_vec8_grf(0, 0), ir->write_mask);
> >    fs_reg push_dst = fs_reg(brw_dst);
> >    fs_inst *inst =
> >       new(mem_ctx) fs_inst(SHADER_OPCODE_SCATTERED_BUFFER_STORE, 8,
> >                            push_dst, surf_index, offset_reg);
> >
> > This seems to work well, and for vectors I end up only needing one message
> > to write all the channels I need to write. Now that I think about it, the
> > reason I only get 4 channels written at most is probably because
> > ir->write_mask can be 0xf at most, I imagine that in SIMD8 the writemask
> > would have to be 0xff to cover all 8 channels, unlike vec4.
>
> I think you are misunderstanding how these SIMD8/16 write messages work.
> I'll assume 8 in the following discussion but it all applies to 16.
>
> As the shader executes, it executes 8 pixels at a time. Each
> sub-register represents the same symbolic value in GLSL but for a different
> pixel. Suppose I have an SSBO declared as follows:
>
> buffer Block {
>    vec4 s[128];
> };
>
> And suppose I execute the line of code "s[i].xzw = foo;" where foo is some
> vec3. When the SIMD8 shader reaches this line, it stores 12 values in the
> SSBO; 3 per pixel. If the client doesn't want the values to stomp on each
> other, it is up to the client to ensure that i is different for each pixel.
>
> How does this work with the scattered read/write messages? They are
> designed for exactly a case like this. When you get to this statement, you
> will have one register that holds the value of i and three more for foo.
> Each of these registers has 8 sub-registers one for each SIMD channel (or
> pixel). All you should have to do is build 3 messages each one of which is
> i + some math for the address part and a component of foo for the payload
> part. Each scattered write writes 8 values but they are the different
> values from the different SIMD channels, not from different components of
> foo. The first one will write all 8 of the s[i].x, the next one s[i].z, etc.
>
> Does that make more sense?

It does, thanks for the detailed explanation! I'll revert the implementation to
what I had before then. I suppose what I have now works simply because all
pixels are writing the same value to the same offset...

> > > If you're trying to use scattered read/write in vec4, then you may be
> > > running into execution mask issues. I don't know how the execution mask in
> > > 4x2 is laid out but scattered read/write is usually a SIMD8 message. It can
> > > be used in 4x2 mode but you'll have to monkey with the writemask yourself.
> > > I'm not sure how you do that. Ken would know.
> >
> > Haven't tried this for vec4 yet, but if I end up needing it there too I'll
> > ask Ken.
>
> For vec4, the oword messages are *probably* what you want, but I'm not sure
> how that plays with packing. That said, I think it's probably best to get
> this working for the FS backend as it's a good deal simpler there. It also
> allows us to enable it on BDW+ before you get it working in vec4.

Sure, will focus on that first. Thanks again Jason!</pre> </div> </body> </html>