<div dir="ltr">On 21 October 2013 17:48, Eric Anholt <span dir="ltr"><<a href="mailto:eric@anholt.net" target="_blank">eric@anholt.net</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> Things blew up when I enabled the debug register spill code without<br> disabling 16-wide, so I decided to just fix 16-wide spilling.<br> <br> We still don't generate 16-wide when register spilling happens as part of<br> allocation (since we expect it to be slower), but now we can experiment<br> with allowing it in some cases in the future.<br></blockquote><div><br></div><div>Also, we'll need the ability to spill in 16-wide when we get to doing compute shaders, because there are certain conditions in compute shaders where SIMD8 isn't an option (basically, if a compute shader uses barriers or shared variables, all the threads within a thread group need to run simultaneously on one subslice; with the maximum possible size thread group (1024), that's only possible if we use SIMD16. See Graphics BSpec: 3D-Media-GPGPU Engine > Media GPGPU Pipeline > Media GPGPU Pipeline > Programming the GPGPU Pipeline IVB+ > GPGPU Thread Limits [DevHSW+]).<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> ---<br> src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 8 ++++----<br> src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 15 ++++++++-------<br> 2 files changed, 12 insertions(+), 11 deletions(-)<br> <br> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp<br> index fa15f7b..6c8fb76 100644<br> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp<br> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp<br> @@ -718,8 +718,8 @@ fs_generator::generate_spill(fs_inst *inst, struct brw_reg src)<br> brw_MOV(p,<br> retype(brw_message_reg(inst->base_mrf + 1), BRW_REGISTER_TYPE_UD),<br> retype(src, BRW_REGISTER_TYPE_UD));<br> - brw_oword_block_write_scratch(p, brw_message_reg(inst->base_mrf), 1,<br> - inst->offset);<br> + brw_oword_block_write_scratch(p, brw_message_reg(inst->base_mrf),<br> + inst->mlen, inst->offset);<br> }<br> <br> void<br> @@ -727,8 +727,8 @@ fs_generator::generate_unspill(fs_inst *inst, struct brw_reg dst)<br> {<br> assert(inst->mlen != 0);<br> <br> - brw_oword_block_read_scratch(p, dst, brw_message_reg(inst->base_mrf), 1,<br> - inst->offset);<br> + brw_oword_block_read_scratch(p, dst, brw_message_reg(inst->base_mrf),<br> + dispatch_width / 8, inst->offset);<br> }<br> <br> void<br> diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp<br> index 7826cd4..ed0ce0d 100644<br> --- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp<br> +++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp<br> @@ -540,7 +540,7 @@ fs_visitor::emit_unspill(fs_inst *inst, fs_reg dst, uint32_t spill_offset,<br> inst->insert_before(unspill_inst);<br> <br> dst.reg_offset++;<br></blockquote><div><br></div>I had to do a lot of digging to reassure myself that this didn't need to change to "dst.reg_offset += dispatch_width / 8;". Would you mind clarifying the comment above the declaration of reg_offset in brw_fs.h? Currently it says:<br> <br> /**<br> * For virtual registers, this is a hardware register offset from<br> * the start of the register block (for example, a constant index<br> * in an array access).<br> */<br><br></div><div class="gmail_quote"> which seems to imply that it's measured in increments of a single 256-bit register. But it appears that for virtual registers, it's actually in increments of (dispatch_width/8) 256-bit registers.<br></div><div class="gmail_quote"> <div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> - spill_offset += REG_SIZE;<br> + spill_offset += dispatch_width * sizeof(float);<br> }<br> }<br> <br> @@ -624,10 +624,11 @@ fs_visitor::choose_spill_reg(struct ra_graph *g)<br> void<br> fs_visitor::spill_reg(int spill_reg)<br> {<br> + int reg_size = dispatch_width * sizeof(float);<br> int size = virtual_grf_sizes[spill_reg];<br> unsigned int spill_offset = c->last_scratch;<br> assert(ALIGN(spill_offset, 16) == spill_offset); /* oword read/write req. */<br> - c->last_scratch += size * REG_SIZE;<br> + c->last_scratch += size * reg_size;<br> <br> /* Generate spill/unspill instructions for the objects being<br> * spilled. Right now, we spill or unspill the whole thing to a<br> @@ -646,7 +647,7 @@ fs_visitor::spill_reg(int spill_reg)<br> inst->src[i].reg_offset = 0;<br> <br> emit_unspill(inst, inst->src[i],<br> - spill_offset + REG_SIZE * inst->src[i].reg_offset,<br> + spill_offset + reg_size * inst->src[i].reg_offset,<br> regs_read);<br> }<br> }<br> @@ -654,7 +655,7 @@ fs_visitor::spill_reg(int spill_reg)<br> if (inst->dst.file == GRF &&<br> inst->dst.reg == spill_reg) {<br> int subset_spill_offset = (spill_offset +<br> - REG_SIZE * inst->dst.reg_offset);<br> + reg_size * inst->dst.reg_offset);<br> inst->dst.reg = virtual_grf_alloc(inst->regs_written);<br> inst->dst.reg_offset = 0;<br> <br> @@ -677,11 +678,11 @@ fs_visitor::spill_reg(int spill_reg)<br> fs_inst *spill_inst = new(mem_ctx) fs_inst(FS_OPCODE_SPILL,<br> reg_null_f, spill_src);<br> spill_src.reg_offset++;<br> - spill_inst->offset = subset_spill_offset + chan * REG_SIZE;<br> + spill_inst->offset = subset_spill_offset + chan * reg_size;<br> spill_inst->ir = inst->ir;<br> spill_inst->annotation = inst->annotation;<br> - spill_inst->base_mrf = 14;<br> - spill_inst->mlen = 2; /* header, value */<br> + spill_inst->mlen = 1 + dispatch_width / 8; /* header, value */<br> + spill_inst->base_mrf = 16 - spill_inst->mlen;<br></blockquote><div><br></div><div>This means that for SIMD16 spills, we'll use MRFs 13-15 (previously we used MRFs 14-15). Do we know for sure that MRF 13 isn't used by any non-spill-related code? If so, would you mind adding a quick comment to clarify that?<br> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> inst->insert_after(spill_inst);<br> }<br> }<br> <span><font color="#888888">--<br> 1.8.4.rc3<br></font></span></blockquote><div><br></div><div>With those comments added, the patch is:<br><br>Reviewed-by: Paul Berry <<a href="mailto:stereotype441@gmail.com">stereotype441@gmail.com</a>><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span><font color="#888888"> <br> _______________________________________________<br> mesa-dev mailing list<br> <a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freedesktop.org</a><br> <a href="http://lists.freedesktop.org/mailman/listinfo/mesa-dev" target="_blank">http://lists.freedesktop.org/mailman/listinfo/mesa-dev</a><br> </font></span></blockquote></div><br></div></div>