<p dir="ltr"><br>
On Jul 29, 2015 3:12 AM, "Francisco Jerez" <<a href="mailto:currojerez@riseup.net">currojerez@riseup.net</a>> wrote:<br>
><br>
> Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> writes:<br>
><br>
> > On Tue, Jul 28, 2015 at 1:23 AM, Francisco Jerez <<a href="mailto:currojerez@riseup.net">currojerez@riseup.net</a>> wrote:<br>
> >> Instead of relying on the default one.  This shouldn't lead to any<br>
> >> functional changes because DEP_RESOLVE_MOV overrides the execution<br>
> >> controls of the instruction anyway.<br>
> ><br>
> > Actually, DEP_RESOLVE_MOV calls half() on the builder which doesn't<br>
> > make any sense.  I think the pre-builder intention, based on the<br>
> > comment, was force_uncompressed and to use a SIMD8 instruction so that<br>
> > it adds as few deps as possible.  I'm not sure what it's actually<br>
> > doing now.  In any case, saying that it overrides everything isn't<br>
> > right.<br>
> ><br>
> Pre-builder it was doing the same it's doing now -- Emit an 8-wide<br>
> instruction regardless of the dispatch width of the shader.<br>
> fs_builder::half(0) is an alias for ::group(8, 0) which simply selects<br>
> the first 8-channel group of the channel enables and will cause the<br>
> instruction to be uncompressed during code generation.<br>
><br>
> You're right that it doesn't override execution controls of the<br>
> instruction other than exec_size, but that's because it doesn't need to.<br>
> force_writemask_all and force_sechalf are fully irrelevant for what the<br>
> emitted MOV is intended to do: stall the pipeline until the instruction<br>
> that was writing the GRF provided as argument retires, which is going to<br>
> happen regardless of the EMask of the MOV instruction (even if it's zero<br>
> it should cause a stall because pre-IVB hardware didn't implement<br>
> shoot-down of instructions with zero EMask).</p>
<p dir="ltr">Then let's change the commit message to say that the execution controls don't matter.  With that, you can have my R-B on the last two.</p>
<p dir="ltr">Is Tue SIMD stuff unblocked now?<br>
--Jason</p>
<p dir="ltr">> >> ---<br>
> >>  src/mesa/drivers/dri/i965/brw_fs.cpp | 11 +++++++----<br>
> >>  1 file changed, 7 insertions(+), 4 deletions(-)<br>
> >><br>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp<br>
> >> index 71d372c..8bc9372 100644<br>
> >> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp<br>
> >> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp<br>
> >> @@ -2827,7 +2827,8 @@ fs_visitor::insert_gen4_pre_send_dependency_workarounds(bblock_t *block,<br>
> >>        if (block->start() == scan_inst) {<br>
> >>           for (int i = 0; i < write_len; i++) {<br>
> >>              if (needs_dep[i])<br>
> >> -               DEP_RESOLVE_MOV(<a href="http://bld.at">bld.at</a>(block, inst), first_write_grf + i);<br>
> >> +               DEP_RESOLVE_MOV(fs_builder(this, block, inst),<br>
> >> +                               first_write_grf + i);<br>
> >>           }<br>
> >>           return;<br>
> >>        }<br>
> >> @@ -2843,7 +2844,7 @@ fs_visitor::insert_gen4_pre_send_dependency_workarounds(bblock_t *block,<br>
> >>              if (reg >= first_write_grf &&<br>
> >>                  reg < first_write_grf + write_len &&<br>
> >>                  needs_dep[reg - first_write_grf]) {<br>
> >> -               DEP_RESOLVE_MOV(<a href="http://bld.at">bld.at</a>(block, inst), reg);<br>
> >> +               DEP_RESOLVE_MOV(fs_builder(this, block, inst), reg);<br>
> >>                 needs_dep[reg - first_write_grf] = false;<br>
> >>                 if (scan_inst->exec_size == 16)<br>
> >>                    needs_dep[reg - first_write_grf + 1] = false;<br>
> >> @@ -2890,7 +2891,8 @@ fs_visitor::insert_gen4_post_send_dependency_workarounds(bblock_t *block, fs_ins<br>
> >>        if (block->end() == scan_inst) {<br>
> >>           for (int i = 0; i < write_len; i++) {<br>
> >>              if (needs_dep[i])<br>
> >> -               DEP_RESOLVE_MOV(<a href="http://bld.at">bld.at</a>(block, scan_inst), first_write_grf + i);<br>
> >> +               DEP_RESOLVE_MOV(fs_builder(this, block, scan_inst),<br>
> >> +                               first_write_grf + i);<br>
> >>           }<br>
> >>           return;<br>
> >>        }<br>
> >> @@ -2905,7 +2907,8 @@ fs_visitor::insert_gen4_post_send_dependency_workarounds(bblock_t *block, fs_ins<br>
> >>            scan_inst->dst.reg >= first_write_grf &&<br>
> >>            scan_inst->dst.reg < first_write_grf + write_len &&<br>
> >>            needs_dep[scan_inst->dst.reg - first_write_grf]) {<br>
> >> -         DEP_RESOLVE_MOV(<a href="http://bld.at">bld.at</a>(block, scan_inst), scan_inst->dst.reg);<br>
> >> +         DEP_RESOLVE_MOV(fs_builder(this, block, scan_inst),<br>
> >> +                         scan_inst->dst.reg);<br>
> >>           needs_dep[scan_inst->dst.reg - first_write_grf] = false;<br>
> >>        }<br>
> >><br>
> >> --<br>
> >> 2.4.6<br>
> >><br>
</p>