<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Sep 14, 2016 at 12:02 AM, Francisco Jerez <span dir="ltr"><<a href="mailto:currojerez@riseup.net" target="_blank">currojerez@riseup.net</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>> writes:<br>
<br>
> Just looking at the channel enables is not sufficient, at least not on Sky<br>
> Lake. Channels that are disabled by the sample_mask may show up in the<br>
> channel enable register as being enabled even if they are not executing.<br>
> This can cause FIND_LIVE_CHANNEL to return a channel that isn't actually<br>
> executing. In our handling of interpolateAtSample we do a clever trick<br>
> with emit_uniformize to call the interpolator once for each unique sample<br>
> id. Thanks to FIND_LIVE_CHANNEL returning a dead channel, we can get an<br>
> infinite loop which hangs the GPU.<br>
><br>
> Signed-off-by: Jason Ekstrand <<a href="mailto:jason@jlekstrand.net">jason@jlekstrand.net</a>><br>
<br>
</span>NAK, FIND_LIVE_CHANNEL returns channels from the EU execution mask by<br>
design (see the doxygen comment in brw_defines.h), which is necessary<br>
for the instruction to return a well-defined result when only helper<br>
invocations are enabled in the execution mask. Several users of the<br>
instruction are likely to be relying on this.<br></blockquote><div><br></div><div>Perhaps I need to be a bit more specific about what problem is being fixed here. It's not just that we need to take sample mask into account. It's actually a far more subtle problem. It can happen (Yes, I've seen this in practice) that a group of channels is disabled (i.e., doesn't execute, not just helper invocations) but the corresponding bits in ec0 are set to 1. Maybe this means that ec0 is still broken on Sky Lake. Maybe it just means that ec0 doesn't mean what it looks like it means. I'm not sure. What I do know is that FIND_LIVE_CHANNEL is returning a 100% dead channel.<br><br>I'm fine if this is the wrong fix. Maybe we need to just do the gen7 thing everywhere.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
And isn't interpolateAtSample supposed to give a well-defined result too<br>
when it's run from a helper invocation?<br></blockquote><div><br></div><div>Probably?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5">
> ---<br>
> src/mesa/drivers/dri/i965/brw_<wbr>eu.h | 3 ++-<br>
> src/mesa/drivers/dri/i965/brw_<wbr>eu_emit.c | 22 +++++++++++++++-------<br>
> src/mesa/drivers/dri/i965/brw_<wbr>fs_builder.h | 3 ++-<br>
> src/mesa/drivers/dri/i965/brw_<wbr>fs_generator.cpp | 2 +-<br>
> src/mesa/drivers/dri/i965/brw_<wbr>vec4_generator.cpp | 2 +-<br>
> 5 files changed, 21 insertions(+), 11 deletions(-)<br>
><br>
> diff --git a/src/mesa/drivers/dri/i965/<wbr>brw_eu.h b/src/mesa/drivers/dri/i965/<wbr>brw_eu.h<br>
> index 3e52764..9aaab78 100644<br>
> --- a/src/mesa/drivers/dri/i965/<wbr>brw_eu.h<br>
> +++ b/src/mesa/drivers/dri/i965/<wbr>brw_eu.h<br>
> @@ -488,7 +488,8 @@ brw_pixel_interpolator_query(<wbr>struct brw_codegen *p,<br>
><br>
> void<br>
> brw_find_live_channel(struct brw_codegen *p,<br>
> - struct brw_reg dst);<br>
> + struct brw_reg dst,<br>
> + struct brw_reg sample_mask);<br>
><br>
> void<br>
> brw_broadcast(struct brw_codegen *p,<br>
> diff --git a/src/mesa/drivers/dri/i965/<wbr>brw_eu_emit.c b/src/mesa/drivers/dri/i965/<wbr>brw_eu_emit.c<br>
> index 3b12030..f593a8d 100644<br>
> --- a/src/mesa/drivers/dri/i965/<wbr>brw_eu_emit.c<br>
> +++ b/src/mesa/drivers/dri/i965/<wbr>brw_eu_emit.c<br>
> @@ -3361,7 +3361,8 @@ brw_pixel_interpolator_query(<wbr>struct brw_codegen *p,<br>
> }<br>
><br>
> void<br>
> -brw_find_live_channel(struct brw_codegen *p, struct brw_reg dst)<br>
> +brw_find_live_channel(struct brw_codegen *p, struct brw_reg dst,<br>
> + struct brw_reg sample_mask)<br>
> {<br>
> const struct gen_device_info *devinfo = p->devinfo;<br>
> const unsigned exec_size = 1 << brw_inst_exec_size(devinfo, p->current);<br>
> @@ -3377,13 +3378,20 @@ brw_find_live_channel(struct brw_codegen *p, struct brw_reg dst)<br>
><br>
> if (devinfo->gen >= 8) {<br>
> /* Getting the first active channel index is easy on Gen8: Just find<br>
> - * the first bit set in the mask register. The same register exists<br>
> - * on HSW already but it reads back as all ones when the current<br>
> - * instruction has execution masking disabled, so it's kind of<br>
> - * useless.<br>
> + * the first bit set in the mask register AND the sample mask. The<br>
> + * same register exists on HSW already but it reads back as all ones<br>
> + * when the current instruction has execution masking disabled, so<br>
> + * it's kind of useless.<br>
> */<br>
> - inst = brw_FBL(p, vec1(dst),<br>
> - retype(brw_mask_reg(0), BRW_REGISTER_TYPE_UD));<br>
> + struct brw_reg mask_reg = retype(brw_mask_reg(0),<br>
> + BRW_REGISTER_TYPE_UD);<br>
> + if (sample_mask.file != BRW_IMMEDIATE_VALUE ||<br>
> + sample_mask.ud != 0xffffffff) {<br>
> + brw_AND(p, vec1(dst), mask_reg, sample_mask);<br>
> + mask_reg = vec1(dst);<br>
> + }<br>
> +<br>
> + inst = brw_FBL(p, vec1(dst), mask_reg);<br>
><br>
> /* Quarter control has the effect of magically shifting the value of<br>
> * this register so you'll get the first active channel relative to<br>
> diff --git a/src/mesa/drivers/dri/i965/<wbr>brw_fs_builder.h b/src/mesa/drivers/dri/i965/<wbr>brw_fs_builder.h<br>
> index 483672f..45b5f88 100644<br>
> --- a/src/mesa/drivers/dri/i965/<wbr>brw_fs_builder.h<br>
> +++ b/src/mesa/drivers/dri/i965/<wbr>brw_fs_builder.h<br>
> @@ -407,7 +407,8 @@ namespace brw {<br>
> const dst_reg chan_index = vgrf(BRW_REGISTER_TYPE_UD);<br>
> const dst_reg dst = vgrf(src.type);<br>
><br>
> - ubld.emit(SHADER_OPCODE_FIND_<wbr>LIVE_CHANNEL, chan_index);<br>
> + ubld.emit(SHADER_OPCODE_FIND_<wbr>LIVE_CHANNEL, chan_index,<br>
> + sample_mask_reg());<br>
> ubld.emit(SHADER_OPCODE_<wbr>BROADCAST, dst, src, component(chan_index, 0));<br>
><br>
> return src_reg(component(dst, 0));<br>
> diff --git a/src/mesa/drivers/dri/i965/<wbr>brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/<wbr>brw_fs_generator.cpp<br>
> index 2f4ba7b..d923b0b 100644<br>
> --- a/src/mesa/drivers/dri/i965/<wbr>brw_fs_generator.cpp<br>
> +++ b/src/mesa/drivers/dri/i965/<wbr>brw_fs_generator.cpp<br>
> @@ -2041,7 +2041,7 @@ fs_generator::generate_code(<wbr>const cfg_t *cfg, int dispatch_width)<br>
> break;<br>
><br>
> case SHADER_OPCODE_FIND_LIVE_<wbr>CHANNEL:<br>
> - brw_find_live_channel(p, dst);<br>
> + brw_find_live_channel(p, dst, src[0]);<br>
> break;<br>
><br>
> case SHADER_OPCODE_BROADCAST:<br>
> diff --git a/src/mesa/drivers/dri/i965/<wbr>brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/<wbr>brw_vec4_generator.cpp<br>
> index 256abae..63fca6f 100644<br>
> --- a/src/mesa/drivers/dri/i965/<wbr>brw_vec4_generator.cpp<br>
> +++ b/src/mesa/drivers/dri/i965/<wbr>brw_vec4_generator.cpp<br>
> @@ -1863,7 +1863,7 @@ generate_code(struct brw_codegen *p,<br>
> break;<br>
><br>
> case SHADER_OPCODE_FIND_LIVE_<wbr>CHANNEL:<br>
> - brw_find_live_channel(p, dst);<br>
> + brw_find_live_channel(p, dst, brw_null_reg());<br>
> break;<br>
><br>
> case SHADER_OPCODE_BROADCAST:<br>
> --<br>
> 2.5.0.400.gff86faf<br>
><br>
</div></div>> ______________________________<wbr>_________________<br>
> mesa-dev mailing list<br>
> <a href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a><br>
> <a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" rel="noreferrer" target="_blank">https://lists.freedesktop.org/<wbr>mailman/listinfo/mesa-dev</a><br>
</blockquote></div><br></div></div>