[Mesa-dev] [PATCH 5/5] HACK: i965/ir: Test thread dispatch packing assumptions.
Jason Ekstrand
jason at jlekstrand.net
Sat Sep 17 01:21:07 UTC 2016
On Fri, Sep 16, 2016 at 5:59 PM, Francisco Jerez <currojerez at riseup.net>
wrote:
> Jason Ekstrand <jason at jlekstrand.net> writes:
>
> > On Sep 16, 2016 3:04 PM, "Francisco Jerez" <currojerez at riseup.net>
> wrote:
> >>
> >> Not intended for upstream. Should cause a GPU hang if some thread is
> >> executed with a non-contiguous dispatch mask breaking assumptions of
> >> brw_stage_has_packed_dispatch(). Doesn't cause any CTS, DEQP or
> >> Piglit regressions, while replacing brw_stage_has_packed_dispatch()
> >> with a dummy implementation that unconditionally returns true on top
> >> of this patch causes multiple GPU hangs.
> >> ---
> >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 17 +++++++++++++++++
> >> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 21 +++++++++++++++++++++
> >> 2 files changed, 38 insertions(+)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >> index 042203d..b3eec49 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> >> @@ -33,6 +33,23 @@ using namespace brw::surface_access;
> >> void
> >> fs_visitor::emit_nir_code()
> >> {
> >> + if (brw_stage_has_packed_dispatch(stage, prog_data)) {
> >
> > Mind adding "0 &&" and merging this patch so we remain aware of the
> issue,
> > keep it building, and can easily test future hardware.
> >
> I guess that would work -- An alternative that would keep the
> NIR-to-i965 pass tidier would be to assert(devinfo->gen <= 9) in
> brw_stage_has_packed_dispatch() to make sure we don't forget to do a
> full Piglit/DEQP/CTS run with this patch applied when a new generation
> is powered on. I can also add a comment with a link to this patch.
>
Can we do both? Maybe something like this instead of an assert:
if (devinfo->gen > 9) {
static bool warned = false;
if (!warned) {
fprintf(stderr, "WARNING: VMask/DMask power-of-two assumptions need
to be verified. Once verified using emit_fs_mask_pow2_check(), this
warning may be disabled for gen%d", devinfo->gen);
warned = true;
}
}
where emit_fs_mask_pow2_check() is a helper that we put the check code
into. That way it's not an assert-failure that will get instantly removed
but something that will constantly bug the bring-up person until they have
verified the assumption.
> >> + const fs_builder ubld = bld.exec_all().group(1, 0);
> >> + const fs_reg tmp = component(bld.vgrf(BRW_REGISTER_TYPE_UD), 0);
> >> + const fs_reg mask = (stage == MESA_SHADER_FRAGMENT ?
> > brw_vmask_reg() :
> >> + brw_dmask_reg());
> >> +
> >> + ubld.ADD(tmp, mask, brw_imm_ud(1));
> >> + ubld.AND(tmp, mask, tmp);
> >> +
> >> + /* This will loop forever if the dispatch mask doesn't have the
> > expected
> >> + * form '2^n-1', in which case tmp will be non-zero.
> >> + */
> >> + bld.emit(BRW_OPCODE_DO);
> >> + bld.CMP(bld.null_reg_ud(), tmp, brw_imm_ud(0),
> BRW_CONDITIONAL_NZ);
> >> + set_predicate(BRW_PREDICATE_NORMAL, bld.emit(BRW_OPCODE_WHILE));
> >> + }
> >> +
> >> /* emit the arrays used for inputs and outputs - load/store
> > intrinsics will
> >> * be converted to reads/writes of these arrays
> >> */
> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> >> index ba3bbdf..9f7a1f0 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> >> @@ -35,6 +35,27 @@ namespace brw {
> >> void
> >> vec4_visitor::emit_nir_code()
> >> {
> >> + if (brw_stage_has_packed_dispatch(stage, &prog_data->base)) {
> >> + const dst_reg tmp = writemask(dst_reg(this,
> glsl_type::uint_type),
> >> + WRITEMASK_X);
> >> + const src_reg mask =
> >> +
> > brw_swizzle(retype(stride(brw_vec4_reg(BRW_ARCHITECTURE_REGISTER_FILE,
> > BRW_ARF_STATE, 0),
> >> + 0, 4, 1),
> >> + BRW_REGISTER_TYPE_UD),
> >> + BRW_SWIZZLE_ZZZZ);
> >
> > Can we just do vec4_reg(brw_vmask_reg)?
> >
> That didn't seem to work, because both brw_dmask_reg() and
> brw_vmask_reg() return a register region that is equivalent to
> sr0.0.xxxx in Align16 addressing mode (maybe a bug of PATCH 1? It's
> unlikely to matter a lot in practice though), so the Z component where
> the DMask is stored in is not even part of the returned Align16 region.
>
Right. As I said in another patch, I'm not terribly concerned about vec4.
We've shown it works and we won't be bringing up more vec4. I'm fine with
just merging the fs version.
> >> +
> >> + emit(ADD(tmp, mask, brw_imm_ud(1)));
> >> + emit(AND(tmp, mask, src_reg(tmp)));
> >> +
> >> + /* This will loop forever if the dispatch mask doesn't have the
> > expected
> >> + * form '2^n-1', in which case tmp will be non-zero.
> >> + */
> >> + emit(BRW_OPCODE_DO);
> >> + emit(CMP(dst_null_ud(), src_reg(tmp), brw_imm_ud(0),
> >> + BRW_CONDITIONAL_NZ));
> >> + emit(BRW_OPCODE_WHILE)->predicate = BRW_PREDICATE_NORMAL;
> >> + }
> >> +
> >> if (nir->num_uniforms > 0)
> >> nir_setup_uniforms();
> >>
> >> --
> >> 2.9.0
> >>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20160916/2452860c/attachment-0001.html>
More information about the mesa-dev
mailing list