[Mesa-dev] [PATCH v3 3/3] i965/nir/fs: Implement new barrier functions for compute shaders
Jordan Justen
jordan.l.justen at intel.com
Fri Nov 6 13:44:14 PST 2015
On 2015-11-06 04:48:00, Francisco Jerez wrote:
> Jordan Justen <jordan.l.justen at intel.com> writes:
>
> > For these nir intrinsics, we emit the same code as
> > nir_intrinsic_memory_barrier:
> >
> > * nir_intrinsic_memory_barrier_atomic_counter
> > * nir_intrinsic_memory_barrier_buffer
> > * nir_intrinsic_memory_barrier_image
> >
> > We treat these nir intrinsics as no-ops:
> > * nir_intrinsic_group_memory_barrier
> > * nir_intrinsic_memory_barrier_shared
> >
> > v3:
> > * Add comment for no-op cases (curro)
> >
> > Signed-off-by: Jordan Justen <jordan.l.justen at intel.com>
> > Cc: Francisco Jerez <currojerez at riseup.net>
> > ---
> > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 11 +++++++++++
> > 1 file changed, 11 insertions(+)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > index b6f4c52..3b3bc67 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > @@ -1697,6 +1697,9 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
> > break;
> > }
> >
> > + case nir_intrinsic_memory_barrier_atomic_counter:
> > + case nir_intrinsic_memory_barrier_buffer:
> > + case nir_intrinsic_memory_barrier_image:
> > case nir_intrinsic_memory_barrier: {
> > const fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_UD, 16 / dispatch_width);
> > bld.emit(SHADER_OPCODE_MEMORY_FENCE, tmp)
> > @@ -1704,6 +1707,14 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
> > break;
> > }
> >
> > + case nir_intrinsic_group_memory_barrier:
> > + case nir_intrinsic_memory_barrier_shared:
> > + /* We treat these single workgroup level barriers as no-ops. This is
> > + * safe today because we don't allow memory functions to be re-ordered
> > + * and we only execute programs on a single sub-slice.
> > + */
>
> You forgot to mention the fault-and-stream thing, which is going to be a
> problem already on Gen9.5. How about the following:
>
> | /* We treat these workgroup-level barriers as no-ops. This should be
> | * safe at present and as long as:
> | *
> | * - Memory access instructions are not subsequently reordered by the
> | * compiler back-end.
> | *
> | * - All threads from a given compute shader workgroup fit within a
> | * single subslice and therefore talk to the same HDC shared unit
> | * what supposedly guarantees ordering and coherency between threads
> | * from the same workgroup. This may change in the future when we
> | * start splitting workgroups across multiple subslices.
> | *
> | * - The context is not in fault-and-stream mode, which could cause
> | * memory transactions (including to SLM) prior to the barrier to be
> | * replayed after the barrier if a pagefault occurs. This shouldn't
> | * be a problem up to and including SKL because fault-and-stream is
> | * not usable due to hardware issues, but that's likely to change in
> | * the future.
> | */
>
> If you use that comment instead this patch is:
>
> Reviewed-by: Francisco Jerez <currojerez at riseup.net>
Ok, but I think I'll move the comment to a separate patch authored by
you.
Thanks,
-Jordan
>
>
> > + break;
> > +
> > case nir_intrinsic_shader_clock: {
> > /* We cannot do anything if there is an event, so ignore it for now */
> > fs_reg shader_clock = get_timestamp(bld);
> > --
> > 2.6.2
More information about the mesa-dev
mailing list