[Mesa-dev] [PATCH] anv: set UAV coherence required bit when needed
Jason Ekstrand
jason at jlekstrand.net
Mon Jan 16 18:23:37 UTC 2017
Thanks for fixing this!
Reviewed-by: Jason Ekstrand <jason at jlekstrand.net>
On Mon, Jan 16, 2017 at 4:56 AM, Iago Toral Quiroga <itoral at igalia.com>
wrote:
> The same we do in the OpenGL driver (comment copied from there).
>
> This is required to ensure that we execute the fragment shader stage when
> side-effects (such as image or ssbo stores) are present but there are no
> color writes.
>
> I found this while writing a test to check rendering to a framebuffer
> without attachments where the fragment shader does not produce any
> color outputs but writes to an image via imageStore(). Without this patch
> the fragment shader does not execute and the image is not written,
> which is not correct.
> ---
> src/intel/vulkan/genX_pipeline.c | 51 ++++++++++++++++++++++++++++++
> ++++++++++
> 1 file changed, 51 insertions(+)
>
> diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_
> pipeline.c
> index 90968b4..6a927d5 100644
> --- a/src/intel/vulkan/genX_pipeline.c
> +++ b/src/intel/vulkan/genX_pipeline.c
> @@ -1248,6 +1248,26 @@ emit_3dstate_ps(struct anv_pipeline *pipeline,
> }
> }
>
> +static inline bool
> +has_color_buffer_write_enabled(const struct anv_pipeline *pipeline)
> +{
> + const struct anv_shader_bin *shader_bin =
> + pipeline->shaders[MESA_SHADER_FRAGMENT];
> + if (!shader_bin)
> + return false;
> +
> + const struct anv_pipeline_bind_map *bind_map = &shader_bin->bind_map;
> + for (int i = 0; i < bind_map->surface_count; i++) {
> + if (bind_map->surface_to_descriptor[i].set !=
> + ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS)
> + continue;
> + if (bind_map->surface_to_descriptor[i].index != UINT8_MAX)
> + return true;
> + }
> +
> + return false;
> +}
> +
> #if GEN_GEN >= 8
> static void
> emit_3dstate_ps_extra(struct anv_pipeline *pipeline,
> @@ -1278,6 +1298,37 @@ emit_3dstate_ps_extra(struct anv_pipeline *pipeline,
> ps.PixelShaderKillsPixel = subpass->has_ds_self_dep ||
> wm_prog_data->uses_kill;
>
> + /* The stricter cross-primitive coherency guarantees that the
> hardware
> + * gives us with the "Accesses UAV" bit set for at least one shader
> stage
> + * and the "UAV coherency required" bit set on the 3DPRIMITIVE
> command are
> + * redundant within the current image, atomic counter and SSBO GL
> APIs,
> + * which all have very loose ordering and coherency requirements and
> + * generally rely on the application to insert explicit barriers
> when a
> + * shader invocation is expected to see the memory writes performed
> by the
> + * invocations of some previous primitive. Regardless of the value
> of
> + * "UAV coherency required", the "Accesses UAV" bits will
> implicitly cause
> + * an in most cases useless DC flush when the lowermost stage with
> the bit
> + * set finishes execution.
> + *
> + * It would be nice to disable it, but in some cases we can't
> because on
> + * Gen8+ it also has an influence on rasterization via the PS
> UAV-only
> + * signal (which could be set independently from the coherency
> mechanism
> + * in the 3DSTATE_WM command on Gen7), and because in some cases it
> will
> + * determine whether the hardware skips execution of the fragment
> shader
> + * or not via the ThreadDispatchEnable signal. However if we know
> that
> + * GEN8_PS_BLEND_HAS_WRITEABLE_RT is going to be set and
> + * GEN8_PSX_PIXEL_SHADER_NO_RT_WRITE is not set it shouldn't make
> any
> + * difference so we may just disable it here.
> + *
> + * Gen8 hardware tries to compute ThreadDispatchEnable for us but
> doesn't
> + * take into account KillPixels when no depth or stencil writes are
> + * enabled. In order for occlusion queries to work correctly with no
> + * attachments, we need to force-enable here.
> + */
> + if ((wm_prog_data->has_side_effects || wm_prog_data->uses_kill) &&
> + !has_color_buffer_write_enabled(pipeline))
> + ps.PixelShaderHasUAV = true;
> +
> #if GEN_GEN >= 9
> ps.PixelShaderPullsBary = wm_prog_data->pulls_bary;
> ps.InputCoverageMaskState = wm_prog_data->uses_sample_mask ?
> --
> 2.7.4
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170116/c681883a/attachment.html>
More information about the mesa-dev
mailing list