[Mesa-dev] [PATCH v2] anv/skylake: disable ForceThreadDispatchEnable
Jason Ekstrand
jason at jlekstrand.net
Tue Oct 16 18:24:10 UTC 2018
I've updated the comments a bit and pushed to master. Thanks for all your
debugging!
On Wed, Sep 19, 2018 at 11:21 AM Sergii Romantsov <
sergii.romantsov at gmail.com> wrote:
> On Skylake enabling of ForceThreadDispatchEnable causes gpu-hang.
>
> -v2: enabling of ForceThreadDispatchEnable is only for gen8, for
> gen9 and higher reverted enabling of PixelShaderHasUAV.
>
> CC: Jason Ekstrand <jason.ekstrand at intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107941
> Fixes: 79270d2140ec (anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAV)
> Signed-off-by: Sergii Romantsov <sergii.romantsov at globallogic.com>
> ---
> src/intel/vulkan/genX_pipeline.c | 33 ++++++++++++++++++++++++++++++++-
> 1 file changed, 32 insertions(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
> index 9595a71..b469270 100644
> --- a/src/intel/vulkan/genX_pipeline.c
> +++ b/src/intel/vulkan/genX_pipeline.c
> @@ -1445,7 +1445,7 @@ emit_3dstate_wm(struct anv_pipeline *pipeline, struct anv_subpass *subpass,
> wm.EarlyDepthStencilControl = EDSC_NORMAL;
> }
>
> -#if GEN_GEN >= 8
> +#if GEN_GEN == 8
> /* Gen8 hardware tries to compute ThreadDispatchEnable for us but
> * doesn't take into account KillPixels when no depth or stencil
> * writes are enabled. In order for occlusion queries to work
> @@ -1663,6 +1663,37 @@ emit_3dstate_ps_extra(struct anv_pipeline *pipeline,
> wm_prog_data->uses_kill;
>
> #if GEN_GEN >= 9
> + /* The stricter cross-primitive coherency guarantees that the hardware
> + * gives us with the "Accesses UAV" bit set for at least one shader stage
> + * and the "UAV coherency required" bit set on the 3DPRIMITIVE command are
> + * redundant within the current image, atomic counter and SSBO GL APIs,
> + * which all have very loose ordering and coherency requirements and
> + * generally rely on the application to insert explicit barriers when a
> + * shader invocation is expected to see the memory writes performed by the
> + * invocations of some previous primitive. Regardless of the value of
> + * "UAV coherency required", the "Accesses UAV" bits will implicitly cause
> + * an in most cases useless DC flush when the lowermost stage with the bit
> + * set finishes execution.
> + *
> + * It would be nice to disable it, but in some cases we can't because on
> + * Gen8+ it also has an influence on rasterization via the PS UAV-only
> + * signal (which could be set independently from the coherency mechanism
> + * in the 3DSTATE_WM command on Gen7), and because in some cases it will
> + * determine whether the hardware skips execution of the fragment shader
> + * or not via the ThreadDispatchEnable signal. However if we know that
> + * GEN8_PS_BLEND_HAS_WRITEABLE_RT is going to be set and
> + * GEN8_PSX_PIXEL_SHADER_NO_RT_WRITE is not set it shouldn't make any
> + * difference so we may just disable it here.
> + *
> + * Gen8 hardware tries to compute ThreadDispatchEnable for us but doesn't
> + * take into account KillPixels when no depth or stencil writes are
> + * enabled. In order for occlusion queries to work correctly with no
> + * attachments, we need to force-enable here.
> + */
> + if ((wm_prog_data->has_side_effects || wm_prog_data->uses_kill) &&
> + !has_color_buffer_write_enabled(pipeline, blend))
> + ps.PixelShaderHasUAV = true;
> +
> ps.PixelShaderComputesStencil = wm_prog_data->computed_stencil;
> ps.PixelShaderPullsBary = wm_prog_data->pulls_bary;
>
> --
> 2.7.4
>
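For anyone skimming the thread, the per-generation decision in the patch can be sketched as a small standalone C fragment. This is only an illustrative model, not anv code: struct fs_info, struct ps_state and choose_ps_state() are hypothetical names, and color_write_enabled stands in for has_color_buffer_write_enabled(pipeline, blend). The Gen8 condition is an assumption, since the quoted hunk ends before showing it; the Gen9+ condition is taken from the added lines in emit_3dstate_ps_extra().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the fragment shader program data fields
 * that the patch consults. */
struct fs_info {
   bool has_side_effects;
   bool uses_kill;
};

/* Hypothetical model of the two hardware bits discussed in the patch. */
struct ps_state {
   bool force_thread_dispatch; /* models 3DSTATE_WM::ForceThreadDispatchEnable */
   bool pixel_shader_has_uav;  /* models 3DSTATE_PS_EXTRA::PixelShaderHasUAV */
};

static struct ps_state
choose_ps_state(int gen, struct fs_info fs, bool color_write_enabled)
{
   struct ps_state s = { false, false };

   /* The shader must run even though it writes no color: it either has
    * side effects or may kill pixels (which matters for occlusion queries). */
   const bool needs_dispatch =
      (fs.has_side_effects || fs.uses_kill) && !color_write_enabled;

   if (gen == 8) {
      /* Assumption: Gen8 keeps forcing thread dispatch under the same
       * condition; the quoted hunk does not show this part. */
      s.force_thread_dispatch = needs_dispatch;
   } else if (gen >= 9) {
      /* From the patch: Gen9+ sets PixelShaderHasUAV instead, because
       * forcing thread dispatch hangs the GPU on Skylake. */
      s.pixel_shader_has_uav = needs_dispatch;
   }

   return s;
}
```

With this model, a shader that uses discard and has no color writes enabled gets force_thread_dispatch on Gen8 and pixel_shader_has_uav on Gen9+, and neither bit once a color write is enabled.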