[Mesa-dev] [PATCH] i965: skip reading clip distances from the URB for the FS if possible

Iago Toral itoral at igalia.com
Wed Sep 27 06:07:45 UTC 2017


Sure, I can do that.
Iago
On Tue, 2017-09-26 at 06:32 -0400, Ilia Mirkin wrote:
> Perhaps a debug message would be warranted in such a situation? I
> suspect it would be difficult to debug, esp if it came up in a
> regular application.
> On Sep 26, 2017 3:50 AM, "Iago Toral Quiroga" <itoral at igalia.com>
> wrote:
> we can skip these slots when they are not read in the fragment shader
> 
> and they are positioned right after a VUE header that we are already
> 
> skipping. We also need to ensure that we are passing at least one
> other
> 
> varying, since that is a hardware requirement.
> 
> 
> 
> This helps alleviate a problem introduced with 99df02ca26f61 for
> 
> separate shader objects: without separate shader objects we assign
> 
> locations sequentially, however, since that commit we have changed
> the
> 
> method for SSO so that the VUE slot assigned depends on the number of
> 
> builtin slots plus the location assigned to the varying. This fixed
> 
> layout is intended to help SSO programs by avoiding on-the-fly
> recompiles
> 
> when swapping out shaders, however, it also means that if a varying
> uses
> 
> a large location number close to the maximum allowed by the SF/FS
> units
> 
> (31), then the offset introduced by the number of builtin slots can
> push
> 
> the location outside the range and trigger an assertion.
> 
> 
> 
> This problem is affecting at least the following CTS tests for
> 
> enhanced layouts:
> 
> 
> 
> KHR-GL45.enhanced_layouts.varying_array_components
> 
> KHR-GL45.enhanced_layouts.varying_array_locations
> 
> KHR-GL45.enhanced_layouts.varying_components
> 
> KHR-GL45.enhanced_layouts.varying_locations
> 
> 
> 
> which use SSO and the the location layout qualifier to select such
> 
> location numbers explicitly.
> 
> 
> 
> This change helps these tests because for SSO we always have to
> include
> 
> VARYING_SLOT_CLIP_DIST{0,1} even if the fragment shader is very
> unlikely
> 
> to read them, so by doing this we free builtin slots from the fixed
> VUE
> 
> layout and we avoid the tests to crash in this scenario.
> 
> 
> 
> Of course, this is not a proper fix, we'd still run into problems if
> someone
> 
> tries to use an explicit max location and read gl_ViewportIndex,
> gl_LayerID or
> 
> gl_CullDistancein in the FS, but that would be a much less common bug
> and we
> 
> can probably wait to see if anyone actually runs into that situation
> in a real
> 
> world scenario before making the decision that more aggresive changes
> are
> 
> required to support this without reverting 99df02ca26f61.
> 
> 
> 
> Suggested-by: Kenneth Graunke <kenneth at whitecape.org>
> 
> ---
> 
>  src/mesa/drivers/dri/i965/genX_state_upload.c | 25
> +++++++++++++++++++++++++
> 
>  1 file changed, 25 insertions(+)
> 
> 
> 
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c
> b/src/mesa/drivers/dri/i965/genX_state_upload.c
> 
> index 612761601a..ce3515067f 100644
> 
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
> 
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
> 
> @@ -1040,6 +1040,31 @@ genX(calculate_attr_overrides)(const struct
> brw_context *brw,
> 
> 
> 
>     *urb_entry_read_offset = fs_needs_vue_header ? 0 : 1;
> 
> 
> 
> +   /* If we are already skipping the first 2 slots for the VUE
> header then we
> 
> +    * can also skip clip distances if they are located right after
> the header
> 
> +    * and the FS doesn't read them. Only do this is there are any
> user varyings
> 
> +    * though, since the hardware requires that we pass at least one
> varying
> 
> +    * between stages.
> 
> +    */
> 
> +   uint64_t var_bits = ~(VARYING_BIT_VAR(0) - 1);
> 
> +   uint64_t clip_dist_slots = VARYING_BIT_CLIP_DIST0 |
> VARYING_BIT_CLIP_DIST1;
> 
> +   if (!fs_needs_vue_header &&
> 
> +       var_bits &&
> 
> +       (brw->fragment_program->info.inputs_read & clip_dist_slots)
> == 0 &&
> 
> +       (brw->vue_map_geom_out.slots_valid & clip_dist_slots)) {
> 
> +
> 
> +      uint32_t clip_dist0_slot =
> 
> +         brw-
> >vue_map_geom_out.varying_to_slot[VARYING_SLOT_CLIP_DIST0];
> 
> +
> 
> +      uint32_t clip_dist1_slot =
> 
> +         brw-
> >vue_map_geom_out.varying_to_slot[VARYING_SLOT_CLIP_DIST1];
> 
> +
> 
> +      if (clip_dist0_slot >= 2 && clip_dist0_slot <= 3 &&
> 
> +          clip_dist1_slot >= 2 && clip_dist1_slot <= 3) {
> 
> +         *urb_entry_read_offset = 2;
> 
> +      }
> 
> +   }
> 
> +
> 
>     /* From the Ivybridge PRM, Vol 2 Part 1, 3DSTATE_SBE,
> 
>      * description of dw10 Point Sprite Texture Coordinate Enable:
> 
>      *
> 
> --
> 
> 2.11.0
> 
> 
> 
> _______________________________________________
> 
> mesa-dev mailing list
> 
> mesa-dev at lists.freedesktop.org
> 
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> 
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170927/35e80d30/attachment-0001.html>


More information about the mesa-dev mailing list