[Mesa-dev] [PATCH] i965: skip reading clip distances from the URB for the FS if possible
Iago Toral
itoral at igalia.com
Wed Sep 27 06:07:45 UTC 2017
Sure, I can do that.
Iago
On Tue, 2017-09-26 at 06:32 -0400, Ilia Mirkin wrote:
> Perhaps a debug message would be warranted in such a situation? I
> suspect it would be difficult to debug, esp if it came up in a
> regular application.
> On Sep 26, 2017 3:50 AM, "Iago Toral Quiroga" <itoral at igalia.com>
> wrote:
> we can skip these slots when they are not read in the fragment shader
>
> and they are positioned right after a VUE header that we are already
>
> skipping. We also need to ensure that we are passing at least one
> other
>
> varying, since that is a hardware requirement.
>
>
>
> This helps alleviate a problem introduced with 99df02ca26f61 for
>
> separate shader objects: without separate shader objects we assign
>
> locations sequentially, however, since that commit we have changed
> the
>
> method for SSO so that the VUE slot assigned depends on the number of
>
> builtin slots plus the location assigned to the varying. This fixed
>
> layout is intended to help SSO programs by avoiding on-the-fly
> recompiles
>
> when swapping out shaders, however, it also means that if a varying
> uses
>
> a large location number close to the maximum allowed by the SF/FS
> units
>
> (31), then the offset introduced by the number of builtin slots can
> push
>
> the location outside the range and trigger an assertion.
>
>
>
> This problem is affecting at least the following CTS tests for
>
> enhanced layouts:
>
>
>
> KHR-GL45.enhanced_layouts.varying_array_components
>
> KHR-GL45.enhanced_layouts.varying_array_locations
>
> KHR-GL45.enhanced_layouts.varying_components
>
> KHR-GL45.enhanced_layouts.varying_locations
>
>
>
> which use SSO and the the location layout qualifier to select such
>
> location numbers explicitly.
>
>
>
> This change helps these tests because for SSO we always have to
> include
>
> VARYING_SLOT_CLIP_DIST{0,1} even if the fragment shader is very
> unlikely
>
> to read them, so by doing this we free builtin slots from the fixed
> VUE
>
> layout and we avoid the tests to crash in this scenario.
>
>
>
> Of course, this is not a proper fix, we'd still run into problems if
> someone
>
> tries to use an explicit max location and read gl_ViewportIndex,
> gl_LayerID or
>
> gl_CullDistancein in the FS, but that would be a much less common bug
> and we
>
> can probably wait to see if anyone actually runs into that situation
> in a real
>
> world scenario before making the decision that more aggresive changes
> are
>
> required to support this without reverting 99df02ca26f61.
>
>
>
> Suggested-by: Kenneth Graunke <kenneth at whitecape.org>
>
> ---
>
> src/mesa/drivers/dri/i965/genX_state_upload.c | 25
> +++++++++++++++++++++++++
>
> 1 file changed, 25 insertions(+)
>
>
>
> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c
> b/src/mesa/drivers/dri/i965/genX_state_upload.c
>
> index 612761601a..ce3515067f 100644
>
> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c
>
> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
>
> @@ -1040,6 +1040,31 @@ genX(calculate_attr_overrides)(const struct
> brw_context *brw,
>
>
>
> *urb_entry_read_offset = fs_needs_vue_header ? 0 : 1;
>
>
>
> + /* If we are already skipping the first 2 slots for the VUE
> header then we
>
> + * can also skip clip distances if they are located right after
> the header
>
> + * and the FS doesn't read them. Only do this is there are any
> user varyings
>
> + * though, since the hardware requires that we pass at least one
> varying
>
> + * between stages.
>
> + */
>
> + uint64_t var_bits = ~(VARYING_BIT_VAR(0) - 1);
>
> + uint64_t clip_dist_slots = VARYING_BIT_CLIP_DIST0 |
> VARYING_BIT_CLIP_DIST1;
>
> + if (!fs_needs_vue_header &&
>
> + var_bits &&
>
> + (brw->fragment_program->info.inputs_read & clip_dist_slots)
> == 0 &&
>
> + (brw->vue_map_geom_out.slots_valid & clip_dist_slots)) {
>
> +
>
> + uint32_t clip_dist0_slot =
>
> + brw-
> >vue_map_geom_out.varying_to_slot[VARYING_SLOT_CLIP_DIST0];
>
> +
>
> + uint32_t clip_dist1_slot =
>
> + brw-
> >vue_map_geom_out.varying_to_slot[VARYING_SLOT_CLIP_DIST1];
>
> +
>
> + if (clip_dist0_slot >= 2 && clip_dist0_slot <= 3 &&
>
> + clip_dist1_slot >= 2 && clip_dist1_slot <= 3) {
>
> + *urb_entry_read_offset = 2;
>
> + }
>
> + }
>
> +
>
> /* From the Ivybridge PRM, Vol 2 Part 1, 3DSTATE_SBE,
>
> * description of dw10 Point Sprite Texture Coordinate Enable:
>
> *
>
> --
>
> 2.11.0
>
>
>
> _______________________________________________
>
> mesa-dev mailing list
>
> mesa-dev at lists.freedesktop.org
>
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170927/35e80d30/attachment-0001.html>
More information about the mesa-dev
mailing list