[Mesa-dev] [PATCH 00/15] i965/gen6+: Support 128 varying components.

Paul Berry stereotype441 at gmail.com
Mon Sep 9 10:17:12 PDT 2013


On 9 September 2013 09:51, Ian Romanick <idr at freedesktop.org> wrote:

> On 09/03/2013 06:18 PM, Paul Berry wrote:
> > GL 3.2 requires us to support 128 varying components for geometry
> > shader outputs and fragment shader inputs, and 64 varying components
> > otherwise.  But there's no hardware limitation that restricts us to 64
> > varying components, and core Mesa doesn't currently allow different
> > stages to have different maximum values, so I've gone ahead and
> > enabled 128 varying components for all stages.  This has the advantage
>
> I was just looking at this today while working on the standalone
> compiler.  To use the standalone compiler for shader validation, we want
> to advertise the minimums required by the spec.  To do that, we need to
> be able to track the input/output limits separately.  Since the varying
> limit changed from 64 to 60, but the vertex shader output limit is still
> 64 (where gl_Position is counted?), this may be annoying to implement
> fully.
>
> For the standalone compiler work, I'll add some of this plumbing.  That
> may impact some of your changes, depending on the order things land.
> Since my patches depend on Ken's built-in rework, yours will almost
> surely go first.
>
> > of increased test coverage, since piglit already has a number of tests
> > to validate that the maximum advertised number of varying components
> > can be exchanged between VS and FS.  I've also gone ahead and
> > increased the limit for gen6 as well as gen7, since it required very
> > little extra work.
> >
> > Previously, on gen6+, we relied on the SF/SBE stage of the pipeline to
> > reorder the outputs from the GS (or VS) to match the input ordering
> > required by the FS.  This allowed us to determine the order of FS
> > inputs solely based on the FS, so we avoided recompiles when separate
> > shader objects were in use.  But there's a problem with that: the
> > SF/SBE stage can't arbitrarily reorder more than 16 VUE slots (1 slot
> > = 4 varying components).  To avoid introducing additional recompiles
> > with previously-supported shaders, I've taken a hybrid approach to
> > choosing the FS input ordering: if the FS uses 16 or fewer input
> > varying slots, then it orders them solely based on its own
> > requirements.  If it uses more than 16 input varying slots, then it
> > orders them according to the GS (or VS) output VUE map, so that the
> > SF/SBE stage doesn't have to do any reordering.
> >
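
To make the hybrid rule above concrete, here is a minimal sketch in C. It only illustrates the decision described in the cover letter (16 or fewer FS input slots: order by FS requirements alone; more than 16: follow the previous stage's VUE map). All identifiers (fs_input_order, choose_fs_input_order, components_to_slots, the macros) are hypothetical and are not the names used in the i965 driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; not the driver's actual identifiers. */
#define COMPONENTS_PER_SLOT     4    /* 1 VUE slot = 4 varying components */
#define MAX_VARYING_COMPONENTS  128  /* limit this series advertises */
#define MAX_SBE_REORDER_SLOTS   16   /* SF/SBE can arbitrarily reorder at most 16 slots */

enum fs_input_order {
   ORDER_BY_FS_ONLY,    /* FS picks its own order; SF/SBE does the reordering */
   ORDER_BY_PREV_STAGE  /* FS follows the GS (or VS) output VUE map */
};

/* Convert a component count to the number of VUE slots it occupies. */
static unsigned
components_to_slots(unsigned components)
{
   assert(components <= MAX_VARYING_COMPONENTS);
   return (components + COMPONENTS_PER_SLOT - 1) / COMPONENTS_PER_SLOT;
}

/* Count the set bits in a bitfield of used input slots. */
static unsigned
count_slots(uint64_t slots_used)
{
   unsigned n = 0;
   while (slots_used) {
      slots_used &= slots_used - 1; /* clear the lowest set bit */
      n++;
   }
   return n;
}

/* Pick the ordering strategy from the number of FS input slots in use. */
static enum fs_input_order
choose_fs_input_order(uint64_t fs_input_slots)
{
   return count_slots(fs_input_slots) <= MAX_SBE_REORDER_SLOTS
      ? ORDER_BY_FS_ONLY
      : ORDER_BY_PREV_STAGE;
}
```

With these definitions, 64 components map to 16 slots (still the FS-only path) and 128 components map to 32 slots (the path that follows the previous stage).
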
> > Patches 1-3 modify the FS so that it exposes the order of input
> > varyings it needs via prog_data.
> >
> > Patches 4-6 modify the SF/SBE setup so that it consults the FS
> > prog_data when choosing how to re-order varyings (previously, it
> > implicitly assumed an order that happened to match the order the FS
> > was using).
> >
> > Patch 7 is a minor optimization made possible by patches 1-6: now that
> > the SF/SBE setup no longer makes implicit assumptions about the order
> > of the FS inputs, the FS no longer has to have dummy input slots for
> > gl_FragCoord and gl_FrontFacing.
>
> \o/
>
> > Patch 8 tweaks the VUE map slightly so that it is uniquely determined
> > by a single 64-bit bitfield.  This will allow us to store the bitfield
> > in the FS program key rather than the entire VUE map.
> >
> > Patch 9 is a minor optimization made possible by patch 8: now that the
> > VUE map is uniquely determined by a single 64-bit bitfield, we no
> > longer have to store the entire VUE map in the GS program key.
> >
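
A rough sketch of why a single bitfield is enough for the program key: if the slot layout is a pure function of which varyings are present, the key only needs the bitfield and the map can be recomputed on demand. The code below is a simplified illustration under that assumption. The struct and function names (toy_vue_map, toy_compute_vue_map) are hypothetical, and the real VUE map has fixed header slots and other details that this sketch ignores.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NUM_VARYINGS 64 /* one bit per varying in the 64-bit bitfield */

/* Hypothetical, simplified stand-in for a VUE map. */
struct toy_vue_map {
   uint64_t slots_valid;                   /* which varyings are present */
   int varying_to_slot[TOY_NUM_VARYINGS];  /* -1 if the varying is absent */
   int num_slots;                          /* number of slots assigned */
};

/* Assign slots in ascending bit order, so the resulting layout depends
 * on nothing except the bitfield itself. */
static void
toy_compute_vue_map(struct toy_vue_map *map, uint64_t slots_valid)
{
   assert(map);
   map->slots_valid = slots_valid;
   map->num_slots = 0;
   for (int i = 0; i < TOY_NUM_VARYINGS; i++) {
      if (slots_valid & (UINT64_C(1) << i))
         map->varying_to_slot[i] = map->num_slots++;
      else
         map->varying_to_slot[i] = -1;
   }
}
```

Two keys that carry the same bitfield always produce the same map here, which is the property that lets the key store 64 bits instead of the whole structure.
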
> > Patches 10-11 modify the FS to order its inputs according to the GS
> > (or VS) output VUE map when there are more than 16 input slots in use.
> >
> > Patch 12 adjusts the VS and GS code so that it can output all 32
> > varyings to the VUE, even if it requires more than two URB writes to
> > do so.
> >
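
The idea of emitting URB writes in a loop can be sketched as follows. This is only an illustration of splitting a VUE into as many writes as needed; the per-write slot limit is passed in as a parameter because the actual hardware limit is not stated in this thread, and the names (toy_urb_write, toy_plan_urb_writes) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of one URB write: where it starts, how many
 * slots it covers, and whether it is the final write for the VUE. */
struct toy_urb_write {
   unsigned offset;
   unsigned length;
   bool last;
};

/* Split total_slots into consecutive writes of at most
 * max_slots_per_write slots each.  Returns the number of writes planned.
 * At least one write is always produced, even for an empty VUE. */
static unsigned
toy_plan_urb_writes(unsigned total_slots, unsigned max_slots_per_write,
                    struct toy_urb_write *out, unsigned out_capacity)
{
   unsigned count = 0;
   unsigned offset = 0;

   assert(max_slots_per_write > 0);
   do {
      unsigned remaining = total_slots - offset;
      unsigned length = remaining < max_slots_per_write
         ? remaining : max_slots_per_write;

      assert(count < out_capacity);
      out[count].offset = offset;
      out[count].length = length;
      offset += length;
      out[count].last = (offset == total_slots);
      count++;
   } while (offset < total_slots);

   return count;
}
```

For example, with a hypothetical limit of 14 slots per write, a 34-slot VUE is planned as three writes of 14, 14 and 6 slots, with only the third marked as last.
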
> > Patches 13-14 make some minor gen6-specific adjustments to allow for
> > the larger URB entries needed for 32 varyings: the Gen6 transform
> > feedback code sometimes needs to do 2 URB writes instead of 1, and an
> > incorrect assertion in the gen6 URB setup needs to be fixed.
> >
> > Patch 15 increases the value of MaxVarying from 16 to 32 for gen6+.
> >
> > The series is available on branch "increase-max-varyings" of
> > https://github.com/stereotype441/mesa.git.  I've piglit tested it on
> > gen5, gen6, and gen7.
>
> Do we have tests that use more than 16 varying vectors?  Some of the
> generated varying packing tests, right?
>

Yes, we have a number of varying packing tests that exercise this (though
they aren't generated tests, IIRC).  Also,
spec/EXT_transform_feedback/max-varyings and shaders/glsl-max-varyings.


>
> > [PATCH 01/15] i965/fs: Expose "urb_setup" as part of brw_wm_prog_data.
> > [PATCH 02/15] i965/fs: Change brw_wm_prog_data::urb_read_length to num_varying_inputs.
> > [PATCH 03/15] i965/fs: Consult brw_wm_prog_data::num_varying_inputs when setting up WM state.
> > [PATCH 04/15] i965/sf: Use BRW_SF_URB_ENTRY_READ_OFFSET rather than hardcoded values.
> > [PATCH 05/15] i965/sf: Consolidate common code for setting up gen6-7 attribute overrides.
> > [PATCH 06/15] i965/sf: Consult brw_wm_prog_data when setting up SF/SBE state.
> > [PATCH 07/15] i965/fs: Stop wasting input attribute space on gl_FragCoord and gl_Frontfacing.
> > [PATCH 08/15] i965/gen6+: Remove VUE map dependency on userclip_active.
> > [PATCH 09/15] i965/gs: Stop storing an input VUE map in the GS program key.
> > [PATCH 10/15] i965/fs: Simplify computation of key.input_slots_valid during precompile.
> > [PATCH 11/15] i965/fs: When >64 input components, order them to match prev pipeline stage.
> > [PATCH 12/15] i965/vec4: Generate URB writes using a loop.
> > [PATCH 13/15] i965/gen6: Fix assertions on VS/GS URB size.
> > [PATCH 14/15] i965/ff_gs: Generate URB writes using a loop.
> > [PATCH 15/15] i965/gen6+: Support 128 varying components.
> > _______________________________________________
> > mesa-dev mailing list
> > mesa-dev at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
>
>