[Mesa-dev] [PATCH 00/10] glsl: Implement varying packing.
Ian Romanick
idr at freedesktop.org
Thu Dec 13 03:44:28 PST 2012
Patches 1 through 4 are
Reviewed-by: Ian Romanick <ian.d.romanick at intel.com>
I'll try to make it through the rest tomorrow. I have skimmed them, and
it looks mostly okay. I think a good follow-up patch will be to pull a
bunch of the new stuff out to a new file link_varyings.cpp or something.
linker.cpp is getting a bit out of control. ~2700 lines in one file...
On 12/11/2012 03:09 PM, Paul Berry wrote:
> This patch series adds varying packing to Mesa, so that we can handle
> varyings composed of things other than vec4's without using up extra
> varying components.
>
> For the initial implementation I've chosen a strategy that operates
> exclusively at the GLSL IR level, so that it doesn't require the
> cooperation of the driver back-ends. This means that varying packing
> should be immediately useful for all drivers. However, there are some
> types of varying packing that can't be done using GLSL IR alone (for
> example, packing a "noperspective" varying and a "smooth" varying
> together), but should be possible on some drivers with a small amount
> of back-end work. I'm deferring that work for a later patch series.
> Also, packing of floats and ints together into the same "flat varying"
> should be possible for drivers that implement
> ARB_shader_bit_encoding--I'm also deferring that for a later patch
> series.
>
> The strategy is as follows:
>
> - Before assigning locations to varyings, we sort them into "packing
> classes" based on base type and interpolation mode (this is to
> ensure that we don't try to pack floats with ints, or smooth with
> flat, for example).
>
> - Within each packing class, we sort the varyings based on the number
> of vector elements. Vec4's (as well as matrices and arrays composed
> of vec4's) are packed first, then vec2's, then scalars, since this
> allows us to align them all to their natural alignment boundary, so
> we avoid the performance penalty of "double parking" a varying
> across two varying slots. Vec3's are packed last, double parking
> them if necessary.
>
> - For any varying slot that doesn't contain exactly one vec4, we
> generate GLSL IR to manually pack/unpack the varying in the shader.
> For instance, the following fragment shader:
>
>     varying vec2 a;
>     varying vec2 b;
>     varying vec3 c;
>     varying vec3 d;
>     void main()
>     {
>         ...
>     }
>
> would get rewritten as follows:
>
>     varying vec4 packed0;
>     varying vec4 packed1;
>     varying vec4 packed2;
>     vec2 a;
>     vec2 b;
>     vec3 c;
>     vec3 d;
>     void main()
>     {
>         a = packed0.xy;
>         b = packed0.zw;
>         c = packed1.xyz;
>         d.x = packed1.w; // d is "double parked" across slots 1 and 2
>         d.yz = packed2.xy;
>         ...
>     }
>
> This GLSL IR is generated by a lowering pass, so that in the future
> we will have the option of disabling it for driver back-ends that
> are capable of natively understanding the packed varying format.
>
> - Finally, the linker code to handle transform feedback is modified to
> account for varying packing (e.g. by feeding back just a subset of
> the components of a varying slot rather than the entire varying
> slot). Fortunately transform feedback already has the
> infrastructure necessary to do this, since it was needed in order to
> implement gl_ClipDistance.
>
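For anyone following along, the sort-then-assign steps above can be sketched roughly as below. This is an illustrative Python model, not the Mesa code: the `Varying` record, `packing_class`, `assign_locations`, and the choice to skip to the next slot boundary whenever the packing class changes are all assumptions made for the example. Locations are counted in components, so component 7 means slot 1, `.w`.

```python
from dataclasses import dataclass

# Packing order from the strategy above: vec4 first, then vec2,
# then scalars, with vec3 last (it may be "double parked").
PACKING_ORDER = {4: 0, 2: 1, 1: 2, 3: 3}


@dataclass(frozen=True)
class Varying:
    """Hypothetical stand-in for a linked varying."""
    name: str
    base_type: str        # e.g. "float", "int", "uint"
    interpolation: str    # e.g. "smooth", "flat", "noperspective"
    vector_elements: int  # 1 through 4
    count: int = 1        # number of vectors (matrix columns * array length)

    @property
    def components(self):
        return self.vector_elements * self.count


def packing_class(v):
    # Varyings may only share a slot if base type and interpolation match.
    return (v.base_type, v.interpolation)


def assign_locations(varyings):
    """Return {name: first component index} after sorting and packing."""
    ordered = sorted(
        varyings,
        key=lambda v: (packing_class(v), PACKING_ORDER[v.vector_elements]),
    )
    locations = {}
    next_component = 0
    previous_class = None
    for v in ordered:
        cls = packing_class(v)
        if previous_class is not None and cls != previous_class:
            # Assumption: a new packing class starts on a fresh slot.
            next_component = (next_component + 3) // 4 * 4
        locations[v.name] = next_component
        next_component += v.components
        previous_class = cls
    return locations


if __name__ == "__main__":
    example = [
        Varying("a", "float", "smooth", 2),
        Varying("b", "float", "smooth", 2),
        Varying("c", "float", "smooth", 3),
        Varying("d", "float", "smooth", 3),
    ]
    print(assign_locations(example))  # {'a': 0, 'b': 2, 'c': 4, 'd': 7}
```

With the four varyings from the shader example, `d` starts at component 7, which is the `packed1.w` / `packed2.xy` split shown in the rewritten shader.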
>
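The "subset of the components of a varying slot" idea for transform feedback can be illustrated the same way. The sketch below assumes a varying's location is stored as a starting component index plus a component count (my reading of the "fractional varying locations" in patch 04, not a description of the actual field layout), and the helper name `slot_swizzles` is made up for this example.

```python
def slot_swizzles(start_component, num_components):
    """Split a run of components into (slot, swizzle) pieces.

    A piece never crosses a slot boundary, so a "double parked"
    varying yields two pieces.
    """
    pieces = []
    comp = start_component
    end = start_component + num_components
    while comp < end:
        slot, offset = divmod(comp, 4)
        # Take as many components as fit in the rest of this slot.
        n = min(4 - offset, end - comp)
        pieces.append((slot, "xyzw"[offset:offset + n]))
        comp += n
    return pieces


if __name__ == "__main__":
    # vec3 d from the example above: starts at component 7.
    print(slot_swizzles(7, 3))  # [(1, 'w'), (2, 'xy')]
```

Feeding back `d` would then mean capturing `.w` of slot 1 followed by `.xy` of slot 2, rather than either slot in full.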
> I believe this is enough to be useful for the vast majority of
> programs, and to get us passing the GLES3 conformance tests.
>
>
> Additional improvements, which I'm planning to defer to later patch
> series, include:
>
> - Allow uints and ints to be packed together in the same varying slot.
> This should be possible on all back-ends, since ints and uints may
> be interconverted without losing information.
>
> - On back-ends that support ARB_shader_bit_encoding, allow floats and
> ints to be packed together in the same varying slot, since
> ARB_shader_bit_encoding allows floating-point values to be encoded
> into ints without losing information.
>
> - On back-ends that can mix interpolation modes within a single
> varying slot, allow additional packing, with help from the driver
> back-end. For instance, i965 gen6 and above can in principle mix
> together all interpolation modes except for "flat" within a single
> varying slot, if we do a hopefully small amount of back-end work.
>
> - Allow a driver back-end to advertise a larger number of varying
> components to the linker than it advertises to the client
> program--this will allow us to ensure that varying packing *never*
> fails. For example, on i965 gen6 and above, after the above
> improvements are made, we should be able to pack any possible
> combination of varyings with a maximum waste of 3 varying
> components. That means, for example, that if the i965 driver
> advertises 17 varying slots to the linker (== 68 varying
> components), but advertises only 64 varying components to the
> client program, then varying packing will always succeed.
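The "without losing information" claims for the int/uint and float/int cases are easy to check outside a shader. The sketch below is plain Python using the standard `struct` module; the function names are just Python stand-ins loosely modelled on the GLSL `floatBitsToInt` / `intBitsToFloat` built-ins, and the values are assumed to be 32-bit.

```python
import struct


def float_bits_to_int(x):
    """Reinterpret the bits of a 32-bit float as a signed 32-bit int."""
    return struct.unpack("<i", struct.pack("<f", x))[0]


def int_bits_to_float(i):
    """Reinterpret the bits of a signed 32-bit int as a 32-bit float."""
    return struct.unpack("<f", struct.pack("<i", i))[0]


def int_to_uint(i):
    """Same 32 bits, viewed as unsigned."""
    return i & 0xFFFFFFFF


def uint_to_int(u):
    """Same 32 bits, viewed as signed (two's complement)."""
    return u - 0x100000000 if u & 0x80000000 else u


if __name__ == "__main__":
    for value in (1.0, -2.0, 0.5):
        # Round trip through the integer encoding preserves the value.
        assert int_bits_to_float(float_bits_to_int(value)) == value
    assert uint_to_int(int_to_uint(-1)) == -1
```

Since both conversions are pure bit reinterpretations, a packed slot only needs one declared type, with the conversion applied when packing and undone when unpacking.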
>
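The headroom arithmetic in that last bullet works out as follows. This is only a restatement of the numbers quoted above (17 slots, 64 advertised components, worst-case waste of 3); the function names are invented for the example.

```python
COMPONENTS_PER_SLOT = 4


def packing_headroom(linker_slots, client_components):
    """Components available to the linker beyond what the client may use."""
    return linker_slots * COMPONENTS_PER_SLOT - client_components


def packing_always_succeeds(linker_slots, client_components, max_waste):
    """True if the spare components cover the worst-case packing waste."""
    return packing_headroom(linker_slots, client_components) >= max_waste


if __name__ == "__main__":
    print(packing_headroom(17, 64))            # 4
    print(packing_always_succeeds(17, 64, 3))  # True
    print(packing_always_succeeds(16, 64, 3))  # False
```

So one extra slot beyond the advertised 16 is enough to absorb the worst case under these figures.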
> Note: I also have a new piglit test that exercises this code; I'll be
> publishing that to the Piglit list ASAP.
>
> [PATCH 01/10] glsl/lower_clip_distance: Update symbol table.
> [PATCH 02/10] glsl/linker: Always invalidate shader ins/outs, even in corner cases.
> [PATCH 03/10] glsl/linker: Make separate ir_variable field to mean "unmatched".
> [PATCH 04/10] glsl: Create a field to store fractional varying locations.
> [PATCH 05/10] glsl/linker: Defer recording transform feedback locations.
> [PATCH 06/10] glsl/linker: Subdivide the first phase of varying assignment.
> [PATCH 07/10] glsl/linker: Sort varyings by packing class, then vector size.
> [PATCH 08/10] glsl: Add a lowering pass for packing varyings.
> [PATCH 09/10] glsl/linker: Pack within compound varyings.
> [PATCH 10/10] glsl/linker: Pack between varyings.