[Mesa-dev] [PATCH 0/2] Fix gl_ClipVertex support on pre-Gen6 i965

Wed Jun 5 08:53:56 PDT 2013

On 5 June 2013 15:50, Chris Forbes <chrisf at ijw.co.nz> wrote:

> Adds proper support for gl_ClipVertex when clipping against user
> clip planes.
>
> Fixes broken rendering in Source games on at least ILK [Water in
> CS:S video stress test is an easy example].
>
> Also all clip-vertex piglits pass now.
>
> This is a bit of a stopgap thing since I really want to add GLSL-1.30
> support in the clip shader, but this fixes things we're already
> claiming to support so it's a step forward.
>
> -- Chris
>

Thanks for your hard work on this.

BTW, if you're fishing for other improvements to make on Gen5, you might
consider trying to streamline its VUE map.  Currently on Gen5 it's:

slot 0: PSIZ and flags
slot 1: NDC
slot 2: POS
slot 3: clip distance 0 (unused slot)
slot 4: clip distance 1 (unused slot)
slot 5: unused slot
slot 6: POS (duplicate of slot 2)
slots 7+: all other varyings

That's four slots of wasted space, and we have decent evidence from
previous experiments* that wasting VUE space carries a modest performance
cost.  I think it could probably be safely shrunk down to match Gen4's
arrangement, which is:

slot 0: PSIZ and flags
slot 1: NDC
slot 2: POS
slots 3+: all other varyings

I think the reason for the wasted space is that there was some confusion on
the part of the folks who wrote the initial Gen5 support in the driver:
they thought that certain VUE locations had to be reserved for clip
distances.  But that isn't the case until Gen6 (and even on Gen6+, the
locations only need to be reserved if user clipping is actually enabled).
Note that this will require changing the SF thread's URB entry read offset
to 1 (see brw_sf_compute_urb_entry_read_offset()).
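
To make the slot arithmetic concrete, here is a minimal sketch that models the
two header layouts listed above as plain arrays.  All names in it are
hypothetical and are not the driver's identifiers.  It also assumes that the
SF read offset is counted in pairs of VUE slots and that the SF thread starts
reading at the last POS slot; check both assumptions against the hardware
documentation before relying on them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot kinds for illustration; not Mesa's identifiers. */
enum vue_slot_kind {
    SLOT_PSIZ_FLAGS,
    SLOT_NDC,
    SLOT_POS,
    SLOT_UNUSED,        /* reserved but never written with useful data */
    SLOT_POS_DUPLICATE  /* second copy of POS */
};

#define GEN5_HEADER_SLOTS 7
#define GEN4_HEADER_SLOTS 3

/* Current Gen5 header, as listed in the mail (slots 0..6). */
static const enum vue_slot_kind gen5_header[GEN5_HEADER_SLOTS] = {
    SLOT_PSIZ_FLAGS, SLOT_NDC, SLOT_POS,
    SLOT_UNUSED, SLOT_UNUSED, SLOT_UNUSED,
    SLOT_POS_DUPLICATE
};

/* Gen4 header, as listed in the mail (slots 0..2). */
static const enum vue_slot_kind gen4_header[GEN4_HEADER_SLOTS] = {
    SLOT_PSIZ_FLAGS, SLOT_NDC, SLOT_POS
};

/* Count header slots that carry no new data (unused or duplicated). */
size_t count_wasted_slots(const enum vue_slot_kind *slots, size_t n)
{
    size_t wasted = 0;
    for (size_t i = 0; i < n; i++) {
        if (slots[i] == SLOT_UNUSED || slots[i] == SLOT_POS_DUPLICATE)
            wasted++;
    }
    return wasted;
}

/* Index of the last slot holding a position (POS or its duplicate). */
size_t last_pos_slot(const enum vue_slot_kind *slots, size_t n)
{
    size_t last = 0;
    for (size_t i = 0; i < n; i++) {
        if (slots[i] == SLOT_POS || slots[i] == SLOT_POS_DUPLICATE)
            last = i;
    }
    return last;
}

/* ASSUMPTION: the read offset is expressed in pairs of slots, so the
 * first slot read must have an even index. */
size_t sf_read_offset_for_slot(size_t first_slot_read)
{
    assert(first_slot_read % 2 == 0);
    return first_slot_read / 2;
}

size_t gen5_wasted(void) { return count_wasted_slots(gen5_header, GEN5_HEADER_SLOTS); }
size_t gen4_wasted(void) { return count_wasted_slots(gen4_header, GEN4_HEADER_SLOTS); }
size_t gen5_read_offset(void) { return sf_read_offset_for_slot(last_pos_slot(gen5_header, GEN5_HEADER_SLOTS)); }
size_t gen4_read_offset(void) { return sf_read_offset_for_slot(last_pos_slot(gen4_header, GEN4_HEADER_SLOTS)); }
```

Under those assumptions, adopting the Gen4 arrangement on Gen5 moves the first
slot the SF reads from slot 6 to slot 2, which is where the read offset of 1
comes from.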

An additional advantage of tackling this is that it probably will make it
easier to implement gl_ClipDistance for Gen4-5, since you won't have to
worry about differences between their VUE layouts.

(*The evidence that wasting VUE space carries a modest performance cost is
that when we implemented varying packing, we observed improved performance
even in cases where the shaders required extra instructions to do the
packing/unpacking.  Our best explanation of this was that by packing
varyings, we were able to make the VUE map smaller, which meant that the
GPU had to copy around less data per vertex.)