<div dir="ltr">On 5 June 2013 15:50, Chris Forbes <span dir="ltr"><<a href="mailto:chrisf@ijw.co.nz" target="_blank">chrisf@ijw.co.nz</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Adds proper support for gl_ClipVertex when clipping against user<br>
clip planes.<br>
<br>
Fixes broken rendering in Source games on at least ILK [Water in<br>
CS:S video stress test is an easy example].<br>
<br>
Also all clip-vertex piglits pass now.<br>
<br>
This is a bit of a stopgap thing since I really want to add GLSL-1.30<br>
support in the clip shader, but this fixes things we're already<br>
claiming to support so it's a step forward.<br>
<br>
-- Chris<br></blockquote><div><br></div><div>Thanks for your hard work on this.<br><br>BTW, if you're fishing for other improvements to make on Gen5, you might consider trying to streamline its VUE map. Currently on Gen5 it's:<br>
<br></div><div>slot 0: PSIZ and flags<br></div><div>slot 1: NDC<br></div><div>slot 2: POS<br></div><div>slot 3: clip distance 0 (unused slot)<br></div><div>slot 4: clip distance 1 (unused slot)<br></div><div>slot 5: unused slot<br>
</div><div>slot 6: POS (duplicate of slot 2)<br></div><div>slots 7+: all other varyings<br></div><div><br></div><div>That's four slots of wasted space, and we have decent evidence from previous experiments* that wasting VUE space carries a modest performance cost. I think it could probably be safely shrunk down to match Gen4's arrangement, which is:<br>
<br></div><div>slot 0: PSIZ and flags<br></div><div>slot 1: NDC<br></div><div>slot 2: POS<br></div><div>slots 3+: all other varyings<br></div><div><br></div><div>I think the reason for the wasted space is that there was some confusion on the part of the folks who wrote the initial Gen5 support in the driver: they thought that certain VUE locations had to be reserved for clip distances. But that isn't the case until Gen6+ (and even in Gen6+, the locations only need to be reserved if user clipping is actually enabled). Note that shrinking the Gen5 VUE map this way will require changing the SF thread's URB entry read offset to 1 (see brw_sf_compute_urb_entry_read_offset()).<br>
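<br>
To make the slot counting concrete, here is a rough sketch that models the two header layouts listed above and counts the wasted slots. It is not Mesa code: the enum, the arrays and both helper functions are hypothetical names made up for illustration. It also assumes that the SF read offset is counted in pairs of slots and points at the pair holding the last copy of POS, which is my reading of why the offset becomes 1 once the map matches Gen4.<br>
<pre>
```c
/* Hypothetical model of the VUE header layouts listed above; not Mesa code. */
enum vue_slot {
   SLOT_PSIZ_FLAGS,
   SLOT_NDC,
   SLOT_POS,
   SLOT_UNUSED,   /* includes the reserved-but-unused clip distance slots */
   SLOT_POS_DUP   /* second copy of POS */
};

/* Gen5 header as it is today: slots 0-6 precede the varyings. */
static const enum vue_slot gen5_header[] = {
   SLOT_PSIZ_FLAGS, SLOT_NDC, SLOT_POS,
   SLOT_UNUSED, SLOT_UNUSED, SLOT_UNUSED, SLOT_POS_DUP
};

/* Gen4 header: slots 0-2 precede the varyings. */
static const enum vue_slot gen4_header[] = {
   SLOT_PSIZ_FLAGS, SLOT_NDC, SLOT_POS
};

/* Count slots that carry nothing new: unused ones and duplicates. */
static unsigned count_wasted_slots(const enum vue_slot *slots, unsigned n)
{
   unsigned wasted = 0;
   for (unsigned i = 0; i != n; i++) {
      if (slots[i] == SLOT_UNUSED || slots[i] == SLOT_POS_DUP)
         wasted++;
   }
   return wasted;
}

/* Assumption: the SF read offset is counted in pairs of slots and points
 * at the pair that holds the last copy of POS in the header. */
static unsigned model_sf_read_offset(const enum vue_slot *slots, unsigned n)
{
   unsigned last_pos = 0;
   for (unsigned i = 0; i != n; i++) {
      if (slots[i] == SLOT_POS || slots[i] == SLOT_POS_DUP)
         last_pos = i;
   }
   return last_pos / 2;
}
```
</pre>
With these arrays, count_wasted_slots() gives 4 for the Gen5 header and 0 for the Gen4 one, and the modelled read offset is 1 for the Gen4-style header. Treat the offset rule as an assumption to check against the hardware docs, not as a description of what the driver does.<br>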
<br></div><div>An additional advantage of tackling this is that it probably will make it easier to implement gl_ClipDistance for Gen4-5, since you won't have to worry about differences between their VUE layouts.<br><br>
</div><div>(*The evidence that wasting VUE space carries a modest performance cost is that when we implemented varying packing, we observed improved performance even in cases where the shaders required extra instructions to do the packing/unpacking. Our best explanation was that packing varyings made the VUE map smaller, so the GPU had to copy around less data per vertex.)<br>
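<br>
For a rough sense of scale, the sketch below estimates how much per-vertex data the four extra slots add. It assumes each VUE slot is one vec4 of 32-bit floats (16 bytes) and ignores any rounding or alignment the hardware applies to URB entry sizes. Both helpers are hypothetical, not driver functions.<br>
<pre>
```c
/* Assumption: each VUE slot is one vec4 of 32-bit floats, i.e. 16 bytes. */
enum { VUE_SLOT_BYTES = 16 };

/* Hypothetical helper: raw VUE size for a header plus some varying slots. */
static unsigned vue_bytes(unsigned header_slots, unsigned varying_slots)
{
   return (header_slots + varying_slots) * VUE_SLOT_BYTES;
}

/* Bytes saved per vertex when the header shrinks and the varyings stay
 * the same. Expects old_header to be at least new_header. */
static unsigned vue_bytes_saved(unsigned old_header, unsigned new_header,
                                unsigned varying_slots)
{
   return vue_bytes(old_header, varying_slots) -
          vue_bytes(new_header, varying_slots);
}
```
</pre>
Under those assumptions, going from a 7-slot header to a 3-slot header saves 64 bytes per vertex regardless of how many varyings follow, e.g. vue_bytes_saved(7, 3, 4) is 64.<br>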
</div></div></div></div>