[Mesa-dev] [PATCH] i965: Shrink Gen5 VUE map layout to be the same as Gen4.

Kenneth Graunke kenneth at whitecape.org
Fri Jun 7 12:06:00 PDT 2013


On 06/07/2013 11:30 AM, Paul Berry wrote:
> On 7 June 2013 03:17, Chris Forbes <chrisf at ijw.co.nz
> <mailto:chrisf at ijw.co.nz>> wrote:
>
>     The PRM suggests a larger layout, mostly to support having
>     gl_ClipDistance[] somewhere predictable for the fixed-function clipper
>     -- but it didn't actually arrive in Gen5.
>
>     Just use the same layout for both Gen4 and Gen5.
>
>     No Piglit regressions.
>
>     Improves performance in CS:S Video Stress Test by ~3%.

Fantastic!

>     Signed-off-by: Chris Forbes <chrisf at ijw.co.nz <mailto:chrisf at ijw.co.nz>>
>     ---
>       src/mesa/drivers/dri/i965/brw_sf_state.c |  5 +----
>       src/mesa/drivers/dri/i965/brw_vs.c       | 23 +++--------------------
>       2 files changed, 4 insertions(+), 24 deletions(-)
>
>     diff --git a/src/mesa/drivers/dri/i965/brw_sf_state.c
>     b/src/mesa/drivers/dri/i965/brw_sf_state.c
>     index 7c29ba2..e9b7e66 100644
>     --- a/src/mesa/drivers/dri/i965/brw_sf_state.c
>     +++ b/src/mesa/drivers/dri/i965/brw_sf_state.c
>     @@ -131,10 +131,7 @@ const struct brw_tracked_state brw_sf_vp = {
>       int
>       brw_sf_compute_urb_entry_read_offset(struct intel_context *intel)
>       {
>     -   if (intel->gen == 5)
>     -      return 3;
>     -   else
>     -      return 1;
>     +   return 1;
>       }
>
>
> How about just turning this into #define BRW_SF_URB_ENTRY_READ_OFFSET 1
> in a header somewhere?  It seems silly to have a function whose only job
> is to return a constant.

In the Gen6+ code, we just have:
    int urb_entry_read_offset = 1;

in both halves (which is kind of lame since you need to synchronize 
them), but...that's what's there.

I'd be okay with either.  I agree that the function should go away, 
either in this patch or a quick follow-up.

Reviewed-by: Kenneth Graunke <kenneth at whitecape.org>

>
>
>       static void upload_sf_unit( struct brw_context *brw )
>     diff --git a/src/mesa/drivers/dri/i965/brw_vs.c
>     b/src/mesa/drivers/dri/i965/brw_vs.c
>     index 720325d..d173d2e 100644
>     --- a/src/mesa/drivers/dri/i965/brw_vs.c
>     +++ b/src/mesa/drivers/dri/i965/brw_vs.c
>     @@ -85,34 +85,17 @@ brw_compute_vue_map(struct brw_context *brw,
>     struct brw_vue_map *vue_map,
>           */
>          switch (intel->gen) {
>          case 4:
>     +   case 5:
>             /* There are 8 dwords in VUE header pre-Ironlake:
>              * dword 0-3 is indices, point width, clip flags.
>              * dword 4-7 is ndc position
>              * dword 8-11 is the first vertex data.
>     -       */
>     -      assign_vue_slot(vue_map, VARYING_SLOT_PSIZ);
>     -      assign_vue_slot(vue_map, BRW_VARYING_SLOT_NDC);
>     -      assign_vue_slot(vue_map, VARYING_SLOT_POS);
>     -      break;
>     -   case 5:
>     -      /* There are 20 DWs (D0-D19) in VUE header on Ironlake:
>     -       * dword 0-3 of the header is indices, point width, clip flags.
>     -       * dword 4-7 is the ndc position
>     -       * dword 8-11 of the vertex header is the 4D space position
>     -       * dword 12-19 of the vertex header is the user clip distance.
>     -       * dword 20-23 is a pad so that the vertex element data is
>     aligned
>     -       * dword 24-27 is the first vertex data we fill.
>              *
>     -       * Note: future pipeline stages expect 4D space position to be
>     -       * contiguous with the other varyings, so we make dword 24-27 a
>     -       * duplicate copy of the 4D space position.
>     +       * On Ironlake the VUE header is nominally 20 dwords, but the
>     hardware
>     +       * will accept the same header layout as Gen4 [and should be
>     a bit faster]
>              */
>             assign_vue_slot(vue_map, VARYING_SLOT_PSIZ);
>             assign_vue_slot(vue_map, BRW_VARYING_SLOT_NDC);
>     -      assign_vue_slot(vue_map, BRW_VARYING_SLOT_POS_DUPLICATE);
>
>
> This was the last use of BRW_VARYING_SLOT_POS_DUPLICATE.  We ought to be
> able to remove that from the enum now (and from the switch statement in
> vec4_visitor::emit_urb_slot()).
>
> With those changes, this patch is:
>
> Reviewed-by: Paul Berry <stereotype441 at gmail.com
> <mailto:stereotype441 at gmail.com>>
>
>     -      assign_vue_slot(vue_map, VARYING_SLOT_CLIP_DIST0);
>     -      assign_vue_slot(vue_map, VARYING_SLOT_CLIP_DIST1);
>     -      assign_vue_slot(vue_map, BRW_VARYING_SLOT_PAD);
>             assign_vue_slot(vue_map, VARYING_SLOT_POS);
>             break;
>          case 6:
>     --
>     1.8.3
>
>     _______________________________________________
>     mesa-dev mailing list
>     mesa-dev at lists.freedesktop.org <mailto:mesa-dev at lists.freedesktop.org>
>     http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
>
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>



More information about the mesa-dev mailing list