[Mesa-dev] [PATCH 07/10] i965: Initial stab at GS URB space allocation.

Paul Berry stereotype441 at gmail.com
Mon Dec 5 09:40:50 PST 2011


From: Kenneth Graunke <kenneth at whitecape.org>

The 50/50 split between the VS and the GS is just an attempt to get
things working.  We likely want to tune this, and probably want to avoid
allocating the GS any space when it is not in use.

For now, this is good enough.

Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
---
 src/mesa/drivers/dri/i965/gen6_urb.c |   49 +++++++++++++++++++++++++--------
 1 files changed, 37 insertions(+), 12 deletions(-)
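
For readers who want to check the arithmetic outside the driver, here is a
minimal standalone sketch of the split this patch performs. It is not the
driver code: split_urb(), struct urb_split, round_down_to() and min_int() are
hypothetical names for illustration, and the per-stage maximums are passed in
as parameters because the real limits live in brw->urb and are not shown in
this patch. Entry sizes are in 128-byte units, as in the diff below.

```c
#include <assert.h>

/* Hypothetical stand-in for the driver's ROUND_DOWN_TO macro. */
static int round_down_to(int value, int multiple)
{
   return value - (value % multiple);
}

static int min_int(int a, int b)
{
   return a < b ? a : b;
}

/* Hypothetical result type; the driver stores these in brw->urb. */
struct urb_split {
   int nr_vs_entries;
   int nr_gs_entries;
};

/*
 * urb_size_kb: total URB size in kB (32 for GT1, 64 for GT2 per the patch).
 * vs_size, gs_size: entry sizes in 128-byte units; the patch asserts < 5.
 * max_vs, max_gs: hardware limits, supplied by the caller (assumed values).
 */
static struct urb_split split_urb(int urb_size_kb, int vs_size, int gs_size,
                                  int max_vs, int max_gs)
{
   struct urb_split s;
   int half_bytes = (urb_size_kb * 1024) / 2;

   assert(vs_size >= 1 && vs_size < 5);
   assert(gs_size >= 1 && gs_size < 5);

   /* How many entries fit in each half, clamped to the hardware limit. */
   s.nr_vs_entries = min_int(half_bytes / (vs_size * 128), max_vs);
   s.nr_gs_entries = min_int(half_bytes / (gs_size * 128), max_gs);

   /* Both counts must be a multiple of 4. */
   s.nr_vs_entries = round_down_to(s.nr_vs_entries, 4);
   s.nr_gs_entries = round_down_to(s.nr_gs_entries, 4);
   return s;
}
```

With a 32kB URB and the largest entry size of 4, each half holds
16384 / 512 = 32 entries, which still satisfies the patch's assertion that
the VS gets at least 24. With a 64kB URB and an entry size of 1, each half
holds 256 entries before clamping.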

diff --git a/src/mesa/drivers/dri/i965/gen6_urb.c b/src/mesa/drivers/dri/i965/gen6_urb.c
index d045bf2..a1d0d74 100644
--- a/src/mesa/drivers/dri/i965/gen6_urb.c
+++ b/src/mesa/drivers/dri/i965/gen6_urb.c
@@ -31,35 +31,60 @@
 #include "brw_state.h"
 #include "brw_defines.h"
 
+/**
+ * The following diagram shows how we partition the URB on Sandybridge:
+ *
+ *           URB size / 2                   URB size / 2
+ *   _____________-______________   _____________-______________
+ *  /                            \ /                            \
+ * +-------------------------------------------------------------+
+ * | Vertex Shader Entries        | Geometry Shader Entries      |
+ * +-------------------------------------------------------------+
+ *
+ * Currently we split the URB space evenly between the VS and the GS.
+ * This is not ideal (especially when we're not using GS), but simple.
+ *
+ * Sandybridge GT1 has 32kB of URB space, while GT2 has 64kB.
+ * (See the Sandybridge PRM, Volume 2, Part 1, Section 1.4.7: 3DSTATE_URB.)
+ */
 static void
 gen6_upload_urb( struct brw_context *brw )
 {
    struct intel_context *intel = &brw->intel;
-   int nr_vs_entries;
+   int nr_vs_entries, nr_gs_entries;
+   int total_urb_size = brw->urb.size * 1024; /* in bytes */
 
    /* CACHE_NEW_VS_PROG */
    brw->urb.vs_size = MAX2(brw->vs.prog_data->urb_entry_size, 1);
 
-   /* Calculate how many VS URB entries fit in the total URB size */
-   nr_vs_entries = (brw->urb.size * 1024) / (brw->urb.vs_size * 128);
+   /* We use the same VUE layout for VS outputs and GS outputs (as it's what
+    * the SF and Clipper expect), so we can simply make the GS URB entry size
+    * the same as for the VS.  This may technically be too large in cases
+    * where we have a lot of vertex attributes and few varyings, since the VS
+    * size is determined by the larger of the two.  For now, it's safe.
+    */
+   brw->urb.gs_size = brw->urb.vs_size;
 
+   /* Calculate how many entries fit in each stage's section of the URB */
+   nr_vs_entries = (total_urb_size/2) / (brw->urb.vs_size * 128);
+   nr_gs_entries = (total_urb_size/2) / (brw->urb.gs_size * 128);
+
+   /* Then clamp to the maximum allowed by the hardware */
    if (nr_vs_entries > brw->urb.max_vs_entries)
       nr_vs_entries = brw->urb.max_vs_entries;
 
-   /* According to volume 2a, nr_vs_entries must be a multiple of 4. */
-   brw->urb.nr_vs_entries = ROUND_DOWN_TO(nr_vs_entries, 4);
+   if (nr_gs_entries > brw->urb.max_gs_entries)
+      nr_gs_entries = brw->urb.max_gs_entries;
 
-   /* Since we currently don't support Geometry Shaders, we always put the
-    * GS unit in passthrough mode and don't allocate it any URB space.
-    */
-   brw->urb.nr_gs_entries = 0;
-   brw->urb.gs_size = 1; /* Incorrect, but with 0 GS entries it doesn't matter. */
+   /* Finally, both must be a multiple of 4 (see 3DSTATE_URB in the PRM). */
+   brw->urb.nr_vs_entries = ROUND_DOWN_TO(nr_vs_entries, 4);
+   brw->urb.nr_gs_entries = ROUND_DOWN_TO(nr_gs_entries, 4);
 
    assert(brw->urb.nr_vs_entries >= 24);
    assert(brw->urb.nr_vs_entries % 4 == 0);
    assert(brw->urb.nr_gs_entries % 4 == 0);
-   /* GS requirement */
-   assert(!brw->gs.prog_active || brw->urb.vs_size < 5);
+   assert(brw->urb.vs_size < 5);
+   assert(brw->urb.gs_size < 5);
 
    BEGIN_BATCH(3);
    OUT_BATCH(_3DSTATE_URB << 16 | (3 - 2));
-- 
1.7.6.4


