[Mesa-dev] [PATCH 1/2] i965: correctly program MEDIA_VFE_STATE for compute shading

Rogovin, Kevin kevin.rogovin at intel.com
Tue Dec 12 10:59:39 UTC 2017


Just a comment: in truth, MEDIA_VFE_STATE -was- programmed correctly without this patch. It turns out that the PerThreadScratchSpace field occupies the first (lowest) bits of the dword holding the scratch base pointer; the HW uses those first bits to stash state, and the GENX pack code knows this and accounts for it.
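To illustrate why both forms end up programming the same value, here is a minimal sketch. The bit positions (base pointer in bits 31:10, PerThreadScratchSpace in bits 3:0) and the helper names are assumptions made for illustration, not the actual genxml pack code:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, for illustration only: bits 31:10 hold the 1KB-aligned
 * scratch base address and bits 3:0 hold PerThreadScratchSpace.
 */
#define SCRATCH_BASE_MASK 0xfffffc00u
#define PER_THREAD_MASK   0x0000000fu

/* Hypothetical model of the old code: the encoded size is passed as the
 * buffer offset, so it is simply added to the buffer address.
 */
static uint32_t pack_as_offset(uint32_t scratch_addr, uint32_t encoded_size)
{
   return scratch_addr + encoded_size;
}

/* Hypothetical model of the new code: the address and the encoded size
 * are set as separate fields and OR'ed into the same dword.
 */
static uint32_t pack_as_field(uint32_t scratch_addr, uint32_t encoded_size)
{
   return (scratch_addr & SCRATCH_BASE_MASK) |
          (encoded_size & PER_THREAD_MASK);
}
```

As long as the scratch address is 1KB aligned and the encoded size fits in the low bits, adding and OR'ing give the same dword, which is why the old code worked.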

In all honesty, this patch is not necessary to fix car-chase; it is just a readability patch.

My apologies for jumping the gun and not checking whether the bits for PerThreadScratchSpace were among the first bits of the BO for scratch space.

Sighs.

Even though it is just a readability patch, I think it should land, since it makes the code easier to read.
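For reference, the values the patch computes can be sketched standalone as below. The enum, the function names and the first_set_bit() helper (standing in for ffs()) are hypothetical and exist only for this example; the arithmetic follows the comments in the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical generation selector, only for this sketch. */
enum hw_gen { GEN_PRE_HSW, GEN_HSW, GEN_8_PLUS };

/* Portable stand-in for ffs(): 1-based index of the lowest set bit,
 * or 0 if no bit is set.
 */
static int first_set_bit(uint32_t v)
{
   int pos = 1;

   if (v == 0)
      return 0;
   while (!(v & 1u)) {
      v >>= 1;
      pos++;
   }
   return pos;
}

/* Encode a per-thread scratch size in bytes into the HW field value. */
static uint32_t encode_per_thread_scratch(enum hw_gen gen, uint32_t bytes)
{
   switch (gen) {
   case GEN_8_PLUS:
      return first_set_bit(bytes) - 11;   /* 0 = 1k, ..., 11 = 2M */
   case GEN_HSW:
      return first_set_bit(bytes) - 12;   /* 0 = 2k, ..., 10 = 2M */
   default:
      return bytes / 1024 - 1;            /* 0 = 1kB, ..., 11 = 12kB */
   }
}

/* The HW field is biased by one: a value of N means N + 1 threads. */
static uint32_t max_threads_field(uint32_t max_cs_threads,
                                  uint32_t subslice_total)
{
   uint32_t subslices = subslice_total > 1 ? subslice_total : 1;

   return max_cs_threads * subslices - 1;
}
```

The power-of-two encodings assume the scratch size has already been rounded to a power of two within the supported range; the sketch does not validate its input.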

 -Kevin

-----Original Message-----
From: Rogovin, Kevin 
Sent: Tuesday, December 12, 2017 12:05 PM
To: mesa-dev at lists.freedesktop.org
Cc: Rogovin, Kevin <kevin.rogovin at intel.com>
Subject: [PATCH 1/2] i965: correctly program MEDIA_VFE_STATE for compute shading

From: Kevin Rogovin <kevin.rogovin at intel.com>

Signed-off-by: Kevin Rogovin <kevin.rogovin at intel.com>
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 04a492539a..50ac5bc59f 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -4183,28 +4183,35 @@ genX(upload_cs_state)(struct brw_context *brw)
 
    brw_batch_emit(brw, GENX(MEDIA_VFE_STATE), vfe) {
       if (prog_data->total_scratch) {
-         uint32_t bo_offset;
+         uint32_t per_thread_scratch_value;
 
          if (GEN_GEN >= 8) {
             /* Broadwell's Per Thread Scratch Space is in the range [0, 11]
              * where 0 = 1k, 1 = 2k, 2 = 4k, ..., 11 = 2M.
              */
-            bo_offset = ffs(stage_state->per_thread_scratch) - 11;
+            per_thread_scratch_value = ffs(stage_state->per_thread_scratch) - 11;
          } else if (GEN_IS_HASWELL) {
             /* Haswell's Per Thread Scratch Space is in the range [0, 10]
              * where 0 = 2k, 1 = 4k, 2 = 8k, ..., 10 = 2M.
              */
-            bo_offset = ffs(stage_state->per_thread_scratch) - 12;
+            per_thread_scratch_value = ffs(stage_state->per_thread_scratch) - 12;
          } else {
             /* Earlier platforms use the range [0, 11] to mean [1kB, 12kB]
              * where 0 = 1kB, 1 = 2kB, 2 = 3kB, ..., 11 = 12kB.
              */
-            bo_offset = stage_state->per_thread_scratch / 1024 - 1;
+            per_thread_scratch_value = stage_state->per_thread_scratch / 1024 - 1;
          }
-         vfe.ScratchSpaceBasePointer =
-            rw_bo(stage_state->scratch_bo, bo_offset);
+         vfe.ScratchSpaceBasePointer = rw_bo(stage_state->scratch_bo, 0);
+         vfe.PerThreadScratchSpace = per_thread_scratch_value;
       }
 
+      /* If brw->screen->subslice_total is greater than one, then
+       * devinfo->max_cs_threads stores number of threads per sub-slice;
+       * thus we need to multiply that number by subslices to get
+       * the actual maximum number of threads; the -1 is because the HW
+       * has a bias of 1 (would not make sense to say the maximum number
+       * of threads is 0).
+       */
       const uint32_t subslices = MAX2(brw->screen->subslice_total, 1);
       vfe.MaximumNumberofThreads = devinfo->max_cs_threads * subslices - 1;
       vfe.NumberofURBEntries = GEN_GEN >= 8 ? 2 : 0;
-- 
2.15.0


