[Mesa-dev] [PATCH v4 00/15] TGSI: improved live range tracking, also including arrays

Benedikt Schemmer ben at besd.de
Wed Jun 6 19:58:19 UTC 2018


Hi Gert, Hi Nicolai,

I did play around with this quite a lot (mostly with the previous version) and found it to be
stable (doesn't crash deus ex on start up from cold shader cache like NIR) or leak memory
like llvm 7 (at least used to leak ~100MB with piglit shaders,
haven't given it another try in the last two weeks).

The arrays-of-arrays piglit shaders I sent you still crash though Gert.

On radeonsi it is mostly (or probably entirely) useless ;)
But it does have an interesting property when I tweak it (see below), results of today with v4 and llvm 6:

 BIGGEST IMPROVEMENTS - Max Waves
 Before After     Delta Percentage
      3     8         5  166.67 %   ../shader-db/shaders/piglit/988eefdf4bb05e5a3ecc4a5c0b4c4ef54047a5a9_4.shader_test [1]
      5     7         2   40.00 %   ../shader-db/shaders/ruiner/0967c5fce7fc456496b1cfa25fbb1d1c4dcf9bed_2958.shader_test [0]
      5     7         2   40.00 %   ../shader-db/shaders/cat/1847.shader_test [0]
      6     7         1   16.67 %   ../shader-db/shaders/serioussam2017/2160.shader_test [0]
      6     7         1   16.67 %   ../shader-db/shaders/serioussam2017/f1e9a7f8bb8be17b8231116cf68bc10677769709_2149.shader_test [0]
      1     2         1  100.00 %   ../shader-db/shaders/deusex_mankind/5b1006a267c95f5c245266d82699d53cad704aab_4008.shader_test [0]
      1     2         1  100.00 %   ../shader-db/shaders/deusex_mankind/14060e8c592408bb9b6059b27a72eeb1e66c7480_8288.shader_test [0]
      1     2         1  100.00 %   ../shader-db/shaders/deusex_mankind/cb6356f4da76e5a82d6036e811c680ae00249c68_994.shader_test [0]

PERCENTAGE DELTAS    Shaders     SGPRs     VGPRs SpillSGPR SpillVGPR  PrivVGPR   Scratch  CodeSize  MaxWaves     Waits
All affected             630   -6.28 %   -1.19 %   -0.07 %     .         .         .        0.02 %    0.42 %     .

No regressions with max waves otherwise (but of course I don't own every game and
even from those I don't have a shaderdump of every one of them).

But still, from the values I found there may be an opportunity for an 'intermediate range'
optimization somewhere.

Cheers,
Benedikt


---

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -5533,21 +5533,22 @@ glsl_to_tgsi_visitor::merge_registers(void)
    }


-   if (get_temp_registers_required_live_ranges(reg_live_ranges, &this->instructions,
+//   if (get_temp_registers_required_live_ranges(reg_live_ranges, &this->instructions,
+     get_temp_registers_required_live_ranges(reg_live_ranges, &this->instructions,
                                                this->next_temp, reg_live_ranges,
-                                               this->next_array, arr_live_ranges)) {
+                                               this->next_array, arr_live_ranges);//) {
       struct rename_reg_pair *renames =
             rzalloc_array(reg_live_ranges, struct rename_reg_pair, this->next_temp);
       get_temp_registers_remapping(reg_live_ranges, this->next_temp,
                                    reg_live_ranges, renames);
       rename_temp_registers(renames);

-      this->next_array =  merge_arrays(this->next_array, this->array_sizes,
-                                       &this->instructions, arr_live_ranges);
+//      this->next_array =  merge_arrays(this->next_array, this->array_sizes,
+//                                       &this->instructions, arr_live_ranges);

       if (arr_live_ranges)
          delete[] arr_live_ranges;
-   }
+//   }
    ralloc_free(reg_live_ranges);
 }

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
--- a/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
@@ -1330,6 +1330,24 @@ static int register_merge_record_compare (const void *a, const void *b) {
 }
 #endif

+/* these magic numbers are determined by looking at the results of shader-db */
+bool should_merge (int distance)
+{
+   switch (distance) {
+       case 12 ... 126: //lower bound interfering with llvm?, upper bound here
+       case 244 ... 768: // and lower bound here determined by one regressing tombraider shader
+// nothing to see here
+       case 2432 ... 2496: // purely empiricily determined
+       case 2497 ... 2623: // Deus Ex
+       case 2624 ... 2688:
+//     case 2689 ... 2943: // causes regressions in ubershaders
+       case 2944 ... 3072: // above isnt used
+          return true;
+       default:
+          return false;
+   }
+}
+
 /* This functions evaluates the register merges by using a binary
  * search to find suitable merge candidates. */
 void get_temp_registers_remapping(void *mem_ctx, int ntemps,
@@ -1361,10 +1379,18 @@ void get_temp_registers_remapping(void *mem_ctx, int ntemps,
    register_merge_record *first_erase = reg_access_end;
    register_merge_record *search_start = trgt + 1;

+   int rename_distance;
+
    while (trgt != reg_access_end) {
       register_merge_record *src = find_next_rename(search_start, reg_access_end,
                                             trgt->end);
-      if (src != reg_access_end) {
+      //if (src != reg_access_end) {
+
+      rename_distance = src->begin - search_start->begin;
+
+      if ((src != reg_access_end) &&
+          (should_merge(rename_distance))) {
+
          result[src->reg].new_reg = trgt->reg;
          result[src->reg].valid = true;
          trgt->end = src->end;


More information about the mesa-dev mailing list