[Mesa-stable] [Mesa-dev] [PATCH 2/9] i965/fs: Register allocator shoudn't use grf127 for sends dest

Chema Casanova jmcasanova at igalia.com
Mon Jul 9 14:08:43 UTC 2018


El 09/07/18 a las 06:41, Jason Ekstrand escribió:
> This is a very clever solution to the problem.  I like it. :-)

Thanks to Eric that pointed me to the right direction.

I'm Cc: 1 & 2 to mesa stable.

> Reviewed-by: Jason Ekstrand <jason at jlekstrand.net
> <mailto:jason at jlekstrand.net>>

> I think that's all of them.  I've pushed the XML bump so you should be
> able to land at will.

Thanks for the review, I'm landing them today after the 24h period since
they were submitted.

Chema

> On Sun, Jul 8, 2018 at 5:29 PM Jose Maria Casanova Crespo
> <jmcasanova at igalia.com <mailto:jmcasanova at igalia.com>> wrote:
> 
>     Since Gen8+ Intel PRM states that "r127 must not be used for return
>     address when there is a src and dest overlap in send instruction."
> 
>     This patch implements this restriction creating new
>     grf127_send_hack_node
>     at the register allocator. This node has a fixed assignation to grf127.
> 
>     For vgrf that are used as destination of send messages we create node
>     interfereces with the grf127_send_hack_node. So the register allocator
>     will never assign to these vgrf a register that involves grf127.
> 
>     If dispatch_width > 8 we don't create these interferences to the because
>     all instructions have node interferences between sources and
>     destination.
>     That is enough to avoid the r127 restriction.
> 
>     This fixes CTS tests that raised this issue as they were executed as
>     SIMD8:
> 
>     dEQP-VK.spirv_assembly.instruction.graphics.8bit_storage.8struct_to_32struct.storage_buffer_*int_geom
> 
>     Shader-db results on Skylake:
>        total instructions in shared programs: 7686798 -> 7686797 (<.01%)
>        instructions in affected programs: 301 -> 300 (-0.33%)
>        helped: 1
>        HURT: 0
> 
>        total cycles in shared programs: 337092322 -> 337091919 (<.01%)
>        cycles in affected programs: 22420415 -> 22420012 (<.01%)
>        helped: 712
>        HURT: 588
> 
>     Shader-db results on Broadwell:
> 
>        total instructions in shared programs: 7658574 -> 7658625 (<.01%)
>        instructions in affected programs: 19610 -> 19661 (0.26%)
>        helped: 3
>        HURT: 4
> 
>        total cycles in shared programs: 340694553 -> 340676378 (<.01%)
>        cycles in affected programs: 24724915 -> 24706740 (-0.07%)
>        helped: 998
>        HURT: 916
> 
>        total spills in shared programs: 4300 -> 4311 (0.26%)
>        spills in affected programs: 333 -> 344 (3.30%)
>        helped: 1
>        HURT: 3
> 
>        total fills in shared programs: 5370 -> 5378 (0.15%)
>        fills in affected programs: 274 -> 282 (2.92%)
>        helped: 1
>        HURT: 3
> 
>     v2: Avoid duplicating register classes without grf127. Let's use a node
>         with a fixed assignation to grf127 and create interferences to send
>         message vgrf destinations. (Eric Anholt)
>     v3: Update reference to CTS VK_KHR_8bit_storage failing tests.
>         (Jose Maria Casanova)
>     ---
>      src/intel/compiler/brw_fs_reg_allocate.cpp | 25 ++++++++++++++++++++++
>      1 file changed, 25 insertions(+)
> 
>     diff --git a/src/intel/compiler/brw_fs_reg_allocate.cpp
>     b/src/intel/compiler/brw_fs_reg_allocate.cpp
>     index ec8e116cb38..59e047483c0 100644
>     --- a/src/intel/compiler/brw_fs_reg_allocate.cpp
>     +++ b/src/intel/compiler/brw_fs_reg_allocate.cpp
>     @@ -548,6 +548,9 @@ fs_visitor::assign_regs(bool allow_spilling,
>     bool spill_all)
>         int first_mrf_hack_node = node_count;
>         if (devinfo->gen >= 7)
>            node_count += BRW_MAX_GRF - GEN7_MRF_HACK_START;
>     +   int grf127_send_hack_node = node_count;
>     +   if (devinfo->gen >= 8 && dispatch_width == 8)
>     +      node_count ++;
>         struct ra_graph *g =
>            ra_alloc_interference_graph(compiler->fs_reg_sets[rsi].regs,
>     node_count);
> 
>     @@ -653,6 +656,28 @@ fs_visitor::assign_regs(bool allow_spilling,
>     bool spill_all)
>            }
>         }
> 
>     +   if (devinfo->gen >= 8 && dispatch_width == 8) {
>     +      /* At Intel Broadwell PRM, vol 07, section "Instruction Set
>     Reference",
>     +       * subsection "EUISA Instructions", Send Message (page 990):
>     +       *
>     +       * "r127 must not be used for return address when there is a
>     src and
>     +       * dest overlap in send instruction."
>     +       *
>     +       * We are avoiding using grf127 as part of the destination of
>     send
>     +       * messages adding a node interference to the
>     grf127_send_hack_node.
>     +       * This node has a fixed asignment to grf127.
>     +       *
>     +       * We don't apply it to SIMD16 because previous code avoids
>     any register
>     +       * overlap between sources and destination.
>     +       */
>     +      ra_set_node_reg(g, grf127_send_hack_node, 127);
>     +      foreach_block_and_inst(block, fs_inst, inst, cfg) {
>     +         if (inst->is_send_from_grf() && inst->dst.file == VGRF) {
>     +            ra_add_node_interference(g, inst->dst.nr
>     <http://dst.nr>, grf127_send_hack_node);
>     +         }
>     +      }
>     +   }
>     +
>         /* Debug of register spilling: Go spill everything. */
>         if (unlikely(spill_all)) {
>            int reg = choose_spill_reg(g);
>     -- 
>     2.17.1
> 
>     _______________________________________________
>     mesa-dev mailing list
>     mesa-dev at lists.freedesktop.org <mailto:mesa-dev at lists.freedesktop.org>
>     https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> 
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 


More information about the mesa-stable mailing list