[Mesa-dev] [PATCH v4 15/44] i965: Work around L3 state leaks during context switches.

Francisco Jerez currojerez at riseup.net
Wed Dec 9 04:25:20 PST 2015


Jordan Justen <jordan.l.justen at intel.com> writes:

> On 2015-12-08 08:43:53, Francisco Jerez wrote:
>> This is going to require some rather intrusive kernel changes to fix
>> properly, in the meantime (and forever on at least pre-v4.1 kernels)
>> we'll have to restore the hardware defaults at the end of every batch
>> in which the L3 configuration was changed to avoid interfering with
>> the DDX and GL clients that use an older non-L3-aware version of Mesa.
>> 
>> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias at igalia.com>
>> Reviewed-by: Kristian Høgsberg  <krh at bitplanet.net>
>> 
>> v4: Optimize look-up of the default configuration by assuming it's the
>>     first entry of the L3 config array in order to avoid an FPS
>>     regression in GpuTest Triangle and SynMark OglBatch2-7 on most
>>     affected platforms.
>> ---
>>  src/mesa/drivers/dri/i965/brw_state.h         |  4 +++
>>  src/mesa/drivers/dri/i965/gen7_l3_state.c     | 51 +++++++++++++++++++++++++++
>>  src/mesa/drivers/dri/i965/intel_batchbuffer.c |  7 ++++
>>  src/mesa/drivers/dri/i965/intel_batchbuffer.h |  6 +++-
>>  4 files changed, 67 insertions(+), 1 deletion(-)
>> 
>> diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
>> index 49f301a..b7c0039 100644
>> --- a/src/mesa/drivers/dri/i965/brw_state.h
>> +++ b/src/mesa/drivers/dri/i965/brw_state.h
>> @@ -380,6 +380,10 @@ void gen7_update_binding_table_from_array(struct brw_context *brw,
>>  void gen7_disable_hw_binding_tables(struct brw_context *brw);
>>  void gen7_reset_hw_bt_pool_offsets(struct brw_context *brw);
>>  
>> +/* gen7_l3_state.c */
>> +void
>> +gen7_restore_default_l3_config(struct brw_context *brw);
>> +
>>  #ifdef __cplusplus
>>  }
>>  #endif
>> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c b/src/mesa/drivers/dri/i965/gen7_l3_state.c
>> index 7956935..7fa7336 100644
>> --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
>> @@ -520,3 +520,54 @@ const struct brw_tracked_state gen7_l3_state = {
>>     },
>>     .emit = emit_l3_state
>>  };
>> +
>> +/**
>> + * Hack to restore the default L3 configuration.
>> + *
>> + * This will be called at the end of every batch in order to reset the L3
>> + * configuration to the default values for the time being until the kernel is
>> + * fixed.  Until kernel commit 6702cf16e0ba8b0129f5aa1b6609d4e9c70bc13b
>> + * (included in v4.1) we would set the MI_RESTORE_INHIBIT bit when submitting
>> + * batch buffers for the default context used by the DDX, which meant that any
>> + * context state changed by the GL would leak into the DDX, the assumption
>> + * being that the DDX would initialize any state it cares about manually.  The
>> + * DDX is however not careful enough to program an L3 configuration
>> + * explicitly, and it makes assumptions about it (URB size) which won't hold
>> + * and cause it to misrender if we let our L3 set-up to leak into the DDX.
>> + *
>> + * Since v4.1 of the Linux kernel the default context is saved and restored
>> + * normally, so it's far less likely for our L3 programming to interfere with
>> + * other contexts -- In fact restoring the default L3 configuration at the end
>> + * of the batch will be redundant most of the time.  A kind of state leak is
>> + * still possible though if the context making assumptions about L3 state is
>> + * created immediately after our context was active (e.g. without the DDX
>> + * default context being scheduled in between) because at present the DRM
>> + * doesn't fully initialize the contents of newly created contexts and instead
>> + * sets the MI_RESTORE_INHIBIT flag causing it to inherit the state from the
>> + * last active context.
>> + *
>> + * It's possible to realize such a scenario if, say, an X server (or a GL
>> + * application using an outdated non-L3-aware Mesa version) is started while
>> + * another GL application is running and happens to have modified the L3
>> + * configuration, or if no X server is running at all and a GL application
>> + * using a non-L3-aware Mesa version is started after another GL application
>> + * ran and modified the L3 configuration -- The latter situation can actually
>> + * be reproduced easily on IVB in our CI system.
>> + */
>> +void
>> +gen7_restore_default_l3_config(struct brw_context *brw)
>> +{
>> +   const struct brw_device_info *devinfo = brw->intelScreen->devinfo;
>> +   /* For efficiency assume that the first entry of the array matches the
>> +    * default configuration.
>> +    */
>> +   const struct brw_l3_config *const cfg = get_l3_configs(devinfo);
>> +   assert(cfg == get_l3_config(devinfo,
>> +                               get_default_l3_weights(devinfo, false, false)));
>> +
>> +   if (cfg != brw->l3.config && brw->can_do_pipelined_register_writes) {
>> +      setup_l3_config(brw, cfg);
>> +      update_urb_size(brw, cfg);
>> +      brw->l3.config = cfg;
>> +   }
>> +}
>> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
>> index 0363bd3..f778074 100644
>> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
>> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
>> @@ -208,6 +208,13 @@ brw_finish_batch(struct brw_context *brw)
>>     brw_emit_query_end(brw);
>>  
>>     if (brw->batch.ring == RENDER_RING) {
>> +      /* Work around L3 state leaks into contexts set MI_RESTORE_INHIBIT which
>> +       * assume that the L3 cache is configured according to the hardware
>> +       * defaults.
>
> Should we add a comment to the tables saying that the first entry
> should be the default?
>
Sure.  Added and pushed.

> Reviewed-by: Jordan Justen <jordan.l.justen at intel.com>
>
Thanks.

>> +       */
>> +      if (brw->gen >= 7)
>> +         gen7_restore_default_l3_config(brw);
>> +
>>        /* We may also need to snapshot and disable OA counters. */
>>        brw_perf_monitor_finish_batch(brw);
>>  
>> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
>> index 2b177d3..f473690 100644
>> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
>> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
>> @@ -30,8 +30,12 @@ extern "C" {
>>   *     - 5 dwords for initial mi_flush
>>   *     - 2 dwords for CC state setup
>>   *     - 5 dwords for the required pipe control at the end
>> + *   - Restoring L3 configuration: (24 dwords = 96 bytes)
>> + *     - 2*6 dwords for two PIPE_CONTROL flushes.
>> + *     - 7 dwords for L3 configuration set-up.
>> + *     - 5 dwords for L3 atomic set-up (on HSW).
>>   */
>> -#define BATCH_RESERVED 152
>> +#define BATCH_RESERVED 248
>>  
>>  struct intel_batchbuffer;
>>  
>> -- 
>> 2.5.1
>> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20151209/ea1ba2e6/attachment.sig>


More information about the mesa-dev mailing list