[Intel-gfx] [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching

Daniel Vetter daniel at ffwll.ch
Wed Apr 13 09:33:16 UTC 2016


On Tue, Apr 12, 2016 at 09:03:05PM +0100, Chris Wilson wrote:
> Two concurrent writes into the same register cacheline has the chance of
> killing the machine on Ivybridge and other gen7. This includes LRI
> emitted from the command parser.  The MI_SET_CONTEXT itself serves as
> serialising barrier and prevents the pair of register writes in the first
> packet from triggering the fault.  However, if a second switch-context
> immediately occurs then we may have two adjacent blocks of LRI to the
> same registers which may then trigger the hang. To counteract this we
> need to insert a delay after the second register write using SRM.
> 
> This is easiest to reproduce with something like
> igt/gem_ctx_switch/interruptible that triggers back-to-back context
> switches (with no operations in between them in the command stream,
> which requires the execbuf operation to be interrupted after the
> MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
> interruptible igt. No reports from the wild though, so it must be of low
> enough frequency that no one has correlated the random machine freezes
> with i915.ko
> 
> The issue was introduced with
> commit 2c550183476dfa25641309ae9a28d30feed14379 [v3.19]
> Author: Chris Wilson <chris at chris-wilson.co.uk>
> Date:   Tue Dec 16 10:02:27 2014 +0000
> 
>     drm/i915: Disable PSMI sleep messages on all rings around context switches
> 
> Testcase: igt/gem_ctx_switch/render-interruptible #ivb
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel at ffwll.ch>
> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
> Cc: stable at vger.kernel.org

Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>

> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index fe580cb9501a..e5ad7b21e356 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -539,7 +539,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>  
>  	len = 4;
>  	if (INTEL_INFO(engine->dev)->gen >= 7)
> -		len += 2 + (num_rings ? 4*num_rings + 2 : 0);
> +		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
>  
>  	ret = intel_ring_begin(req, len);
>  	if (ret)
> @@ -579,6 +579,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>  	if (INTEL_INFO(engine->dev)->gen >= 7) {
>  		if (num_rings) {
>  			struct intel_engine_cs *signaller;
> +			i915_reg_t last_reg = {}; /* keep gcc quiet */
>  
>  			intel_ring_emit(engine,
>  					MI_LOAD_REGISTER_IMM(num_rings));
> @@ -586,11 +587,19 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>  				if (signaller == engine)
>  					continue;
>  
> -				intel_ring_emit_reg(engine,
> -						    RING_PSMI_CTL(signaller->mmio_base));
> +				last_reg = RING_PSMI_CTL(signaller->mmio_base);
> +				intel_ring_emit_reg(engine, last_reg);
>  				intel_ring_emit(engine,
>  						_MASKED_BIT_DISABLE(GEN6_PSMI_SLEEP_MSG_DISABLE));
>  			}
> +
> +			/* Insert a delay before the next switch! */
> +			intel_ring_emit(engine,
> +					MI_STORE_REGISTER_MEM |
> +					MI_SRM_LRM_GLOBAL_GTT);
> +			intel_ring_emit_reg(engine, last_reg);
> +			intel_ring_emit(engine, engine->scratch.gtt_offset);
> +			intel_ring_emit(engine, MI_NOOP);
>  		}
>  		intel_ring_emit(engine, MI_ARB_ON_OFF | MI_ARB_ENABLE);
>  	}
> -- 
> 2.8.0.rc3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the Intel-gfx mailing list