[Intel-gfx] [PATCH] drm/i915/gt: Disable HiZ Raw Stall Optimization on broken gen7

Simon Rettberg simon.rettberg at rz.uni-freiburg.de
Tue Apr 27 08:33:50 UTC 2021


On Mon, 26 Apr 2021 19:05:40 +0300,
Ville Syrjälä <ville.syrjala at linux.intel.com> wrote:

> On Mon, Apr 26, 2021 at 04:11:24PM +0200, Simon Rettberg wrote:
> > When resetting CACHE_MODE registers, don't enable HiZ Raw Stall
> > Optimization on Ivybridge GT1 and Baytrail, as it causes severe
> > glitches when rendering any kind of 3D accelerated content.
> > This optimization is disabled on these platforms by default
> > according to official documentation from 01.org.
> > 
> > Fixes: ef99a60ffd9b ("drm/i915/gt: Clear CACHE_MODE prior to clearing residuals")
> > Fixes: 520d05a77b28 ("drm/i915/gt: Clear CACHE_MODE prior to clearing residuals")
> > BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/3081
> > BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/3404
> > BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/3071
> > Reviewed-By: Manuel Bentele <development at manuel-bentele.de>
> > Signed-off-by: Simon Rettberg <simon.rettberg at rz.uni-freiburg.de>
> > ---
> >  drivers/gpu/drm/i915/gt/gen7_renderclear.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/gen7_renderclear.c b/drivers/gpu/drm/i915/gt/gen7_renderclear.c
> > index de575fdb0..21f08e538 100644
> > --- a/drivers/gpu/drm/i915/gt/gen7_renderclear.c
> > +++ b/drivers/gpu/drm/i915/gt/gen7_renderclear.c
> > @@ -397,7 +397,10 @@ static void emit_batch(struct i915_vma * const vma,
> >  	gen7_emit_pipeline_invalidate(&cmds);
> >  	batch_add(&cmds, MI_LOAD_REGISTER_IMM(2));
> >  	batch_add(&cmds, i915_mmio_reg_offset(CACHE_MODE_0_GEN7));
> > -	batch_add(&cmds, 0xffff0000);
> > +	batch_add(&cmds, 0xffff0000 |
> > +			((IS_IVB_GT1(i915) || IS_VALLEYVIEW(i915)) ?
> > +			 HIZ_RAW_STALL_OPT_DISABLE :
> > +			 0));
> >  	batch_add(&cmds, i915_mmio_reg_offset(CACHE_MODE_1));
> >  	batch_add(&cmds, 0xffff0000 | PIXEL_SUBSPAN_COLLECT_OPT_DISABLE);
> >  	gen7_emit_pipeline_invalidate(&cmds);
> 
> CACHE_MODE* should be context saved. So there seems to be some kind
> of more fundamental bug in this code if it manages to clobber
> application contexts. Looking at the code it at least tries to
> switch to the kernel context before emitting the w/a batch.

We had a hunch about this while poking at the code, but we lack
expertise in i915 and DRM in general.
As we understand it, this whole code exists because some state is not
properly cleared or restored when switching between vGPUs, so on a
normal desktop system it is called only once, at boot. Assuming there
is no actual bug somewhere else in the code, could there be a similar
issue when switching between the kernel context and application
contexts? The fact that another optimization is already explicitly
disabled for CACHE_MODE_1 in this very code fragment seems to support
this theory.
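
For reference, here is a minimal sketch of what the value written by
the patch means. It assumes CACHE_MODE_0_GEN7 is a masked register,
where the upper 16 bits of the written dword select which of the lower
16 bits get updated. The bit position used for
HIZ_RAW_STALL_OPT_DISABLE and the helpers masked_write() and
cache_mode_0_value() are illustrative assumptions, not taken from the
driver:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, for illustration only; check i915_reg.h. */
#define HIZ_RAW_STALL_OPT_DISABLE (1u << 2)

/*
 * Hypothetical model of a masked register write: bits 31:16 of cmd
 * are a per-bit write enable for bits 15:0. Bits whose mask bit is
 * clear keep their old value.
 */
static uint16_t masked_write(uint16_t old, uint32_t cmd)
{
	uint16_t mask = (uint16_t)(cmd >> 16);
	uint16_t val = (uint16_t)(cmd & 0xffffu);

	return (uint16_t)((old & ~mask) | (val & mask));
}

/*
 * Hypothetical helper mirroring the expression in the patch: reset
 * all 16 bits, but keep the HiZ optimization disabled where it is
 * known to be broken (Ivybridge GT1 and Baytrail in the patch).
 */
static uint32_t cache_mode_0_value(bool hiz_opt_broken)
{
	return 0xffff0000u |
	       (hiz_opt_broken ? HIZ_RAW_STALL_OPT_DISABLE : 0u);
}
```

Under this model, writing 0xffff0000 clears all 16 bits, including the
disable bit, which turns the optimization on. OR-ing in the disable bit
keeps it off on the affected platforms while still resetting every
other bit.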

Even though this only affects hardware that is close to a decade old,
it is a rather serious issue, as it breaks anything 3D accelerated.
The bug made it into the mainline kernel with 5.10.13, and as distros
start to pick up newer kernels I expect a lot of reports to pour in.
Ubuntu 21.04 with kernel 5.11 was just released and suffers from this
issue as well.
So a stop-gap solution like this patch, or simply reverting the commit
in question, might be reasonable if the underlying issue cannot be
found.

Simon