[Intel-gfx] WAs in init_clock_gating?

Mon Jul 7 22:50:08 CEST 2014

On Tue, Jul 01, 2014 at 04:51:07PM +0000, Mateo Lozano, Oscar wrote:
> Is there any reason why the WAs are applied in *_init_clock_gating? We
> are finding that some of them are lost during reset, and also the
> default context ends up with wrong values because the render context is
> restored & saved before we get to gen8_init_clock_gating (at least with
> Execlists, I'm not sure this happens with MI_SET_CONTEXT because the
> context won't be saved until the next switch).

It's a historical accident since _very_ old hw only needed a bit of
frobbing of the display clock gating bits.

> I believe this has been brought to the mailing list a couple of times, like:
> 
> 	drm/i915: Init chv workarounds at render ring init
> 	My bsw is an unhappy camper if we delay the workaround init until init_clock_gating(). Move a bunch of it to the render ring init.
> 
> 	FIXME: need to do this for all platforms since some of the registers
> 	also get clobbered at reset. Just need to figure out which
> 	registers those actually are. This patch is based on a
> 	slightly educated guess, but verifying on actual hw would
> 	be a good idea. Also should maybe move the init_clock_gating
> 	earlier too since we set up a bunch of clock gating stuff
> 	there that might be important for a properly working GT.
> 
> 	Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
> 
> And also:
> 
> http://lists.freedesktop.org/archives/intel-gfx/2013-November/036482.html

My concerns still apply. We need to move all work-arounds to the right
places (a bunch of them also might need to get moved into the runtime pm
code ...), and then we also need some test to make sure this all works.

Since maintaining the full list of all w/a bits is currently out of the
question (our code is too unstructured for this) I think we should have a
per-platform list of w/a relevant registers + maybe bitmasks with stuff to
ignore (e.g. the ring registers where the ring base addr might differ).
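
To make the idea concrete, here is a minimal sketch of what such a
per-platform list could look like. Everything in it is an assumption for
illustration: the struct, the table name and the offsets are made up and do
not correspond to real registers or to existing driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: one workaround-relevant register. */
struct wa_reg {
	uint32_t offset;      /* MMIO offset (illustrative value only) */
	uint32_t ignore_mask; /* bits allowed to differ, e.g. a ring base addr */
};

/* Illustrative per-platform list; the offsets are invented. */
static const struct wa_reg example_platform_wa_regs[] = {
	{ 0x1000, 0x00000000 }, /* every bit must be preserved */
	{ 0x2000, 0xfffff000 }, /* upper bits hold an address: ignore them */
};

/*
 * Return nonzero if two readings of the same register agree on every
 * bit that is not covered by the entry's ignore mask.
 */
static int wa_reg_matches(const struct wa_reg *reg,
			  uint32_t before, uint32_t after)
{
	assert(reg != NULL);
	return ((before ^ after) & ~reg->ignore_mask) == 0;
}
```

With a table like this, the comparison logic stays the same across platforms
and only the list itself needs maintaining.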

Then the test would grab snapshots before and after each of the following
operations and complain loudly if anything changes:
- gpu hangs (on all rings as prep for per-engine reset)
- runtime pm (actually all the different power wells really, e.g. lpsp
  mode)
- system suspend/resume
- module reload
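
The snapshot-and-compare loop could be sketched roughly as follows. This is
only an illustration under stated assumptions: a small in-memory array stands
in for real register reads, and the two "operations" are simulated stand-ins
for things like a gpu hang or a suspend/resume cycle. None of the names come
from existing test code.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_REGS 3

/* Simulated register file standing in for real MMIO reads. */
static uint32_t fake_regs[NUM_REGS];

/* Hypothetical per-register masks of bits that are allowed to change. */
static const uint32_t ignore_mask[NUM_REGS] = { 0, 0, 0xfffff000 };

typedef void (*disruptive_op)(void);

/* Copy the current register values into out[]. */
static void snapshot(uint32_t out[NUM_REGS])
{
	int i;

	for (i = 0; i < NUM_REGS; i++)
		out[i] = fake_regs[i];
}

/*
 * Snapshot, run the operation, snapshot again, and report every register
 * whose checked bits changed. Returns the number of such registers.
 */
static int check_op(const char *name, disruptive_op op)
{
	uint32_t before[NUM_REGS], after[NUM_REGS];
	int i, changed = 0;

	snapshot(before);
	op();
	snapshot(after);

	for (i = 0; i < NUM_REGS; i++) {
		if ((before[i] ^ after[i]) & ~ignore_mask[i]) {
			fprintf(stderr,
				"%s: reg %d changed: 0x%08" PRIx32
				" -> 0x%08" PRIx32 "\n",
				name, i, before[i], after[i]);
			changed++;
		}
	}
	return changed;
}

/* Simulated operation that only touches ignored (address) bits. */
static void op_preserves_state(void)
{
	fake_regs[2] = 0xabcde000 | (fake_regs[2] & 0xfff);
}

/* Simulated operation that clobbers a workaround bit. */
static void op_loses_workaround(void)
{
	fake_regs[0] = 0;
}
```

A real test would replace the simulated operations with the four cases
listed above and fail as soon as check_op() returns nonzero for any of them.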

That should at least catch all the bugs we've seen thus far. If it later
on turns out that's not good enough we can go more fancy, but for now I
prefer something simpler ...

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch