[Intel-gfx] [PATCH 2/2] drm/i915: Reorder hw init to avoid executing with invalid context/mm state

Fri Dec 5 06:38:46 PST 2014

On Fri, Dec 05, 2014 at 04:31:35PM +0200, Ville Syrjälä wrote:
> On Fri, Dec 05, 2014 at 02:15:22PM +0000, Chris Wilson wrote:
> > Currently we initialise the rings, add the first context switch to the
> > ring and execute our golden state then enable (aliasing or full) ppgtt.
> > However, as we enable ppgtt using direct MMIO but load the PD using
> > MI_LRI, we end up executing the context switch and golden render state
> > with an invalid PD generating page faults. To solve this issue, first do
> > the ppgtt PD setup, then set the default context and write the commands
> > to run the render state into the ring, before we activate the ring. This
> > allows us to be sure that the register state is valid before we begin
> > execution.
> > 
> > This was spotted when writing the seqno/request conversion, but only with
> > the ERROR capture did I realise that it was a necessity now.
> > 
> > RFC: cleanup the error handling in i915_gem_init_hw.
> > 
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c         | 20 ++++++++++----------
> >  drivers/gpu/drm/i915/intel_ringbuffer.c |  9 ++++++---
> >  2 files changed, 16 insertions(+), 13 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index c1c11418231b..c13842d3cbc9 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4796,15 +4796,15 @@ i915_gem_init_hw(struct drm_device *dev)
> >  	 */
> >  	init_unused_rings(dev);
> >  
> > -	for_each_ring(ring, dev_priv, i) {
> > -		ret = ring->init_hw(ring);
> > -		if (ret)
> > -			return ret;
> > -	}
> > -
> >  	for (i = 0; i < NUM_L3_SLICES(dev); i++)
> >  		i915_gem_l3_remap(&dev_priv->ring[RCS], i);
> 
> This is going to assume ring->head/tail are already valid?

We write into the ring obj, not the ring itself, which should be setup
during the various intel_init_engine, i.e. the backing storage is
independent of the actual registers.

If you take it to the logical conclusion, each operation on the ring is
just a transaction that is buffered until it is commited to the
hardware, or rolled back in its entirety if aborted. We can then queue
up multiple transactions before first enabling the hardware, or equally
preserve such transactions across a GPU reset.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre