[Intel-gfx] [PATCH] drm/i915: set ctx->initialized only after RCS

Ben Widawsky ben at bwidawsk.net
Wed Jan 1 19:10:07 CET 2014


On Tue, Dec 31, 2013 at 11:26:17AM +0000, Chris Wilson wrote:
> On Mon, Dec 30, 2013 at 01:34:46PM -0800, Ben Widawsky wrote:
> > On Sun, Dec 29, 2013 at 09:59:26AM +0000, Chris Wilson wrote:
> > > On Sat, Dec 28, 2013 at 01:31:49PM -0800, Ben Widawsky wrote:
> > > > The initialized flag is used to specify a context has been initialized
> > > > and its context is safe to load, i.e. the 3D state is set up properly.
> > > > With full PPGTT, we emit the address space loads during context switch
> > > > and this currently marks a context as initialized. With full PPGTT
> > > > patches, if a client first emits a batch to !RCS, then later, RCS, the
> > > > code will mistake the context as initialized and try to reload an
> > > > uninitialized context.
> > > > 
> > > > 1. context 1 blit // context initialized
> > > +context marked as initialised
> > > 
> > > > 2. context 2 <X operation> // saves context 1 random state
> > > > 3. context 1 render // loads random state from step 2
> > > 
> > > Note that step 2 is not required since the tracking is per-ring.
> > >  
> > 
> > This is missing an extremely important caveat which I was
> > incorrectly correlating earlier in our discussion. Yes, step 2 is not
> > required.  What is required is that the page be non-zero when
> > allocated, and that its contents be capable of hanging the GPU when
> > used as a context object.
> 
> Really? I'm pretty sure the last error state we looked at, the context
> was all zeroes.
>  

Well... the first page of the context is all 0.

> > Otherwise, the uninitialized context would always be all 0, which, if I
> > understand the HW correctly, is "safe".
> > 
> > As long as you don't disagree, I'll fix the commit message with that
> > info.
> 
> Whatever you feel matches our best understanding of the problem. Just
> double check that last error state first ;-)
> -Chris
> 

The real problem is that I've gone and convinced myself that no other
means exists to actually make the GPU hang. This incidentally also gives
some merit to my assumption that IPEHR is accurate through context loads;
in other words, we try to do an MI_SET_CONTEXT during an MI_SET_CONTEXT.
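
For reference, the sequence discussed above (blit first, then render on the
same context) can be modelled with a toy sketch. This is not the driver
code: toy_ctx, toy_switch, the ring enum and the rcs_only_fix switch are all
hypothetical names made up for illustration, and the model only tracks the
flag, not the contents of the context page.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the ring identifiers (not the driver's). */
enum ring_id { RING_RCS, RING_BCS, RING_VCS };

/* Toy context: only the flag under discussion is modelled. */
struct toy_ctx {
    bool initialized; /* meant to say "the saved render image is valid" */
};

/*
 * Model one context switch on the given ring.
 * Returns true if the switch would ask the hardware to restore the saved
 * context image, which is only safe if that image was really written.
 * rcs_only_fix selects between the old and the proposed behaviour.
 */
static bool toy_switch(struct toy_ctx *to, enum ring_id ring, bool rcs_only_fix)
{
    bool restore = false;

    /* Assumption: only the render ring loads a hardware context image. */
    if (ring == RING_RCS)
        restore = to->initialized;

    /* Old behaviour: any ring marks the context as initialized.
     * Proposed behaviour: only a switch on RCS does. */
    if (!rcs_only_fix || ring == RING_RCS)
        to->initialized = true;

    return restore;
}
```

With rcs_only_fix false, a blit followed by a render on a fresh context
returns true on the render, i.e. it would restore a page that was never
saved. With rcs_only_fix true, the first render returns false and only the
second render restores.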

-- 
Ben Widawsky, Intel Open Source Technology Center


