[Intel-gfx] [PATCH 24/51] drm/i915: Update deferred context creation to do explicit request management

Daniel Vetter daniel at ffwll.ch
Wed Feb 25 13:15:57 PST 2015


On Wed, Feb 18, 2015 at 03:27:38PM +0000, John Harrison wrote:
> On 13/02/2015 12:15, Chris Wilson wrote:
> >On Fri, Feb 13, 2015 at 11:48:33AM +0000, John.C.Harrison at Intel.com wrote:
> >>From: John Harrison <John.C.Harrison at Intel.com>
> >>
> >>In execlist mode, context initialisation is deferred until first use of the
> >>given context. This is because execlist mode has many more contexts than legacy
> >>mode and many are never actually used.
> >That's not correct. There are no more contexts in execlists than legacy.
> >There are more ringbuffers, or rather the contexts have an extra state
> >object associated with them.
> Okay, I should have said sub-contexts. Or context state objects. Or
> something.

per-engine ctx state? Naming stuff is hard ;-)

>> >>Previously, the initialisation commands
> >>were written to the ring and tagged with some random request structure via the
> >>OLR. This seemed to be causing a null pointer deference bug under certain
> >>circumstances (BZ:40112).
> >>
> >>This patch adds explicit request creation and submission to the deferred
> >>initialisation code path. Thus removing any reliance on or randomness caused by
> >>the OLR.
> >This is upside down though. The request should be referencing the
> >context (thus instantiating it on demand) and nothing in the context
> >allocation requires the request. The initialisation here should be during
> >i915_request_switch_context(), since it can be entirely shared with
> >legacy.
> >-Chris
> 
> The request does reference the context - the alloc_reques() function takes a
> context object as a parameter. Thus it is impossible for the request to be
> used/supplied/required during context creation. The issue here is the lazy
> initialisation of the per ring context state which requires sending commands
> to the ring on first usage of the given context object on the given ring.
> 
> One problem is that the initialisation request and the batch buffer request
> cannot be merged at the moment. They both use request->batch_obj for
> tracking the command object. Thus this patch only works due to the deferred
> intialisation occurring during the i915_gem_validate_context() call very
> early on in execbuffer() rather than as part of the context switch within
> the batch buffer execution which is much later.

My request struct doesn't have a batch_obj pointer. Where is that from and
why do we need it? Atm just chasing Chris' comments, haven't read the full
series yet.

> I'm not sure what you mean by i915_request_switch_context(). The existing
> i915_switch_context() does now take just a request structure rather than a
> ring/ringbuf/context mixture. However, it is not really a good idea to do
> the context switch automatically as part of creating the request. The
> request creation and request execution could be quite separated in time,
> especially with a scheduler.
> 
> It should be possible to move the deferred initialisation within the context
> switch if the object tracking can be resolved. Thus they could share the
> same request and there would not be effectively two separate execution calls
> at the hardware level. Again, that's potentially work that could be done as
> a follow up task of improving the context management independent of the
> current task of removing the OLR.

I think the biggest risk with adding a separate request for the lrc
deferred init is in accidentally nesting request when someone moves around
the lrc validation. Atm it's at the top of execbuf but we tend to shuffle
things around a lot.

Is there some simple WARN_ON we could smash into the alloc function to
make sure this never happens? ring->olr would be it, but since we want to
kill that that's not great. Or do I see risks which aren't really there?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


More information about the Intel-gfx mailing list