[Intel-gfx] [RFC 5/7] drm/i915: Wait for GPU to finish before event stop in Gen Perf PMU

Daniel Vetter daniel at ffwll.ch
Thu Jun 25 00:42:01 PDT 2015


On Thu, Jun 25, 2015 at 06:02:35AM +0000, Gupta, Sourab wrote:
> On Mon, 2015-06-22 at 16:09 +0000, Daniel Vetter wrote:
> > On Mon, Jun 22, 2015 at 02:22:54PM +0100, Chris Wilson wrote:
> > > On Mon, Jun 22, 2015 at 03:25:07PM +0530, sourab.gupta at intel.com wrote:
> > > > From: Sourab Gupta <sourab.gupta at intel.com>
> > > > 
> > > > To collect timestamps around any GPU workload, we need to insert
> > > > commands to capture them into the ringbuffer. Therefore, during the
> > > > stop event call, we need to wait for the GPU to finish processing the
> > > > last request for which these commands were inserted.
> > > > We need to ensure this processing is done before the event_destroy
> > > > callback, which deallocates the buffer holding the data.
> > > 
> > > There's no reason for this to be synchronous. Just that you need an
> > > active reference on the output buffer to be only released after the
> > > final request.
> > 
> > Yeah I think the interaction between OA sampling and GEM is the critical
> > piece here for both patch series. Step one is to have a per-pmu lock to
> > keep track of data private to OA and mmio based sampling as a starting
> > point. Then we need to figure out how to structure the connection without
> > OA/PMU and GEM trampling over each other's feet.
> > 
> > Wrt requests those are refcounted already and just waiting on them doesn't
> > need dev->struct_mutex. That should be all you need, assuming you do
> > correctly refcount them ...
> > -Daniel
> 
> Hi Daniel/Chris,
> 
> Thanks for your feedback. I acknowledge the issue with increasing the
> coverage of the i915 dev mutex. I'll have a per-pmu lock to protect the
> data private to OA in my next version of the patches.

If you create new locking schemes, please coordinate with Rob about them.
Better if we come up with a consistent locking scheme for all of OA. That
of course doesn't exclude per-pmu locking, just that the locking designs
should match where it makes sense. The example I'm thinking of is DRRS &
PSR: both use the frontbuffer tracking code and each has its own private
mutex, but the overall approach to locking and cleaning up the async work
is the same.
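
As a strawman, the per-pmu piece could look roughly like this (all the
struct/field names below are made up for illustration, not the actual
i915 code); the point is just that the OA-private state gets its own
lock instead of leaning on dev->struct_mutex:

#include <linux/mutex.h>
#include <linux/perf_event.h>

struct i915_oa_pmu {
	struct pmu pmu;
	struct mutex lock;	/* protects the OA-private state below */
	void *oa_buffer;
	bool oa_enabled;
};

static void i915_oa_set_enabled(struct i915_oa_pmu *oa_pmu, bool enable)
{
	mutex_lock(&oa_pmu->lock);
	oa_pmu->oa_enabled = enable;
	/* program or disable the OA unit under the pmu-local lock only */
	mutex_unlock(&oa_pmu->lock);
}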

> W.r.t. having the synchronous wait during event stop, I thought through
> the approach of holding an active reference on the output buffer. This
> would let us have a delayed buffer destroy (driven by the output buffer
> refcount dropping as GPU requests complete), decoupled from the event
> stop. But that is not the only thing at play here.
> 
> The event stop is expected to disable the OA unit (which it is doing
> right now). In my use case, I can't disable the OA unit until the GPU
> has processed the MI_RPC requests already scheduled; disabling it
> earlier may lead to hangs.
> So I'm waiting synchronously for those requests to complete before
> disabling the OA unit.
> 
> Also, the event_destroy callback should be destroying the buffers
> associated with the event. (Right now, in the current design, the
> event_destroy callback waits for the above operations to complete
> before proceeding to destroy the buffer.)
> 
> If I instead create a function which destroys the output buffer once
> all references have been released, all these operations would have to
> be performed together in that function, in this serial order (so that
> when the refcount drops to 0, i.e. the requests have completed, the OA
> unit is disabled and then the buffer destroyed). That would decouple
> these operations from the event_stop and event_destroy callbacks, which
> may be unintended behaviour w.r.t. these callbacks from the userspace
> perspective.
> 
> Robert, any thoughts?
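
Just to make Chris' suggestion a bit more concrete, a rough sketch of the
refcount-driven teardown could look like the following (every name here is
hypothetical, not actual i915 code): each MI_RPC request takes a reference
on the output buffer and drops it when it retires, so the final put does
the OA disable and the free in exactly the order you describe.

#include <linux/kernel.h>
#include <linux/kref.h>
#include <linux/slab.h>

struct i915_oa_rpc_buffer {
	struct kref ref;
	/* backing storage, write offset, ... */
};

/* Hypothetical helper, assumed to be provided elsewhere. */
void i915_oa_disable_hw(void);

static void i915_oa_rpc_buffer_release(struct kref *ref)
{
	struct i915_oa_rpc_buffer *buf =
		container_of(ref, struct i915_oa_rpc_buffer, ref);

	/* All requests writing into the buffer have retired by now. */
	i915_oa_disable_hw();
	kfree(buf);
}

/* Taken when an MI_RPC command targeting the buffer is emitted ... */
static void i915_oa_rpc_buffer_get(struct i915_oa_rpc_buffer *buf)
{
	kref_get(&buf->ref);
}

/* ... and dropped from the request's retire path. */
static void i915_oa_rpc_buffer_put(struct i915_oa_rpc_buffer *buf)
{
	kref_put(&buf->ref, i915_oa_rpc_buffer_release);
}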

Yeah, no idea whether those perf callbacks are allowed to stall for a
longer time. If not, we could simply push the cleanup/OA disabling into
async work, e.g. something like the sketch below.
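
Roughly (again with made-up names; stop_work would be INIT_WORK()ed once
at pmu init time):

#include <linux/perf_event.h>
#include <linux/workqueue.h>

/* Same hypothetical i915_oa_pmu as sketched above, plus a worker. */
struct i915_oa_pmu {
	struct pmu pmu;
	struct work_struct stop_work;
	/* ... OA-private state ... */
};

/* Hypothetical helpers, assumed to be provided elsewhere. */
void i915_oa_wait_for_rpc_requests(struct i915_oa_pmu *oa_pmu);
void i915_oa_disable_unit(struct i915_oa_pmu *oa_pmu);
void i915_oa_destroy_buffer(struct i915_oa_pmu *oa_pmu);

static void i915_oa_stop_work_fn(struct work_struct *work)
{
	struct i915_oa_pmu *oa_pmu =
		container_of(work, struct i915_oa_pmu, stop_work);

	/* Safe to block here: we're no longer in the perf callback. */
	i915_oa_wait_for_rpc_requests(oa_pmu);
	i915_oa_disable_unit(oa_pmu);
	i915_oa_destroy_buffer(oa_pmu);
}

static void i915_oa_event_stop(struct perf_event *event, int flags)
{
	struct i915_oa_pmu *oa_pmu =
		container_of(event->pmu, struct i915_oa_pmu, pmu);

	/* Kick the blocking teardown out of the callback itself. */
	schedule_work(&oa_pmu->stop_work);
}
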
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

