[Intel-gfx] [PATCH 17/17] drm/i915: Use rt priority kthread to do GuC log buffer sampling

Chris Wilson chris at chris-wilson.co.uk
Thu Jul 21 05:43:43 UTC 2016


On Thu, Jul 21, 2016 at 09:11:42AM +0530, Goel, Akash wrote:
> 
> 
> On 7/21/2016 1:04 AM, Chris Wilson wrote:
> >In the end, just the silly locking and placement of complete_all() is
> >dangerous. reinit_completion() lacks the barrier to be used like this
> >really, at any rate, racy with the irq handler, so use sparingly or when
> >you control the irq handler.
> Sorry I forgot to add a comment that
> guc_cancel_log_flush_work_sync() should be invoked only after
> ensuring that there will be no more flush interrupts, which will
> happen either by explicitly disabling the interrupt or disabling the
> logging and that's what is done at the 2 call sites.
> 
> Since had covered reinit_completion() under the irq_lock, thought an
> explicit barrier is not needed.

You hadn't controlled everything via the irq_lock, and nor should you.

> 
> 	spin_lock_irq(&dev_priv->irq_lock);
> 	if (guc->log.flush_signal) {
> 		guc->log.flush_signal = false;
> 		reinit_completion(&guc->log.flush_completion);
> 		spin_unlock_irq(&dev_priv->irq_lock);
> 		i915_guc_capture_logs(&dev_priv->drm);
> 	 	complete_all(&guc->log.flush_completion);
> 
> The placement of complete_all isn't right for the case, where
> a guc_cancel_log_flush_work_sync() is called but there was no prior
> flush interrupt received.

Exactly.
 
> >(Also not sure if log.signal = 0 is sane,
> Did log.signal = 0 for fast cancellation. Will remove that.
> 
> A smp_wmb() after reinit_completion(&flush_completion) would be fine ?

Don't worry, the race can only be controlled by controlling the irq. In
the end, I think something more like

	while (signal) ...

	complete_all();
	schedule();
	reinit_completion();

is the simplest.

> >or the current callsites really require the flush.)
> 
> Sync against a ongoing/pending flush is being done for the 2
> forceful flush cases, which will be effective only if the pending
> flush is completed, so forceful flush should be serialized with a
> pending flush.

Or you just signal=true, wakeup task, wait_timeout. Otherwise you
haven't really serialized anything without disabling the interrupt.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


More information about the Intel-gfx mailing list