[Intel-gfx] [PATCH] drm/i915: Mark pending batches correctly on reset

Chris Wilson chris at chris-wilson.co.uk
Wed Oct 26 12:57:41 UTC 2016


On Wed, Oct 26, 2016 at 03:32:20PM +0300, Mika Kuoppala wrote:
> Chris Wilson <chris at chris-wilson.co.uk> writes:
> 
> > On Wed, Oct 26, 2016 at 03:07:59PM +0300, Mika Kuoppala wrote:
> >> For contexts that get their requests NOPed after a reset,
> >> correctly count them as pending.
> >> 
> >> Testcase: igt/tests/gem_reset_stats
> >> Cc: Chris Wilson <chris at chris-wilson.co.uk>
> >> Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>
> >
> > We agreed that this was an incorrect interpretation of the robustness
> > API, one that neither handles TDR nor scales to multiple timelines.
> >
> 
> I remember agreeing with the active one at least. Perhaps being
> ignorant of the multiple timelines case.
> 
> Is the reasoning here that there is no actual benefit to marking
> batches pending, as it is superfluous in the replay case? In other
> words, the distinction between a batch being queued before
> submission and after is moot from the userspace point of view?

Yes. And it gives them information that they are not otherwise privy to.
 
> > Currently, we only mark as innocent the contexts/batch executing on the
> > hw on the good rings at the time of the reset.
> >
> 
> I am ok with this. The interpretation of 'pending' changes, but it
> is more meaningful if one thinks of it as pending on hardware.

That's my understanding as well. We only mark the affected batches:
each is either guilty and scrapped, or innocent and rerun. But we
may still see corruption in the innocent batch (as it may change state
internally, but the initial state is not restored upon reset). Everyone
else should not be affected (there are always some dependencies, as
corruption may propagate, but we do identify the root).
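
As a rough illustration of that policy (a minimal sketch only, not the
i915 code: struct request, enum reset_mark and mark_requests_on_reset()
are all hypothetical names made up for this example), only requests that
were executing on the hardware at reset time get marked; anything merely
queued is left alone:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; these are not the i915 structures. */
enum reset_mark { MARK_NONE, MARK_GUILTY, MARK_INNOCENT };

struct request {
	int ctx_id;           /* owning context (illustrative only) */
	bool executing;       /* was on the hardware when the reset hit */
	bool hung;            /* was the batch that caused the hang */
	enum reset_mark mark; /* what gets reported for this request */
	bool rerun;           /* innocent batches are run again */
};

/*
 * Mark only the requests that were executing at reset time:
 * the hung one is guilty and scrapped, the others are innocent
 * and rerun. Queued requests are deliberately not touched.
 */
static void mark_requests_on_reset(struct request *rq, size_t count)
{
	assert(rq != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (!rq[i].executing)
			continue; /* merely queued: left unmarked */

		if (rq[i].hung) {
			rq[i].mark = MARK_GUILTY;
			rq[i].rerun = false;
		} else {
			rq[i].mark = MARK_INNOCENT;
			rq[i].rerun = true;
		}
	}
}
```

What happens to the unmarked, queued requests afterwards is outside the
scope of this sketch.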
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

