[Intel-gfx] [PATCH 24/46] drm/i915: Do a synchronous switch-to-kernel-context on idling
Chris Wilson
chris at chris-wilson.co.uk
Thu Feb 21 21:17:09 UTC 2019
Quoting Daniele Ceraolo Spurio (2019-02-21 19:48:01)
>
> <snip>
>
> > @@ -4481,19 +4471,7 @@ int i915_gem_suspend(struct drm_i915_private *i915)
> > * state. Fortunately, the kernel_context is disposable and we do
> > * not rely on its state.
> > */
> > - if (!i915_terminally_wedged(&i915->gpu_error)) {
> > - ret = i915_gem_switch_to_kernel_context(i915);
> > - if (ret)
> > - goto err_unlock;
> > -
> > - ret = i915_gem_wait_for_idle(i915,
> > - I915_WAIT_INTERRUPTIBLE |
> > - I915_WAIT_LOCKED |
> > - I915_WAIT_FOR_IDLE_BOOST,
> > - HZ / 5);
> > - if (ret == -EINTR)
> > - goto err_unlock;
> > -
> > + if (!switch_to_kernel_context_sync(i915)) {
> > /* Forcibly cancel outstanding work and leave the gpu quiet. */
> > i915_gem_set_wedged(i915);
> > }
>
> GuC-related question: what's your expectation here in regards to GuC
> status? The current i915 flow expects either uc_reset_prepare() or
> uc_suspend() to be called to clean up the guc status, but we're calling
> neither of them here if the switch is successful. Do you expect the
> resume code to always blank out the GuC status before a reload?

(A few patches later on I propose that we always just do a reset+wedge
on suspend in lieu of hangcheck.)

On resume, we have to bring the HW up from scratch and do another reset
in the process. Some platforms have been known to survive the trips to
PCI_D3 (someone is lying!) and so we _have_ to do a reset to be sure we
clear the HW state. I expect we would need to force a reset on resume
even for the guc, to be sure we cover all cases such as kexec.
-Chris
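
The ordering described above can be sketched as a toy model. This is only
an illustration under stated assumptions: struct fake_gpu and the fake_*()
helpers are hypothetical stand-ins, not i915 functions, and the flags
merely model "state that might have survived" a suspend or a kexec.

```c
#include <stdbool.h>

/* Hypothetical device model; none of these names exist in i915. */
struct fake_gpu {
	bool hw_state_stale;	/* engine/GuC state left over from earlier */
	bool guc_loaded;	/* firmware believed to be running */
	bool wedged;		/* outstanding work was forcibly cancelled */
	int resets;		/* number of resets performed so far */
};

/* Stand-in for a full reset: discards whatever state survived. */
static void fake_reset(struct fake_gpu *gpu)
{
	gpu->hw_state_stale = false;
	gpu->guc_loaded = false;
	gpu->resets++;
}

/*
 * Suspend: wedge only if the synchronous switch failed. Either way,
 * assume that nothing cleaned up the firmware state on our behalf.
 */
static void fake_suspend(struct fake_gpu *gpu, bool switch_ok)
{
	if (!switch_ok)
		gpu->wedged = true;
	gpu->hw_state_stale = true;
}

/*
 * Resume: reset unconditionally before reloading, so that it does not
 * matter whether we got here via a clean suspend, a failed one, or a
 * kexec that never ran the suspend path at all.
 */
static void fake_resume(struct fake_gpu *gpu)
{
	fake_reset(gpu);
	gpu->guc_loaded = true;	/* reload from scratch after the reset */
}
```

The point of the unconditional fake_reset() is that the resume path never
has to trust what the previous owner of the hardware left behind.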