[Intel-gfx] [PATCH 04/43] drm/i915: Do a synchronous switch-to-kernel-context on idling
Chris Wilson
chris at chris-wilson.co.uk
Thu Mar 7 13:29:38 UTC 2019
Quoting Tvrtko Ursulin (2019-03-07 13:07:18)
>
> On 06/03/2019 14:24, Chris Wilson wrote:
> > +static bool switch_to_kernel_context_sync(struct drm_i915_private *i915)
> > +{
> > +	if (i915_gem_switch_to_kernel_context(i915))
> > +		return false;
>
> Is it worth still trying to idle if this fails? Since the timeout is
> short, maybe a reset in the idle state brings less havoc than not
> idling at all. It can only fail on memory allocations, I think, and
> when terminally wedged, in which case it is still okay.
Terminally wedged is hard-wired to return 0 (in this patch, or the next?) so
that we don't bail out of i915_gem_suspend() for that reason.
We do still idle if this fails: we mark the driver/GPU as wedged, perform a
GPU reset so that it hopefully isn't shooting kittens any more, and pull a
pillow over our heads.
That's pretty much our answer in every case: wedge and hope the users go
away.
> > +
> > +	if (i915_gem_wait_for_idle(i915,
> > +				   I915_WAIT_LOCKED |
> > +				   I915_WAIT_FOR_IDLE_BOOST,
> > +				   HZ / 10))
>
> You'll presumably change HZ / 10 to that new macro.
Yup.
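For reference, stitching the hunks above together, the helper as a whole
reads roughly like this (the trailing return true and the eventual name of
the timeout macro are not quoted above, so treat this as a sketch):

static bool switch_to_kernel_context_sync(struct drm_i915_private *i915)
{
	/* Queue a request on each busy engine to switch to the kernel context. */
	if (i915_gem_switch_to_kernel_context(i915))
		return false;

	/* Then wait, briefly, for those context switches to complete. */
	if (i915_gem_wait_for_idle(i915,
				   I915_WAIT_LOCKED |
				   I915_WAIT_FOR_IDLE_BOOST,
				   HZ / 10)) /* to become a named timeout macro */
		return false;

	return true;
}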
> >  static void
> >  i915_gem_idle_work_handler(struct work_struct *work)
> >  {
> > -	struct drm_i915_private *dev_priv =
> > -		container_of(work, typeof(*dev_priv), gt.idle_work.work);
> > +	struct drm_i915_private *i915 =
> > +		container_of(work, typeof(*i915), gt.idle_work.work);
> > +	typeof(i915->gt) *gt = &i915->gt;
>
> I am really not sure about the typeof idiom in normal C code. :( It
> saves a little bit of typing, and a little bit of churn if the type
> name changes, but it just feels weird to use it in some places and
> not others.
But then we have to name things! We're not sold on gt; it means quite a
few different things around the bspec. This bit is actually a part that
I'm earmarking for i915_gem itself (high-level idle/power/user management),
and I'm contemplating i915_gpu for the bits that sit beneath us but are
still a management layer over hardware (and intel_foo for the bits that
talk to hardware). Maybe that too will change if we completely split out
into different modules.
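If it helps, the idiom is nothing more than the GNU C typeof alias. A
stand-alone illustration (all the names here are made up, not from the
patch):

struct gt_state { int active_requests; };	/* hypothetical */
struct device_priv { struct gt_state gt; };	/* hypothetical */

static int count_active(struct device_priv *priv)
{
	/* Alias priv->gt without having to spell out its type name. */
	typeof(priv->gt) *gt = &priv->gt;

	return gt->active_requests;
}

If the type of the gt member is ever renamed, only the member itself has
to change, not every local declaration.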
> >  	/*
> > -	 * New request retired after this work handler started, extend active
> > -	 * period until next instance of the work.
> > +	 * Flush out the last user context, leaving only the pinned
> > +	 * kernel context resident. Should anything unfortunate happen
> > +	 * while we are idle (such as the GPU being power cycled), no users
> > +	 * will be harmed.
> >  	 */
> > -	if (new_requests_since_last_retire(dev_priv))
> > -		goto out_unlock;
> > -
> > -	__i915_gem_park(dev_priv);
> > +	if (!gt->active_requests && !work_pending(&gt->idle_work.work)) {
>
> Could have left the new_requests_since_last_retire for readability.
As always, it changed several times. I actually think that, considering we
play games with active_requests, keeping the check in the open is
important. ~o~
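For comparison, the helper the open-coded check replaces is, from memory,
an approximation of:

static bool new_requests_since_last_retire(const struct drm_i915_private *i915)
{
	return (READ_ONCE(i915->gt.active_requests) ||
		work_pending(&i915->gt.idle_work.work));
}

so the new guard is just its negation, written against the local gt alias.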
>
> > +		++gt->active_requests; /* don't requeue idle */
> > +
> > +		if (!switch_to_kernel_context_sync(i915)) {
> > +			dev_err(i915->drm.dev,
> > +				"Failed to idle engines, declaring wedged!\n");
> > +			GEM_TRACE_DUMP();
> > +			i915_gem_set_wedged(i915);
> > +		}
> > +		i915_retire_requests(i915);
> > @@ -3128,7 +3120,6 @@ int i915_gem_wait_for_idle(struct drm_i915_private *i915,
> >  		return err;
> > 
> >  	i915_retire_requests(i915);
> > -	GEM_BUG_ON(i915->gt.active_requests);
>
> Belongs to this patch?
Yes. See the above chunk.
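To spell it out, the worker now holds its own active_requests reference
across the wait, so the count is no longer guaranteed to be zero inside
i915_gem_wait_for_idle(). Condensed from the hunks above (the error
message, trace dump, balancing decrement and final parking are in the
full patch, not quoted here):

	if (!gt->active_requests && !work_pending(&gt->idle_work.work)) {
		++gt->active_requests; /* don't requeue idle; count stays nonzero */

		/* Calls i915_gem_wait_for_idle(), hence dropping the BUG_ON. */
		if (!switch_to_kernel_context_sync(i915))
			i915_gem_set_wedged(i915);

		i915_retire_requests(i915);
	}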
-Chris