[Intel-gfx] [PATCH 04/43] drm/i915: Do a synchronous switch-to-kernel-context on idling
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Mar 7 17:06:58 UTC 2019
On 07/03/2019 13:29, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-03-07 13:07:18)
>>
>> On 06/03/2019 14:24, Chris Wilson wrote:
>>> +static bool switch_to_kernel_context_sync(struct drm_i915_private *i915)
>>> +{
>>> + if (i915_gem_switch_to_kernel_context(i915))
>>> + return false;
>>
>> Is it worth still trying to idle if this fails? Since the timeout is
>> short, maybe a reset in the idle state brings less havoc than not. It can
>> only fail on memory allocations, I think, and on being terminally wedged,
>> in which case it is still okay.
>
> Terminally wedged is hard wired to return 0 (this, the next patch?) so
> that we don't bail during i915_gem_suspend() for this reason.
>
> We do still idle if this fails, as we mark the driver/GPU as wedged.
> Perform a GPU reset so that it hopefully isn't shooting kittens any
> more, and pull a pillow over our heads.
I didn't find a path which idles before wedging if
switch_to_kernel_context_sync fails due to a failing
i915_gem_switch_to_kernel_context. Where is it?
It is a minor concern, don't get me wrong; it is unlikely to fail like
this. I was simply thinking: why not try to wait for the current work to
finish before suspending in this case? It might make for a better
experience after resume.
> That's pretty much our answer in every case, wedge and hope the users go
> away.
>
>>> +
>>> + if (i915_gem_wait_for_idle(i915,
>>> + I915_WAIT_LOCKED |
>>> + I915_WAIT_FOR_IDLE_BOOST,
>>> + HZ / 10))
>>
>> You'll presumably change HZ / 10 to that new macro.
>
> Yup.
>
>>> static void
>>> i915_gem_idle_work_handler(struct work_struct *work)
>>> {
>>> - struct drm_i915_private *dev_priv =
>>> - container_of(work, typeof(*dev_priv), gt.idle_work.work);
>>> + struct drm_i915_private *i915 =
>>> + container_of(work, typeof(*i915), gt.idle_work.work);
>>> + typeof(i915->gt) *gt = &i915->gt;
>>
>> I am really not sure about the typeof idiom in normal C code. :( It
>> saves a little bit of typing, and a little bit of churn if type name
>> changes, but just feels weird to use it somewhere and somewhere not.
>
> But then we have to name things! We're not sold on gt; it means quite a
> few different things around the bspec. This bit is actually part that
> I'm earmarking for i915_gem itself (high level idle/power/user management)
> and I'm contemplating i915_gpu for the bits that are beneath us but still a
> management layer over hardware (and intel_foo for the bits that talk to
> hardware. Maybe that too will change if we completely split out into
> different modules.)
So you could have left it as is for now and had a smaller diff. But
okay, have it if you insist.
>
>>> /*
>>> - * New request retired after this work handler started, extend active
>>> - * period until next instance of the work.
>>> + * Flush out the last user context, leaving only the pinned
>>> + * kernel context resident. Should anything unfortunate happen
>>> + * while we are idle (such as the GPU being power cycled), no users
>>> + * will be harmed.
>>> */
>>> - if (new_requests_since_last_retire(dev_priv))
>>> - goto out_unlock;
>>> -
>>> - __i915_gem_park(dev_priv);
>>> + if (!gt->active_requests && !work_pending(&gt->idle_work.work)) {
>>
>> Could have left the new_requests_since_last_retire for readability.
>
> As always, it changed several times. I actually think, considering that
> we play games with active_requests, keeping that in the open is
> important. ~o~
Ok.
>>
>>> + ++gt->active_requests; /* don't requeue idle */
>>> +
>>> + if (!switch_to_kernel_context_sync(i915)) {
>>> + dev_err(i915->drm.dev,
>>> + "Failed to idle engines, declaring wedged!\n");
>>> + GEM_TRACE_DUMP();
>>> + i915_gem_set_wedged(i915);
>>> + }
>>> + i915_retire_requests(i915);
>
>>> @@ -3128,7 +3120,6 @@ int i915_gem_wait_for_idle(struct drm_i915_private *i915,
>>> return err;
>>>
>>> i915_retire_requests(i915);
>>> - GEM_BUG_ON(i915->gt.active_requests);
>>
>> Belongs to this patch?
>
> Yes. See the above chunk.
The idle work handler? More hand-holding required, I'm afraid.
Regards,
Tvrtko