[Intel-gfx] [PATCH] drm/i915: Skip an engine reset if it recovered before our preparations
Michel Thierry
michel.thierry at intel.com
Sat Dec 16 00:20:56 UTC 2017
On 12/15/2017 4:16 PM, Chris Wilson wrote:
> Quoting Michel Thierry (2017-12-16 00:02:47)
>> Hi,
>>
>> On 12/15/2017 3:52 PM, Chris Wilson wrote:
>>> At the beginning of a reset, we disable the submission method and find
>>> the stuck request. We expect to find a stuck request for we have
>>> declared the engine stalled. However, if we find no active request, the
>>> engine must have recovered from its stall before we could issue a reset,
>>> so let the engine continue on without a reset. If the engine is truly
>>> stuck, we will be back soon enough with the next reset attempt.
>>>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Michel Thierry <michel.thierry at intel.com>
>>> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_drv.c | 14 +++++++-------
>>> 1 file changed, 7 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>>> index ca9f4b2862eb..6f24435ddffe 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>> @@ -2011,19 +2011,19 @@ int i915_reset_engine(struct intel_engine_cs *engine, unsigned int flags)
>>>
>>>  	GEM_BUG_ON(!test_bit(I915_RESET_ENGINE + engine->id, &error->flags));
>>>
>>> -	if (!(flags & I915_RESET_QUIET)) {
>>> -		dev_notice(engine->i915->drm.dev,
>>> -			   "Resetting %s after gpu hang\n", engine->name);
>>> -	}
>>> -	error->reset_engine_count[engine->id]++;
>>> -
>>>  	active_request = i915_gem_reset_prepare_engine(engine);
>>> -	if (IS_ERR(active_request)) {
>>> +	if (IS_ERR_OR_NULL(active_request)) {
>>>  		DRM_DEBUG_DRIVER("Previous reset failed, promote to full reset\n");
>>>  		ret = PTR_ERR(active_request);
>>
>> Will a static checker complain about PTR_ERR(NULL)?
>
> It shouldn't. PTR_ERR(NULL) -> 0 is one of the valid tricks of PTR_ERR.
>
>> And the DRM_DEBUG_DRIVER message also isn't correct in that case.
>
> Bah, I was betting that those who read this would know that the full chip
> reset was pardoned. If you want, we can just remove the debug.
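Yes, the problem is that sometimes we only get the logs, without knowing the
code. With this change, a NULL active_request would still print "Previous
reset failed, promote to full reset", even though nothing failed. I would
vote to either remove the message or change it to just say 'reset skipped'.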
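
To illustrate the distinction, here is a rough userspace sketch (not driver
code) of why PTR_ERR(NULL) is 0 and how the two cases could be told apart for
logging. The macros are simplified stand-ins modelled on include/linux/err.h;
classify_prepare_result() and the enum are hypothetical names used only for
illustration and are not part of the patch.

```c
/* Userspace sketch only: simplified stand-ins for include/linux/err.h. */
#include <stdbool.h>
#include <stddef.h>
#include <errno.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
static inline void *ERR_PTR(long error) { return (void *)error; }

/* Decode it again; a NULL pointer simply decodes to 0. */
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }

/* Only the top MAX_ERRNO addresses count as errors, so NULL is not one. */
static inline bool IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* Hypothetical outcome names, invented for this example. */
enum prepare_outcome {
	RESET_PROCEED,	/* a stuck request was found */
	RESET_SKIPPED,	/* NULL: the engine recovered on its own */
	RESET_PROMOTE,	/* error pointer: fall back to a full reset */
};

/* Hypothetical helper: separate the NULL case from the error case. */
static enum prepare_outcome
classify_prepare_result(const void *active_request, int *ret)
{
	if (IS_ERR(active_request)) {
		*ret = (int)PTR_ERR(active_request);
		return RESET_PROMOTE;
	}

	*ret = 0;
	return active_request ? RESET_PROCEED : RESET_SKIPPED;
}
```

With a split along these lines, each outcome could carry its own debug
message, so a log reader would not mistake a skipped reset for a failed one.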
-Michel