[Intel-gfx] [PATCH] drm/i915: Allow unready gpu to be reset on gen8
Mika Kuoppala
mika.kuoppala at linux.intel.com
Fri Oct 30 08:18:18 PDT 2015
Chris Wilson <chris at chris-wilson.co.uk> writes:
> On Fri, Oct 30, 2015 at 04:43:49PM +0200, Mika Kuoppala wrote:
>> Gen9 has had demonstrated cases where forcing a not ready gpu
>> into reset has caused system hang [1].
>>
>> Gen8 has never to this date demonstrated such behaviour.
>>
>> In our CI tests bsw sometimes ends up in a state where it claims it
>> is not ready for reset, based on reset request, after gpu hang.
>>
>> Allow gen8 to reset even after claims of nonreadiness in order
>> to keep the gpu accessible. Enhance logging so that it will be
>> clear what conditions led to decision of proceeding or bailing out,
>> so that we will spot if this way of forcing our will against gpu turns
>> out to be foolhardy.
>>
>> References [1]: https://bugs.freedesktop.org/show_bug.cgi?id=89959
>> Cc: Daniel Vetter <daniel.vetter at ffwll.ch>
>> Cc: Tomi Sarvela <tomix.p.sarvela at intel.com>
>> Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>
>> ---
>> drivers/gpu/drm/i915/intel_uncore.c | 9 ++++++++-
>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
>> index f0f97b2..47c17f2 100644
>> --- a/drivers/gpu/drm/i915/intel_uncore.c
>> +++ b/drivers/gpu/drm/i915/intel_uncore.c
>> @@ -1504,7 +1504,14 @@ not_ready:
>> I915_WRITE(RING_RESET_CTL(engine->mmio_base),
>> _MASKED_BIT_DISABLE(RESET_CTL_REQUEST_RESET));
>>
>> - return -EIO;
>
> Where's the reference for where we hit this EIO on gen8?
>
Internal CI logs; the relevant part is pasted below. If you want the
full log, ping me on IRC.

[ 119.147727] kms_pipe_crc_basic: starting subtest hang-read-crc-pipe-A
[ 124.785063] [drm] stuck on render ring
[ 124.800850] [drm] GPU HANG: ecode 8:0:0xfffffffe, in kms_pipe_crc_ba [5590], reason: Ring hung, action: reset
[ 124.801154] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 124.801161] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 124.801167] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 124.801173] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 124.801179] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 124.801785] kobject: 'card0' (ffff880174ad92a0): kobject_uevent_env
[ 124.801940] kobject: 'card0' (ffff880174ad92a0): fill_kobj_path: path = '/devices/pci0000:00/0000:00:02.0/drm/card0'
[ 124.805032] kobject: 'card0' (ffff880174ad92a0): kobject_uevent_env
[ 124.805089] kobject: 'card0' (ffff880174ad92a0): fill_kobj_path: path = '/devices/pci0000:00/0000:00:02.0/drm/card0'
[ 125.511744] [drm:gen8_do_reset [i915]] *ERROR* render ring: reset request timeout
[ 125.511922] [drm] Simulated gpu hang, resetting stop_rings
[ 125.511927] drm/i915: Resetting chip after gpu hang
[ 125.511954] [drm:i915_reset [i915]] *ERROR* Failed to reset chip: -5
[ 125.637612] kms_pipe_crc_basic: exiting, ret=0
[ 125.653608] [drm:intel_lr_context_deferred_alloc [i915]] *ERROR* ring create req: -5
[ 125.847695] gem_ctx_param_basic: executing
[ 125.850086] [drm:intel_lr_context_deferred_alloc [i915]] *ERROR* ring create req: -5
[ 125.854482] gem_ctx_param_basic: exiting, ret=99
[ 126.038693] kms_addfb_basic: executing
[ 126.041754] [drm:intel_lr_context_deferred_alloc [i915]] *ERROR* ring create req: -5
-Mika
>> + if (INTEL_INFO(dev)->gen == 9) {
>> + DRM_ERROR("Reset would risk system stability, bailing out\n");
>> + return -EIO;
>> + }
>> +
>> + DRM_ERROR("Forcing non ready gpu into reset\n");
>> +
>> + return gen6_do_reset(dev);
>> }
>>
>> static int (*intel_get_gpu_reset(struct drm_device *dev))(struct drm_device *)
>> --
>> 2.5.0
>>
>
> --
> Chris Wilson, Intel Open Source Technology Centre
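
For readers following the thread, the decision the patch makes on the
not_ready path can be modelled in isolation. This is only a userspace
sketch, not the driver code: model_not_ready_path() and
model_full_reset() are hypothetical names, and model_full_reset() merely
stands in for gen6_do_reset() by assuming the forced reset succeeds. The
-5 seen in the log above is -EIO on Linux, the value the unpatched path
always returned.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for gen6_do_reset(): this model simply assumes
 * the forced reset succeeds and returns 0.
 */
static int model_full_reset(void)
{
	return 0;
}

/*
 * Models the not_ready path after the patch: gen9 keeps bailing out
 * with -EIO, any other generation that gets here (gen8 in practice)
 * is forced into reset despite claiming it is not ready.
 */
static int model_not_ready_path(int gen)
{
	if (gen == 9) {
		fprintf(stderr, "Reset would risk system stability, bailing out\n");
		return -EIO;
	}

	fprintf(stderr, "Forcing non ready gpu into reset\n");
	return model_full_reset();
}
```

With this model, gen 9 yields -EIO while gen 8 yields whatever the full
reset returns, which is the behavioural difference the patch introduces.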