[Intel-gfx] [PATCH] drm/i915/guc: Don't read SOFT_SCRATCH(15) on MMIO error

Michal Wajdeczko michal.wajdeczko at intel.com
Thu Jul 12 17:29:58 UTC 2018


On Thu, 12 Jul 2018 17:31:14 +0200, Chris Wilson  
<chris at chris-wilson.co.uk> wrote:

> Quoting Chris Wilson (2018-05-29 15:54:12)
>> Quoting Michal Wajdeczko (2018-05-28 18:16:18)
>> > SOFT_SCRATCH(15) is used by GuC for sending MMIO GuC events to host  
>> and
>> > those events are now handled by  
>> intel_guc_to_host_event_handler_mmio().
>> >
>> > We should not try to read it on MMIO action error as 1) we may be  
>> using
>> > different set of registers for GuC MMIO communication, and 2) GuC may
>> > use CTB mechanism for sending events to host.
>>
>> Ok.
>>
>> > While here, upgrade error message to DRM_ERROR.
>>
>> Does the error help? What do you want to convey to the user? For error
>> handling, we want to propagate the result back anyway for the caller has
>> to decide what to do next.
>
> Good news! We see the error in BAT,
>
> [  542.138479] i915: unknown parameter 'enable_guc_loading' ignored
> [  542.138483] i915: unknown parameter 'enable_guc_submission' ignored
> [  542.138485] Setting dangerous option enable_guc - tainting kernel
> [  542.138488] Setting dangerous option live_selftests - tainting kernel
> [  542.173291] [drm:intel_guc_send_mmio [i915]] *ERROR* MMIO: GuC action  
> 0x10 failed with error -5 0xf000f000
> [  542.367055] i915: probe of 0000:00:02.0 failed with error -25
>
> And Michał reminded me this wasn't the first time...
>
> commit feb06c151fade9ecaa3dd410d792cce26e8b10de
> Author: Michał Winiarski <michal.winiarski at intel.com>
> Date:   Mon Mar 19 10:53:47 2018 +0100
>
>     drm/i915/guc: Demote GuC error messages
>    We're using those functions in selftests, and the callers are expected
>     to do the error handling anyways. Let's demote all GuC actions and
>     doorbell creation to DEBUG_DRIVER.
>
> So do we kindly ask Michał to resubmit his fix?

There are more places where DRM_ERROR is used after detection GuC error
(see intel_guc_send_ct as example, more to show up shortly)

I would rather prefer to add GUC_ERROR macro that could be tweaked
under SELFTEST config and runtime flags to demote unwanted errors:

#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
#define GUC_ERROR(...) \
	do { \
		if (unlikely(i915_selftest.mock || i915_selftest.live)) \
			DRM_DEBUG_DRIVER(__VA_ARGS__); \
		else \
			DRM_ERROR(__VA_ARGS__);
	} while (0);
#else
#define GUC_ERROR(...) DRM_ERROR(__VA_ARGS__)
#endif

/Michal


More information about the Intel-gfx mailing list