[Intel-gfx] [PATCH] drm/i915: fix SFC reset flow

Daniele Ceraolo Spurio daniele.ceraolospurio at intel.com
Tue Sep 17 18:36:35 UTC 2019



On 9/17/2019 12:57 AM, Chris Wilson wrote:
> Quoting Daniele Ceraolo Spurio (2019-09-16 22:41:04)
>> Our assumption that the we can ask the HW to lock the SFC even if not
>> currently in use does not match the HW commitment. The expectation from
>> the HW is that SW will not try to lock the SFC if the engine is not
>> using it and if we do that the behavior is undefined; on ICL the HW
>> ends up to returning the ack and ignoring our lock request, but this is
>> not guaranteed and we shouldn't expect it going forward.
>>
>> Reported-by: Owen Zhang <owen.zhang at intel.com>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>
>> ---
>> @@ -366,10 +368,13 @@ static u32 gen11_lock_sfc(struct intel_engine_cs *engine)
>>                                           sfc_forced_lock_ack_bit,
>>                                           sfc_forced_lock_ack_bit,
>>                                           1000, 0, NULL)) {
>> -               DRM_DEBUG_DRIVER("Wait for SFC forced lock ack failed\n");
>> +               /* did we race the unlock? */
>> +               if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
>> +                       DRM_ERROR("Wait for SFC forced lock ack failed\n");
> What's our plan if this *ERROR* is ever triggered?
>
> If it remains do nothing and check the logs on death, then it remains
> just a debug splat. If there is a plan to actually do something to
> handle the error, do it!
> -Chris

AFAIU the only thing we can do is escalate to full gpu reset. However, 
the probability of this failing should be next to non-existent (only one 
engine can use the SFC at any time so there is no lock contention), so 
I'm not convinced the fallback is worth the effort. The error is still 
useful IMO to catch unexpected behavior on new platforms, as it happened 
in this case with the media team reporting seeing this message on gen12 
with the previous behavior. This said, I'm happy to add the extra logic 
if you believe it is worth it.

Daniele



More information about the Intel-gfx mailing list