[Intel-gfx] [PATCH 1/7] drm/i915/guc: Reset GuC and retry on firmware load failure
david.s.gordon at intel.com
Wed Mar 23 16:03:18 UTC 2016
On 21/03/16 16:58, Arun Siluvery wrote:
> On 21/03/2016 10:16, Dave Gordon wrote:
>> From: Arun Siluvery <arun.siluvery at linux.intel.com>
>> Due to timing issues in the HW, some of the status bits required for GuC
>> authentication occasionally don't get set. When that happens, the GuC
>> cannot be initialized and we will be left with a wedged GPU. The WA suggested is
>> to perform a soft reset of the GuC and attempt to reload the fw a few
>> times before giving up.
>> As the failure is dependent on timing, tests performed by triggering
>> full gpu reset (i915_wedged) showed that we could sometimes hit this
>> after several thousand iterations, but sometimes tests ran even longer
>> without any issues. The reset and reload mechanism proved helpful when we
>> did hit a fw load failure, so it is better to include this to improve
>> driver stability.
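For anyone skimming the thread, the reset-and-retry flow described in the commit message can be sketched roughly as below. This is only an illustration under stated assumptions: `struct fake_guc`, `fake_guc_reset()`, `fake_guc_load_fw()` and `load_fw_with_retry()` are hypothetical stand-ins that simulate the hardware, not the driver's actual functions, and resetting before every attempt is my reading of the description rather than a statement about the patch.

```c
#include <assert.h>

/* Hypothetical simulated device; NOT the i915 driver's data structures. */
struct fake_guc {
	int failures_left;	/* load attempts that will still fail */
	int resets;		/* soft resets performed so far */
};

/* Stand-in for a GuC soft reset (assumed name). */
static void fake_guc_reset(struct fake_guc *guc)
{
	guc->resets++;
}

/* Stand-in for the firmware load: 0 on success, -1 on failure. */
static int fake_guc_load_fw(struct fake_guc *guc)
{
	if (guc->failures_left > 0) {
		guc->failures_left--;
		return -1;
	}
	return 0;
}

/* Soft-reset, then try to load; give up after max_attempts tries. */
static int load_fw_with_retry(struct fake_guc *guc, int max_attempts)
{
	int err = -1;
	int attempt;

	assert(max_attempts > 0);

	for (attempt = 0; attempt < max_attempts; attempt++) {
		fake_guc_reset(guc);
		err = fake_guc_load_fw(guc);
		if (err == 0)
			break;	/* loaded successfully, stop retrying */
	}
	return err;
}
```

With a bounded attempt count, a persistent failure is still reported to the caller instead of looping forever.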
>> This change implements the following WA,
>> Signed-off-by: Arun Siluvery <arun.siluvery at linux.intel.com>
>> Signed-off-by: Dave Gordon <david.s.gordon at intel.com>
>> Cc: Alex Dai <yu.dai at intel.com>
> This patch was previously reviewed by Alex,
OK, I'm going to repost just the first two from this set, tagged with
R-Bs from Alex & you, and the Bugzilla reference; then the rest can form
a new sequence for Alex to review when he gets back from vacation.
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h
>> index 07e0449..cc71ca2 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -165,6 +165,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t
>> #define GEN6_GRDOM_MEDIA (1 << 2)
>> #define GEN6_GRDOM_BLT (1 << 3)
>> #define GEN6_GRDOM_VECS (1 << 4)
>> +#define GEN9_GRDOM_GUC (1 << 5)
>> #define GEN8_GRDOM_MEDIA2 (1 << 7)
> In the original patch GEN9_GRDOM_GUC was defined as above, but during
> the TDR patch review the option of arranging them by gen was
> explored. In that case GEN9_GRDOM_GUC would be at the end.
I think they're better in bit-number order, because that way it's easier
to see that you haven't accidentally got overlapping definitions.
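To illustrate the overlap concern, here is a small standalone sketch. The bit values are copied from the hunk quoted above; the helper `masks_are_disjoint()` is a hypothetical name for this example only and is not part of the driver.

```c
#include <assert.h>

/* Values as in the quoted hunk, listed in bit-number order. */
#define GEN6_GRDOM_MEDIA	(1 << 2)
#define GEN6_GRDOM_BLT		(1 << 3)
#define GEN6_GRDOM_VECS		(1 << 4)
#define GEN9_GRDOM_GUC		(1 << 5)
#define GEN8_GRDOM_MEDIA2	(1 << 7)

/*
 * Hypothetical helper: returns 1 if no two of the n masks share a bit,
 * 0 if any bit is claimed twice.
 */
static int masks_are_disjoint(const unsigned int *masks, int n)
{
	unsigned int seen = 0;
	int i;

	assert(n >= 0);

	for (i = 0; i < n; i++) {
		if (seen & masks[i])
			return 0;	/* bit already used by an earlier mask */
		seen |= masks[i];
	}
	return 1;
}
```

A check like this catches a clash mechanically, but with the definitions sorted by bit number a duplicate shift value also stands out at a glance, which is the point of keeping them in that order.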