[Intel-gfx] [PATCH] drm/i915: Skip error capture when wedged on init

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Nov 10 11:34:27 UTC 2021


On 10/11/2021 10:48, Matthew Auld wrote:
> On Tue, 9 Nov 2021 at 12:20, Tvrtko Ursulin
> <tvrtko.ursulin at linux.intel.com> wrote:
>>
>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>
>> Trying to capture uninitialised engines when we wedged on init ends in
>> tears. Skip that together with uC capture, since failure to initialise the
>> latter can actually be one of the reasons for wedging on init.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> 
> This fixes the issue with missing GuC wedging the GPU and then blowing
> up when trying to use the driver?

It probably does not blow up when using the driver, but it definitely does
when accessing the error state. Someone suggested it would be better to
call i915_disable_error_state from wedge on init/fini instead, and I
agree, so I plan to send a v2 along those lines.

Regards,

Tvrtko

> Reviewed-by: Matthew Auld <matthew.auld at intel.com>
> 
>> ---
>>   drivers/gpu/drm/i915/i915_gpu_error.c | 10 +++++++---
>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>> index 2a2d7643b551..aa2b3aad9643 100644
>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>> @@ -1866,10 +1866,14 @@ i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
>>                  }
>>
>>                  gt_record_info(error->gt);
>> -               gt_record_engines(error->gt, engine_mask, compress);
>>
>> -               if (INTEL_INFO(i915)->has_gt_uc)
>> -                       error->gt->uc = gt_record_uc(error->gt, compress);
>> +               if (!intel_gt_has_unrecoverable_error(gt)) {
>> +                       gt_record_engines(error->gt, engine_mask, compress);
>> +
>> +                       if (INTEL_INFO(i915)->has_gt_uc)
>> +                               error->gt->uc = gt_record_uc(error->gt,
>> +                                                            compress);
>> +               }
>>
>>                  i915_vma_capture_finish(error->gt, compress);
>>
>> --
>> 2.30.2
>>

