[Intel-gfx] [RFC 1/2] drm/i915: Improve record of hung engines in error state

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Nov 4 13:03:56 UTC 2020


On 04/11/2020 12:30, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-11-04 12:20:42)
>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>
>> Between events which trigger engine and GPU resets and capturing the error
>> state we lose information on which engine triggered the reset. Improve
>> this by passing in the hung engine mask down to error capture.
>>
>> Result is that the list of engines in user visible "GPU HANG: ecode
>> <gen>:<engines>:<ecode>, <process>" is now a list of hanging and not just
>> active engines. Most importantly the displayed process is now the one
>> which was actually hung.
> 
> You could also suggest to only include the hanging engine in the report,
> as is intended to be the normal means of generating the report

I thought it would be useful to have the full picture, but I can do
that as well.
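
To make the "full picture" option concrete, here is a minimal userspace sketch of recording every active engine while flagging the ones named in the hung mask. The types and the capture_engines() helper are hypothetical stand-ins for illustration, not the real i915 definitions or the code in this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver types; not the real i915 structs. */
typedef uint32_t engine_mask_t;

struct engine {
	const char *name;
	engine_mask_t mask;	/* single bit identifying this engine */
	bool active;		/* had work in flight at capture time */
};

struct engine_dump {
	const struct engine *engine;
	bool hung;		/* set when the engine is in the hung mask */
};

/*
 * Record every active engine and flag those named in hung_mask.
 * Returns the number of entries written to out (at most max).
 */
static size_t capture_engines(const struct engine *engines, size_t count,
			      engine_mask_t hung_mask,
			      struct engine_dump *out, size_t max)
{
	size_t n = 0;

	for (size_t i = 0; i < count && n < max; i++) {
		if (!engines[i].active)
			continue;

		out[n].engine = &engines[i];
		out[n].hung = (engines[i].mask & hung_mask) != 0;
		n++;
	}

	return n;
}
```

Reporting only the hanging engines, as suggested, would amount to skipping entries whose mask bit is not set in hung_mask instead of recording them with hung == false.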

>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h
>> index 0220b0992808..3a7ca90a3436 100644
>> --- a/drivers/gpu/drm/i915/i915_gpu_error.h
>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.h
>> @@ -59,6 +59,7 @@ struct i915_request_coredump {
>>   struct intel_engine_coredump {
>>          const struct intel_engine_cs *engine;
>>   
>> +       bool hung;
>>          bool simulated;
>>          u32 reset_count;
>>   
>> @@ -218,8 +219,10 @@ struct drm_i915_error_state_buf {
>>   __printf(2, 3)
>>   void i915_error_printf(struct drm_i915_error_state_buf *e, const char *f, ...);
>>   
>> -struct i915_gpu_coredump *i915_gpu_coredump(struct drm_i915_private *i915);
>> -void i915_capture_error_state(struct drm_i915_private *i915);
>> +struct i915_gpu_coredump *i915_gpu_coredump(struct intel_gt *gt,
>> +                                           intel_engine_mask_t engine_mask);
>> +void i915_capture_error_state(struct intel_gt *gt,
>> +                             intel_engine_mask_t engine_mask);
> 
> Don't forget the stubs.

Right, thanks.
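
For reference, a rough sketch of what matching stubs could look like with the new signatures. SKETCH_CAPTURE_ERROR is a hypothetical switch standing in for the real Kconfig check, and the intel_engine_mask_t typedef is an assumption added so the fragment compiles on its own. Which functions actually need stubs in the real header is not shown in the quoted hunk.

```c
#include <stddef.h>
#include <stdint.h>

/* Forward declarations and a stand-in typedef so the sketch compiles alone. */
struct intel_gt;
struct i915_gpu_coredump;
typedef uint32_t intel_engine_mask_t;	/* assumption: real width may differ */

#ifdef SKETCH_CAPTURE_ERROR	/* hypothetical; stands in for the Kconfig option */

struct i915_gpu_coredump *i915_gpu_coredump(struct intel_gt *gt,
					    intel_engine_mask_t engine_mask);
void i915_capture_error_state(struct intel_gt *gt,
			      intel_engine_mask_t engine_mask);

#else

/* Stubs: same signatures as above, but no capture is performed. */
static inline struct i915_gpu_coredump *
i915_gpu_coredump(struct intel_gt *gt, intel_engine_mask_t engine_mask)
{
	(void)gt;
	(void)engine_mask;
	return NULL;
}

static inline void
i915_capture_error_state(struct intel_gt *gt, intel_engine_mask_t engine_mask)
{
	(void)gt;
	(void)engine_mask;
}

#endif
```

Keeping the stub signatures identical to the real prototypes means callers build unchanged whether or not error capture is compiled in.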

Regards,

Tvrtko

