[Intel-gfx] [RFC v3 7/7] drm/i915/guc: Print the GuC error capture output register list.

Teres Alexis, Alan Previn alan.previn.teres.alexis at intel.com
Tue Jan 11 21:54:22 UTC 2022


In RFC rev2, Matt Brost requested a comparison of the error capture output from execlist vs guc-capture.
I've added that data at the following links:

gem_exec_capture_errordump_ADLS_execlist : https://pastebin.com/RBwkHFNq
gem_exec_capture_errordump_ADLS_gucsubmission: https://pastebin.com/8k5p3kSZ

This result was obtained after applying the additional fix reported below.
I don't think I can make them match exactly, but it's close enough and gem_exec_capture-capture passes.

...alan


On Tue, 2022-01-11 at 01:30 -0800, Alan Previn wrote:
> Print the GuC captured error state register list (string names
> and values) when gpu_coredump_state printout is invoked via
> the i915 debugfs for flushing the gpu error-state that was
> captured prior.
> 
> 

> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> index 048b1b7b9259..04b6d25abd47 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> +		}
> +		for (ee = gt->engine; ee; ee = ee->next) {
> +			const struct i915_vma_coredump *vma;
> +
> +			if (ee->engine == eng && ee->gucinfo.eng_id == guc_enginst &&
> +			    ee->gucinfo.guc_id == guc_gucid &&
> +			    (ee->gucinfo.lrca & CTX_GTT_ADDRESS_MASK) ==
> +			    (guc_lrca & CTX_GTT_ADDRESS_MASK)) {
> 

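There is a bug in the code above, discovered this morning after further debugging of certain subtest failures.
ee->gucinfo.eng_id has to be decoded into engine instance and engine class before it is compared against the GuC-reported values: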

-                       if (ee->engine == eng && ee->gucinfo.eng_id == guc_enginst &&
+                       if (ee->engine == eng &&
+                           guc_enginst == GUC_ID_TO_ENGINE_INSTANCE(ee->gucinfo.eng_id) &&
+                           guc_engclss == GUC_ID_TO_ENGINE_CLASS(ee->gucinfo.eng_id) &&
                            ee->gucinfo.guc_id == guc_gucid &&
                            (ee->gucinfo.lrca & CTX_GTT_ADDRESS_MASK) ==
                            (guc_lrca & CTX_GTT_ADDRESS_MASK)) {
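
To illustrate why the original comparison could miss, here is a minimal standalone sketch of the decode-then-compare check. Everything prefixed with sketch_/SKETCH_ is hypothetical and only stands in for the driver's own macros and structs; the bit layout (class in bits 0-2, instance in bits 3-6) and the address mask (bits 31:12) are assumptions for illustration and should be checked against the i915 GuC firmware interface headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, for illustration only: class in bits 0-2, instance in bits 3-6. */
#define SKETCH_CLASS_SHIFT      0
#define SKETCH_CLASS_MASK       (0x7u << SKETCH_CLASS_SHIFT)
#define SKETCH_INSTANCE_SHIFT   3
#define SKETCH_INSTANCE_MASK    (0xfu << SKETCH_INSTANCE_SHIFT)

/* Assumed: the low 12 bits of the LRCA carry flags, the rest is the GGTT address. */
#define SKETCH_GTT_ADDRESS_MASK 0xfffff000u

/* Pack an engine class and instance into one combined id (hypothetical helper). */
static inline uint32_t sketch_make_id(uint32_t engclass, uint32_t instance)
{
	return (engclass << SKETCH_CLASS_SHIFT) | (instance << SKETCH_INSTANCE_SHIFT);
}

static inline uint32_t sketch_id_to_class(uint32_t id)
{
	return (id & SKETCH_CLASS_MASK) >> SKETCH_CLASS_SHIFT;
}

static inline uint32_t sketch_id_to_instance(uint32_t id)
{
	return (id & SKETCH_INSTANCE_MASK) >> SKETCH_INSTANCE_SHIFT;
}

/* Stand-in for the per-engine info recorded at capture time (hypothetical). */
struct sketch_gucinfo {
	uint32_t eng_id;  /* combined class + instance id */
	uint32_t guc_id;  /* context id */
	uint32_t lrca;    /* context address plus low flag bits */
};

/*
 * Return true when the recorded engine info matches the values reported in
 * the capture node. The combined id is decoded before comparing, and only
 * the address bits of the LRCA take part in the comparison.
 */
static inline bool sketch_capture_matches(const struct sketch_gucinfo *info,
					  uint32_t engclss, uint32_t enginst,
					  uint32_t gucid, uint32_t lrca)
{
	return enginst == sketch_id_to_instance(info->eng_id) &&
	       engclss == sketch_id_to_class(info->eng_id) &&
	       info->guc_id == gucid &&
	       (info->lrca & SKETCH_GTT_ADDRESS_MASK) ==
	       (lrca & SKETCH_GTT_ADDRESS_MASK);
}
```

Under these assumed shifts, class 1 / instance 2 packs to a combined id of 17, so comparing the raw id directly against instance 2 would never match, which is the kind of miss the fix above addresses.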




> +				PRINT(&i915->drm, ebuf, "i915-Ctx-VMA-Matched:\n");
> +				GCAP_PRINT_BATCH(i915, ebuf, ee, batch);
> +				PRINT(&i915->drm, ebuf, "  engine reset count: %u\n",
> +				      ee->reset_count);
> +				ctx = &ee->context;
> +				GCAP_PRINT_CONTEXT(i915, ebuf, ctx);
> +
> +				for (vma = ee->vma; vma; vma = vma->next)
> +					intel_gpu_error_print_vma(ebuf, ee->engine, vma);
> +			}
> +		}

