[Intel-gfx] [RFC v3 7/7] drm/i915/guc: Print the GuC error capture output register list.
Teres Alexis, Alan Previn
alan.previn.teres.alexis at intel.com
Tue Jan 11 21:54:22 UTC 2022
In RFC rev2, Matt Brost requested a comparison of the error capture from execlist vs guc-capture.
I've posted that data at the following links:
gem_exec_capture_errordump_ADLS_execlist : https://pastebin.com/RBwkHFNq
gem_exec_capture_errordump_ADLS_gucsubmission: https://pastebin.com/8k5p3kSZ
This result is obtained after an additional fix reported below.
I don't think I can make them match exactly, but it's close enough and gem_exec_capture-capture passes.
...alan
On Tue, 2022-01-11 at 01:30 -0800, Alan Previn wrote:
> Print the GuC captured error state register list (string names
> and values) when gpu_coredump_state printout is invoked via
> the i915 debugfs for flushing the gpu error-state that was
> captured prior.
>
>
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> index 048b1b7b9259..04b6d25abd47 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_capture.c
> + }
> + for (ee = gt->engine; ee; ee = ee->next) {
> + const struct i915_vma_coredump *vma;
> +
> + if (ee->engine == eng && ee->gucinfo.eng_id == guc_enginst &&
> + ee->gucinfo.guc_id == guc_gucid &&
> + (ee->gucinfo.lrca & CTX_GTT_ADDRESS_MASK) ==
> + (guc_lrca & CTX_GTT_ADDRESS_MASK)) {
>
There is a bug in the above code, discovered this morning after additional debugging of certain subtest failures:
- if (ee->engine == eng && ee->gucinfo.eng_id == guc_enginst &&
+ if (ee->engine == eng &&
+ guc_enginst == GUC_ID_TO_ENGINE_INSTANCE(ee->gucinfo.eng_id) &&
+ guc_engclss == GUC_ID_TO_ENGINE_CLASS(ee->gucinfo.eng_id) &&
ee->gucinfo.guc_id == guc_gucid &&
(ee->gucinfo.lrca & CTX_GTT_ADDRESS_MASK) ==
(guc_lrca & CTX_GTT_ADDRESS_MASK)) {
> + PRINT(&i915->drm, ebuf, "i915-Ctx-VMA-Matched:\n");
> + GCAP_PRINT_BATCH(i915, ebuf, ee, batch);
> + PRINT(&i915->drm, ebuf, " engine reset count: %u\n",
> + ee->reset_count);
> + ctx = &ee->context;
> + GCAP_PRINT_CONTEXT(i915, ebuf, ctx);
> +
> + for (vma = ee->vma; vma; vma = vma->next)
> + intel_gpu_error_print_vma(ebuf, ee->engine, vma);
> + }
> + }
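
For illustration, here is a minimal standalone sketch of why the original comparison can fail, assuming ee->gucinfo.eng_id holds a packed id that encodes both engine class and engine instance (which is what the fix implies). The bit layout and every EX_* / ex_* name below are hypothetical and chosen only for this example; they are not the driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical layout, for illustration only:
 * engine class in bits 0-2, engine instance in bits 3-6.
 */
#define EX_CLASS_SHIFT		0
#define EX_CLASS_MASK		(0x7u << EX_CLASS_SHIFT)
#define EX_INSTANCE_SHIFT	3
#define EX_INSTANCE_MASK	(0xfu << EX_INSTANCE_SHIFT)

#define EX_MAKE_ENGINE_ID(cls, inst) \
	((((uint32_t)(cls) << EX_CLASS_SHIFT) & EX_CLASS_MASK) | \
	 (((uint32_t)(inst) << EX_INSTANCE_SHIFT) & EX_INSTANCE_MASK))
#define EX_ID_TO_CLASS(id) \
	(((id) & EX_CLASS_MASK) >> EX_CLASS_SHIFT)
#define EX_ID_TO_INSTANCE(id) \
	(((id) & EX_INSTANCE_MASK) >> EX_INSTANCE_SHIFT)

/* Original check: compares the packed id directly against the instance. */
static bool ex_match_packed(uint32_t packed_id, uint32_t inst)
{
	return packed_id == inst;
}

/* Fixed check: decode class and instance, then compare each field. */
static bool ex_match_decoded(uint32_t packed_id, uint32_t cls, uint32_t inst)
{
	return EX_ID_TO_CLASS(packed_id) == cls &&
	       EX_ID_TO_INSTANCE(packed_id) == inst;
}

/* Small self-check: class 1, instance 2 under the assumed layout. */
static void ex_self_check(void)
{
	uint32_t id = EX_MAKE_ENGINE_ID(1, 2);

	assert(!ex_match_packed(id, 2));	/* packed id != bare instance */
	assert(ex_match_decoded(id, 1, 2));	/* decoded fields match */
}
```

Under this assumed layout, the direct comparison succeeds only when the packed id happens to equal the bare instance number (for example class 0, instance 0), which would be consistent with only certain subtests failing.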