[PATCH v17 4/7] drm/xe/guc: Extract GuC error capture lists
Teres Alexis, Alan Previn
alan.previn.teres.alexis at intel.com
Wed Aug 28 00:22:42 UTC 2024
So I compared the v15 vs v17 and looked at the difference..
I already did a full review in v14 and v15 addressed v14.
That said, everything looks good here except a couple of minor nits.
Reviewed-by: Alan Previn <alan.previn.teres.alexis at intel.com>
On Tue, 2024-08-27 at 14:47 -0700, Zhanjun Dong wrote:
> Upon the G2H Notify-Err-Capture event, parse through the
> GuC Log Buffer (error-capture-subregion) and generate one or
> more capture-nodes. A single node represents a single "engine-
> instance-capture-dump" and contains at least 3 register lists:
> global, engine-class and engine-instance. An internal link
> list is maintained to store one or more nodes.
>
>
alan:snip
> +static void
> +guc_capture_init_node(struct xe_guc *guc, struct __guc_capture_parsed_output *node)
> +{
> + struct guc_mmio_reg *tmp[GUC_STATE_CAPTURE_TYPE_MAX];
> + int i;
> +
> + for (i = 0; i < GUC_STATE_CAPTURE_TYPE_MAX; ++i) {
> + tmp[i] = node->reginfo[i].regs;
> + memset(tmp[i], 0, sizeof(struct guc_mmio_reg) *
> + guc->capture->max_mmio_per_node);
> + }
> + memset(node, 0, sizeof(*node));
> + for (i = 0; i < GUC_STATE_CAPTURE_TYPE_MAX; ++i)
> + node->reginfo[i].regs = tmp[i];
> +
> + INIT_LIST_HEAD(&node->link);
> +}
> +
> +/**
> + * DOC: Init, G2H-event and reporting flows for GuC-error-capture
> + *
> + * KMD Init time flows:
> + * --------------------
> + * --> alloc A: GuC input capture regs lists (registered to GuC via ADS).
> + * xe_guc_ads acquires the register lists by calling
> + * xe_guc_capture_list_size and xe_guc_capture_list_get 'n' times,
alan: looks like we've been updating the codes but not this documentation:
:%s/xe_guc_capture_list_size/xe_guc_capture_getlistsize
:%s/xe_guc_capture_list_get/xe_guc_capture_getlist
> + * where n = 1 for global-reg-list +
> + * num_engine_classes for class-reg-list +
> + * num_engine_classes for instance-reg-list
> + * (since all instances of the same engine-class type
> + * have an identical engine-instance register-list).
> + * ADS module also calls separately for PF vs VF.
> + *
> + * --> alloc B: GuC output capture buf (registered via guc_init_params(log_param))
> + * Size = #define CAPTURE_BUFFER_SIZE (warns if on too-small)
> + * Note2: 'x 3' to hold multiple capture groups
> + *
> + * GUC Runtime notify capture:
> + * --------------------------
> + * --> G2H STATE_CAPTURE_NOTIFICATION
> + * L--> xe_guc_capture_process
> + * L--> Loop through B (head..tail) and for each engine instance's
> + * err-state-captured register-list we find, we alloc 'C':
> + * --> alloc C: A capture-output-node structure that includes misc capture info along
> + * with 3 register list dumps (global, engine-class and engine-instance)
> + * This node is created from a pre-allocated list of blank nodes in
> + * guc->capture->cachelist and populated with the error-capture
> + * data from GuC and then it's added into guc->capture->outlist linked
> + * list. This list is used for matchup and printout by xe_devcoredump_read
> + * and xe_hw_engine_snapshot_print, (when user invokes the devcoredump sysfs).
> + *
> + * GUC --> notify context reset:
> + * -----------------------------
> + * --> guc_exec_queue_timedout_job
> + * L--> xe_devcoredump
> + * L--> devcoredump_snapshot(..IS_GUC_CAPTURE)
alan: nit: can we remove the "IS_GUC_CAPTURE"
> + * --> xe_hw_engine_snapshot_capture(..IS_GUC_CAPTURE)
alan: nit: same here.. remove the "IS_GUC_CAPTURE"
> + *
> + * User Sysfs / Debugfs
> + * --------------------
> + * --> xe_devcoredump_read->
> + * L--> xxx_snapshot_print
> + * L--> xe_hw_engine_snapshot_print
> + * Print register lists values saved at
> + * guc->capture->outlist
> + *
> + */
> +
alan:snip
More information about the Intel-xe
mailing list