[PATCH v17 4/7] drm/xe/guc: Extract GuC error capture lists

Teres Alexis, Alan Previn alan.previn.teres.alexis at intel.com
Wed Aug 28 00:22:42 UTC 2024


So I compared the v15 vs v17 and looked at the difference..
I already did a full review in v14 and v15 addressed v14.
That said, everything looks good here except a couple of minor nits.

Reviewed-by: Alan Previn <alan.previn.teres.alexis at intel.com>


On Tue, 2024-08-27 at 14:47 -0700, Zhanjun Dong wrote:
> Upon the G2H Notify-Err-Capture event, parse through the
> GuC Log Buffer (error-capture-subregion) and generate one or
> more capture-nodes. A single node represents a single "engine-
> instance-capture-dump" and contains at least 3 register lists:
> global, engine-class and engine-instance. An internal link
> list is maintained to store one or more nodes.
> 
> 
alan:snip

> +static void
> +guc_capture_init_node(struct xe_guc *guc, struct __guc_capture_parsed_output *node)
> +{
> +       struct guc_mmio_reg *tmp[GUC_STATE_CAPTURE_TYPE_MAX];
> +       int i;
> +
> +       for (i = 0; i < GUC_STATE_CAPTURE_TYPE_MAX; ++i) {
> +               tmp[i] = node->reginfo[i].regs;
> +               memset(tmp[i], 0, sizeof(struct guc_mmio_reg) *
> +                      guc->capture->max_mmio_per_node);
> +       }
> +       memset(node, 0, sizeof(*node));
> +       for (i = 0; i < GUC_STATE_CAPTURE_TYPE_MAX; ++i)
> +               node->reginfo[i].regs = tmp[i];
> +
> +       INIT_LIST_HEAD(&node->link);
> +}
> +
> +/**
> + * DOC: Init, G2H-event and reporting flows for GuC-error-capture
> + *
> + * KMD Init time flows:
> + * --------------------
> + *     --> alloc A: GuC input capture regs lists (registered to GuC via ADS).
> + *                  xe_guc_ads acquires the register lists by calling
> + *                  xe_guc_capture_list_size and xe_guc_capture_list_get 'n' times,
alan: looks like we've been updating the codes but not this documentation:
:%s/xe_guc_capture_list_size/xe_guc_capture_getlistsize
:%s/xe_guc_capture_list_get/xe_guc_capture_getlist
> + *                  where n = 1 for global-reg-list +
> + *                            num_engine_classes for class-reg-list +
> + *                            num_engine_classes for instance-reg-list
> + *                               (since all instances of the same engine-class type
> + *                                have an identical engine-instance register-list).
> + *                  ADS module also calls separately for PF vs VF.
> + *
> + *     --> alloc B: GuC output capture buf (registered via guc_init_params(log_param))
> + *                  Size = #define CAPTURE_BUFFER_SIZE (warns if on too-small)
> + *                  Note2: 'x 3' to hold multiple capture groups
> + *
> + * GUC Runtime notify capture:
> + * --------------------------
> + *     --> G2H STATE_CAPTURE_NOTIFICATION
> + *                   L--> xe_guc_capture_process
> + *                           L--> Loop through B (head..tail) and for each engine instance's
> + *                                err-state-captured register-list we find, we alloc 'C':
> + *      --> alloc C: A capture-output-node structure that includes misc capture info along
> + *                   with 3 register list dumps (global, engine-class and engine-instance)
> + *                   This node is created from a pre-allocated list of blank nodes in
> + *                   guc->capture->cachelist and populated with the error-capture
> + *                   data from GuC and then it's added into guc->capture->outlist linked
> + *                   list. This list is used for matchup and printout by xe_devcoredump_read
> + *                   and xe_hw_engine_snapshot_print, (when user invokes the devcoredump sysfs).
> + *
> + * GUC --> notify context reset:
> + * -----------------------------
> + *     --> guc_exec_queue_timedout_job
> + *                   L--> xe_devcoredump
> + *                          L--> devcoredump_snapshot(..IS_GUC_CAPTURE)
alan: nit: can we remove the "IS_GUC_CAPTURE"
> + *                               --> xe_hw_engine_snapshot_capture(..IS_GUC_CAPTURE)
alan: nit: same here.. remove the "IS_GUC_CAPTURE"
> + *
> + * User Sysfs / Debugfs
> + * --------------------
> + *      --> xe_devcoredump_read->
> + *             L--> xxx_snapshot_print
> + *                    L--> xe_hw_engine_snapshot_print
> + *                         Print register lists values saved at
> + *                         guc->capture->outlist
> + *
> + */
> +
alan:snip


More information about the Intel-xe mailing list