[PATCH v26 4/6] drm/xe/guc: Extract GuC error capture lists

Dong, Zhanjun zhanjun.dong at intel.com
Fri Oct 4 02:21:28 UTC 2024


Thanks Alan.
Will update it and post another rev soon.

Regards,
Zhanjun

On 2024-10-03 4:57 p.m., Teres Alexis, Alan Previn wrote:
> I've reviewed the delta from my last rb on this patch and only found a minor nit which you can choose to ignore.
> Thus:
> Reviewed-by: Alan Previn <alan.previn.teres.alexis at intel.com>
> thanks.
> 
> 
> 
> On Thu, 2024-10-03 at 08:16 -0700, Zhanjun Dong wrote:
>> Upon the G2H Notify-Err-Capture event, parse through the
>> GuC Log Buffer (error-capture-subregion) and generate one or
>> more capture-nodes. A single node represents a single "engine-
>> instance-capture-dump" and contains at least 3 register lists:
>> global, engine-class and engine-instance. An internal linked
>> list is maintained to store one or more nodes.
> alan:snip
> 
> 
>> +struct __guc_capture_parsed_output {
>> +       /*
>> +        * A single set of 3 capture lists: a global-list,
>> +        * an engine-class-list and an engine-instance list.
>> +        * outlist in __guc_capture_parsed_output will keep
>> +        * a linked list of these nodes that will eventually
>> +        * be detached from outlist and attached to
>> +        * xe_codedump in response to a context reset
>> +        */
>> +       struct list_head link;
>> +       bool is_partial;
>> +       u32 eng_class;
>> +       u32 eng_inst;
>> +       u32 guc_id;
>> +       u32 lrca;
>> +       struct gcap_reg_list_info {
>> +               u32 vfid;
>> +               u32 num_regs;
>> +               struct guc_mmio_reg *regs;
>> +       } reginfo[GUC_STATE_CAPTURE_TYPE_MAX];
>> +#define GCAP_PARSED_REGLIST_INDEX_GLOBAL   BIT(GUC_STATE_CAPTURE_TYPE_GLOBAL)
>> +#define GCAP_PARSED_REGLIST_INDEX_ENGCLASS BIT(GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS)
>> +};
>> +
>>   /*
>>    * Define all device tables of GuC error capture register lists
>>    * NOTE:
>> @@ -221,6 +267,12 @@ struct xe_guc_state_capture {
>>                                                  [GUC_STATE_CAPTURE_TYPE_MAX]
>>                                                  [GUC_CAPTURE_LIST_CLASS_MAX];
>>          void *ads_null_cache;
>> +       struct list_head cachelist;
>> +#define PREALLOC_NODES_MAX_COUNT (3 * GUC_MAX_ENGINE_CLASSES * GUC_MAX_INSTANCES_PER_CLASS)
>> +#define PREALLOC_NODES_DEFAULT_NUMREGS 64
>> +
>> +       int max_mmio_per_node;
>> +       struct list_head outlist;
>>   };
>>   
>>   static const struct __guc_mmio_reg_descr_group *
>> @@ -451,7 +503,10 @@ guc_cap_list_num_regs(struct xe_guc *guc, u32 owner, u32 type,
>>                  num_regs += match->num_regs;
>>          else
>>                  /* Estimate steering register size for rcs/ccs */
> 
> alan: nit: maybe the comment needs more clarity here, otherwise one might
> wonder why we still provide a number when extlists returned null.
> Please correct me if I'm wrong, but I assume it might read something like:
> "if a caller wants the full register dump size but we have not yet got the hw-config,
> i.e. while max_mmio_per_node is still uninitialized, then provide a worst-case number
> for extlists based on max dss fuse bits, but only ever for render/compute"
>> -               if (capture_class == GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE)
>> +               if (owner == GUC_CAPTURE_LIST_INDEX_PF &&
>> +                   type == GUC_STATE_CAPTURE_TYPE_ENGINE_CLASS &&
>> +                   capture_class == GUC_CAPTURE_LIST_CLASS_RENDER_COMPUTE &&
>> +                   !guc->capture->max_mmio_per_node)
>>                          num_regs += guc_capture_get_steer_reg_num(guc_to_xe(guc)) *
>>                                      XE_MAX_DSS_FUSE_BITS;
>>   
> alan:snip

