[PATCH v18 5/6] drm/xe/guc: Plumb GuC-capture into dev coredump

Dong, Zhanjun zhanjun.dong at intel.com
Wed Sep 11 20:07:58 UTC 2024


Please see my comments inline below:

Regards,
Zhanjun

On 2024-09-09 7:23 p.m., Teres Alexis, Alan Previn wrote:
> Thanks again for the diligence on those many recent the offline conversations + testing
> to make the code flow more readible and also tacking the all the race conditions related
> to the link list usage for both the kmd-manual and guc-captures.
> 
> I have 1 trivial issue that i believe needs to fixed and two nits.
> Thus, here is a conditional RB (need that fix or an explaination):
> 
> Reviewed-by: Alan Previn <alan.previn.teres.alexis at intel.com>
> (feel free to keep this on next rev if no additional changes were made).
> 
> 
Thanks Alan.

...
>> @@ -805,13 +834,14 @@ static void
>>   guc_capture_add_node_to_list(struct __guc_capture_parsed_output *node,
>>                               struct list_head *list)
>>   {
>> -       list_add_tail(&node->link, list);
>> +       list_add(&node->link, list);
>>   }
>>   
>>   static void
>>   guc_capture_add_node_to_outlist(struct xe_guc_state_capture *gc,
>>                                  struct __guc_capture_parsed_output *node)
>>   {
>> +       guc_capture_remove_stale_matches_from_list(gc, node);
>>          guc_capture_add_node_to_list(node, &gc->outlist);
>>   }
>>   
>> @@ -822,6 +852,31 @@ guc_capture_add_node_to_cachelist(struct xe_guc_state_capture *gc,
>>          guc_capture_add_node_to_list(node, &gc->cachelist);
>>   }
>>   
>> +static void
>> +guc_capture_free_outlist_node(struct xe_guc_state_capture *gc,
>> +                             struct __guc_capture_parsed_output *n)
>> +{
>> +       if (n) {
>> +               n->locked = 0;
> alan: This function is getting called from two places:
> 
> In case 1, this function is getting called by devcoredump to release
> the locked reference to a matching node that devcoredump requested
> previously when it called "xe_guc_capture_get_matching_and_lock".
> 
> In case 2, this function is called by guc_capture_add_node_to_outlist
> (after a fresh new node is extracted in response to G2H-Err-Capture-Notification)
> to remove any old node that happen to have the same guc-id and engine class. This
> is intentional because if its one form the past, it will never be used for a
> future devcoredump call to get a match.
> 
> 
> However, this function SHOULD check the lock so that:
> - if called from case 2, it should not free it since devcoredump is actually using it.
Yes, that is a bug, and will be fixed by add condition check of !locked 
in next rev.
> - if called from case 1, it should free it since this would happen at the time
> devcoredump is freed after user has retrieved it.
Right, it should be free in this case.

> 
>> +               list_del(&n->link);
>> +               /* put node back to cache list */
>> +               guc_capture_add_node_to_cachelist(gc, n);
>> +       }
>> +}
>> +
>> +static void
>> +guc_capture_remove_stale_matches_from_list(struct xe_guc_state_capture *gc,
>> +                                          struct __guc_capture_parsed_output *node)
>> +{
>> +       struct __guc_capture_parsed_output *n, *ntmp;
>> +       int guc_id = node->guc_id;
>> +
>> +       list_for_each_entry_safe(n, ntmp, &gc->outlist, link) {
>> +               if (n != node && n->guc_id == guc_id)
>> +                       guc_capture_free_outlist_node(gc, n);
>> +       }
>> +}
>> +
>>   static void
>>   guc_capture_init_node(struct xe_guc *guc, struct __guc_capture_parsed_output *node)
>>   {
...


More information about the Intel-xe mailing list