[Intel-gfx] [PATCH 2/9] lib/ref_tracker: compact stacktraces before printing

Andrzej Hajda andrzej.hajda at intel.com
Fri Feb 18 10:54:54 UTC 2022



On 17.02.2022 16:23, Eric Dumazet wrote:
> On Thu, Feb 17, 2022 at 6:05 AM Andrzej Hajda <andrzej.hajda at intel.com> wrote:
>> In cases references are taken alternately on multiple exec paths leak
>> report can grow substantially, sorting and grouping leaks by stack_handle
>> allows to compact it.
>>
>> Signed-off-by: Andrzej Hajda <andrzej.hajda at intel.com>
>> Reviewed-by: Chris Wilson <chris.p.wilson at intel.com>
>> ---
>>   lib/ref_tracker.c | 35 +++++++++++++++++++++++++++--------
>>   1 file changed, 27 insertions(+), 8 deletions(-)
>>
>> diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
>> index 1b0c6d645d64a..0e9c7d2828ccb 100644
>> --- a/lib/ref_tracker.c
>> +++ b/lib/ref_tracker.c
>> @@ -1,5 +1,6 @@
>>   // SPDX-License-Identifier: GPL-2.0-or-later
>>   #include <linux/export.h>
>> +#include <linux/list_sort.h>
>>   #include <linux/ref_tracker.h>
>>   #include <linux/slab.h>
>>   #include <linux/stacktrace.h>
>> @@ -14,23 +15,41 @@ struct ref_tracker {
>>          depot_stack_handle_t    free_stack_handle;
>>   };
>>
>> +static int ref_tracker_cmp(void *priv, const struct list_head *a, const struct list_head *b)
>> +{
>> +       const struct ref_tracker *ta = list_entry(a, const struct ref_tracker, head);
>> +       const struct ref_tracker *tb = list_entry(b, const struct ref_tracker, head);
>> +
>> +       return ta->alloc_stack_handle - tb->alloc_stack_handle;
>> +}
>> +
>>   void __ref_tracker_dir_print(struct ref_tracker_dir *dir,
>>                             unsigned int display_limit)
>>   {
>> +       unsigned int i = 0, count = 0;
>>          struct ref_tracker *tracker;
>> -       unsigned int i = 0;
>> +       depot_stack_handle_t stack;
>>
>>          lockdep_assert_held(&dir->lock);
>>
>> +       if (list_empty(&dir->list))
>> +               return;
>> +
>> +       list_sort(NULL, &dir->list, ref_tracker_cmp);
> What is going to be the cost of sorting a list with 1,000,000 items in it ?

Do we really have such cases?


>
> I just want to make sure we do not trade printing at most ~10 references
> (from netdev_wait_allrefs()) to a soft lockup :/ with no useful info
> if something went terribly wrong.
>
> I suggest that you do not sort a potential big list, and instead
> attempt to allocate an array of @display_limits 'struct stack_counts'
>
> I suspect @display_limits will always be kept to a reasonable value
> (less than 100 ?)

I though rather about 16 :)
In theory everything is possible, but do we have real case examples 
which could lead to 100 stack traces?
Maybe some frameworks used by multiple consumers (drivers) ???

>
> struct stack_counts {
>      depot_stack_handle_t stack_handle;
>      unsigned int count;
> }
>
> Then, iterating the list and update the array (that you can keep
> sorted by ->stack_handle)
>
> Then after iterating, print the (at_most) @display_limits handles
> found in the temp array.

OK, could be faster and less invasive.
Other solution would be keeping the array in dir and update in every 
tracker alloc/free, this way we avoid iteration over potentially big 
list, but it would cost memory and since printing is rather rare I am 
not sure if it is worth.

I will try your proposition.

Regards
Andrzej

>
>> +
>>          list_for_each_entry(tracker, &dir->list, head) {
>> -               if (i < display_limit) {
>> -                       pr_err("leaked reference.\n");
>> -                       if (tracker->alloc_stack_handle)
>> -                               stack_depot_print(tracker->alloc_stack_handle);
>> -                       i++;
>> -               } else {
>> +               if (i++ >= display_limit)
>>                          break;
>> -               }
>> +               if (!count++)
>> +                       stack = tracker->alloc_stack_handle;
>> +               if (stack == tracker->alloc_stack_handle &&
>> +                   !list_is_last(&tracker->head, &dir->list))
>> +                       continue;
>> +
>> +               pr_err("leaked %d references.\n", count);
>> +               if (stack)
>> +                       stack_depot_print(stack);
>> +               count = 0;
>>          }
>>   }
>>   EXPORT_SYMBOL(__ref_tracker_dir_print);
>> --
>> 2.25.1
>>



More information about the Intel-gfx mailing list