[Mesa-dev] [PATCH] gallium/hud: Sensor extension is segfaulting.

Nicolai Hähnle nhaehnle at gmail.com
Mon Oct 17 08:04:36 UTC 2016


On 13.10.2016 19:29, Steven Toth wrote:
> Round two of the patchset, incorporating feedback
> from nhaehnle at gmail.com
>
> The fixes in this set address bugfix #68169, HUD crashing
> when testing with unigine (heaven).
>
> The bug also manifested itself as a lack of data in
> HUD charts when multiple instanced were created and
> destroyed in a specific order, a use case not witnessed
> with glxgears.
>
> These patches:
> 1. mutex protect the internal object lists.
> 2. reference count object access for destruction purposes.
>
> Patchset tested with glxgears/demo and unigine.
>
> Signed-off-by: Steven Toth <stoth at kernellabs.com>
> ---
>  src/gallium/auxiliary/hud/hud_cpufreq.c      | 33 ++++++++++++++++++----
>  src/gallium/auxiliary/hud/hud_diskstat.c     | 39 ++++++++++++++++++++++----
>  src/gallium/auxiliary/hud/hud_nic.c          | 37 +++++++++++++++++++++----
>  src/gallium/auxiliary/hud/hud_sensors_temp.c | 41 ++++++++++++++++++++++------
>  4 files changed, 125 insertions(+), 25 deletions(-)
>
> diff --git a/src/gallium/auxiliary/hud/hud_cpufreq.c b/src/gallium/auxiliary/hud/hud_cpufreq.c
> index 4501bbb..e193568 100644
> --- a/src/gallium/auxiliary/hud/hud_cpufreq.c
> +++ b/src/gallium/auxiliary/hud/hud_cpufreq.c
> @@ -37,6 +37,8 @@
>  #include "util/list.h"
>  #include "os/os_time.h"
>  #include "util/u_memory.h"
> +#include "util/u_inlines.h"
> +#include "os/os_thread.h"
>  #include <stdio.h>
>  #include <unistd.h>
>  #include <dirent.h>
> @@ -57,21 +59,28 @@ struct cpufreq_info
>     char sysfs_filename[128];
>     uint64_t KHz;
>     uint64_t last_time;
> +   struct pipe_reference refcnt;
>  };
>
>  static int gcpufreq_count = 0;
>  static struct list_head gcpufreq_list;
> +static pipe_mutex gcpufreq_mutex;
>
>  static struct cpufreq_info *
>  find_cfi_by_index(int cpu_index, int mode)
>  {
> +   struct cpufreq_info *ret = 0;
> +   pipe_mutex_lock(gcpufreq_mutex);
>     list_for_each_entry(struct cpufreq_info, cfi, &gcpufreq_list, list) {
>        if (cfi->mode != mode)
>           continue;
> -      if (cfi->cpu_index == cpu_index)
> -         return cfi;
> +      if (cfi->cpu_index == cpu_index) {
> +         ret = cfi;
> +         break;
> +      }
>     }
> -   return 0;
> +   pipe_mutex_unlock(gcpufreq_mutex);
> +   return ret;
>  }
>
>  static int
> @@ -116,8 +125,14 @@ static void
>  free_query_data(void *p)
>  {
>     struct cpufreq_info *cfi = (struct cpufreq_info *)p;
> -   list_del(&cfi->list);
> -   FREE(cfi);
> +   /* atomic dec */
> +   if (pipe_reference(&cfi->refcnt, NULL)) {
> +      pipe_mutex_lock(gcpufreq_mutex);
> +      list_del(&cfi->list);
> +      gcpufreq_count--;
> +      pipe_mutex_unlock(gcpufreq_mutex);
> +      FREE(cfi);
> +   }
>  }
>
>  /**
> @@ -159,6 +174,7 @@ hud_cpufreq_graph_install(struct hud_pane *pane, int cpu_index,
>        return;
>     }
>
> +   pipe_reference(NULL, &cfi->refcnt); /* atomic inc */

There is a race condition here where another thread frees cfi between 
the time that you took it from the linked list and this reference count 
increase.

Anyway, I'm sorry for not having looked more closely before, but the 
whole logic of the cpufreq_info life cycle feels a bit off to me right now.

It looks like you're trying to do a lazy one-time initialization of all 
cpufreq_info objects that are ever required in hud_get_num_cpufreq. 
That's fine as a pattern, but then you simply shouldn't ever free those 
objects in free_query_data.

You'd keep them around until program exit; they will appear in a leak 
check as "still reachable" (and that only if the HUD was used), which is 
fine in my book.

Then the only thing you need to mutex-protect is the initialization, to 
ensure that it happens exactly once and that nobody tries to 
find_cfi_index until the list is initialized.


>     gr->query_data = cfi;
>     gr->query_new_value = query_cfi_load;
>
> @@ -180,8 +196,12 @@ add_object(const char *name, const char *fn, int objmode, int cpu_index)
>     strcpy(cfi->sysfs_filename, fn);
>     cfi->mode = objmode;
>     cfi->cpu_index = cpu_index;
> +   pipe_reference_init(&cfi->refcnt, 1);
> +
> +   pipe_mutex_lock(gcpufreq_mutex);
>     list_addtail(&cfi->list, &gcpufreq_list);
>     gcpufreq_count++;
> +   pipe_mutex_unlock(gcpufreq_mutex);
>  }
>
>  /**
> @@ -205,6 +225,7 @@ hud_get_num_cpufreq(bool displayhelp)
>     /* Scan /sys/devices.../cpu, for every object type we support, create
>      * and persist an object to represent its different metrics.
>      */
> +   pipe_mutex_init(gcpufreq_mutex);
>     list_inithead(&gcpufreq_list);
>     DIR *dir = opendir("/sys/devices/system/cpu");

There's a closedir missing.

I'm going to skip the other files because I believe they follow a 
similar pattern. Just one more thing: as it stands, the query_*_load 
functions will also go wonky when multiple contexts are actually used to 
show a HUD at the same time. I don't think it hurts the stability of the 
program, but it might lead to confusing output when the values of e.g. 
dsi->last_stat and dsi->last_time will interfere between multiple HUD 
contexts.

Nicolai


More information about the mesa-dev mailing list