valgrind to track down memory issues (Re: Huge memory leak sometime after starting a pipeline)

Thu Nov 6 03:17:53 PST 2014

On Thu, 2014-11-06 at 11:33 +0100, Sergei Vorobyov wrote:

Sergei,

> Sorry to say but the universal (on this list) recipe to use valgrind
> is useless, like a hammer to drive screws.

It seems to me you have strong opinions about valgrind for some reason.
It's just a tool, amongst many other tools. Sometimes it's suitable,
sometimes it's not. Sometimes it can be used in a specific environment
or situation, sometimes not.

However, may I suggest that if experienced developers who have been
working with GStreamer for a very long time keep recommending valgrind
as the first thing to try for a particular situation, perhaps there is
actually a reason for that, and if you find it useless in general
perhaps that is partly because you have not figured out how to use it
effectively yet?

> Point is, to see "leaks", you need to stop your application, in which
> case GStreamer produces thousands (miles) of messages of the kind:

As mentioned previously to you, valgrind is a suite of tools. The
memcheck leak checker will only output a list of leaks at the end, that
is true. There is also the 'massif' tool which allows you to track
memory allocations as they build up over time, for example.
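
As a rough sketch of what a massif run could look like: the function
name run_massif, the output file name massif.out and ./my-app are made
up for illustration, while --tool=massif, --massif-out-file and
ms_print are standard parts of valgrind.

```shell
# Hypothetical wrapper: record heap usage over time with massif.
# Usage: run_massif ./my-app [args...]
run_massif() {
    # G_SLICE=always-malloc makes GLib's slice allocator fall back to
    # plain malloc, so allocations show up at their real call sites.
    G_SLICE=always-malloc valgrind --tool=massif \
        --massif-out-file=massif.out "$@"
}

# Afterwards, render the recorded snapshots as text:
#   ms_print massif.out | less
```

If memory keeps growing while the pipeline runs, the later snapshots in
the ms_print output should show which call sites account for the
growth.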

> ==8263== 2,360 bytes in 59 blocks are possibly lost in loss record
> 5,656 of 5,790
> ==8263==    at 0x4C2ABA0: malloc (vg_replace_malloc.c:296)
> ==8263==    by 0x56127C9: g_malloc
> (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.4200.0)
> ==8263==    by 0x562970F: g_slice_alloc
> (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.4200.0)
> ==8263==    by 0x537C7C9: ???
> (in /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.4200.0)
> ==8263==    by 0x537C885: ???
> (in /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.4200.0)
> ==8263==    by 0x539CE7F: ???
> (in /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.4200.0)
> ==8263==    by 0x539EDCD: ???
> (in /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.4200.0)
> ==8263==    by 0x53A352F: g_type_add_interface_static
> (in /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0.4200.0)
> ==8263==    by 0x170E7083: gst_ffmpegmux_register
> (in /usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstlibav.so)
> ==8263==    by 0x170C8187: plugin_init
> (in /usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstlibav.so)
> ==8263==    by 0x50D2686: gst_plugin_register_func
> (in /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0.403.0)
> ==8263==    by 0x50D454C: gst_plugin_load_file
> (in /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0.403.0)
> ==8263== 
> ==8263== 5,168 (512 direct, 4,656 indirect) bytes in 1 blocks are
> definitely lost in loss record 5,719 of 5,790
> 
> even though you make a clean exit with _unrefs of all kinds and
> gst_deinit (), and the application does not leak at all (if you don't
> stop it and run indefinitely).
> 
> 
> Sure (by the end of the day) you can suppress any kind of messages you
> want to ignore, but this hardly approaches you to the solution of the
> problem. Should you take the above message seriously or ignore? Can
> you ask GStreamer developers "take seriously or ignore?" about each of
> 100.000 such messages?

The GObject/GType system and GLib (and also GStreamer) do a number of
one-time allocations, yes. There are suppression files to help filter
these out. The GStreamer unit tests can be run under valgrind (make
check-valgrind) and will error out whenever there's any valgrind output
at all, so these suppression files work quite well, although they're
rarely needed really (see below).
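
For illustration, passing a suppression file to memcheck might look
roughly like this. The function name run_with_supp and the paths in the
usage line are placeholders; where the .supp file lives depends on your
setup. --suppressions and --gen-suppressions are standard valgrind
options.

```shell
# Hypothetical wrapper: run memcheck with a suppression file.
# Usage: run_with_supp path/to/gst.supp ./my-app [args...]
run_with_supp() {
    supp_file=$1
    shift
    valgrind --leak-check=full --suppressions="$supp_file" "$@"
}

# To have valgrind print ready-made suppression entries for whatever
# it still reports, add --gen-suppressions=all and copy the entries
# you have reviewed into your own .supp file.
```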

In general valgrind also works best if you have debugging symbols
installed (you are missing debugging symbols for glib here).

Then you should use

  G_SLICE=always-malloc valgrind --leak-check=yes ...

Perhaps also use

  --show-reachable=no --show-possibly-lost=no

for starters, which will cut down the noise quite a bit. Most real leaks
tend to be either 'definitely lost', or they are still reachable and
build up over time somewhere, in which case massif might also come in
handy.
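
Put together, the options above could be wrapped up like this. This is
only a sketch: run_memcheck, valgrind.log and ./my-app are made-up
names, and --log-file is an extra (standard) valgrind option added here
so the report ends up in a file instead of scrolling past on the
terminal.

```shell
# Hypothetical wrapper combining the options mentioned above.
# Usage: run_memcheck ./my-app [args...]
run_memcheck() {
    G_SLICE=always-malloc valgrind --leak-check=yes \
        --show-reachable=no --show-possibly-lost=no \
        --log-file=valgrind.log "$@"
}
```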

It takes some experience to figure out what's relevant and what's not.
But the list of leaks is also sorted by severity/amount, so if you just
start at the end, that's usually where the interesting bits are.
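
One possible way to look only at the interesting end of the list,
assuming the memcheck output was saved to a file with valgrind's
--log-file option (the file name valgrind.log and the helper name
definitely_lost below are made up):

```shell
# Hypothetical helper: print only the 'definitely lost' records
# (header line plus stack trace) from a saved memcheck log.
# Records are separated by a line holding nothing but the ==PID== prefix.
definitely_lost() {
    awk '
        /definitely lost in loss record/ { printing = 1 }
        /^==[0-9]+== *$/ { if (printing) print ""; printing = 0 }
        printing { print }
    ' "$1"
}

# The biggest records come last, so look at the tail first:
#   definitely_lost valgrind.log | tail -n 60
```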

> I observed (without valgrind) a curious thing: the same application
> leaks as hell on Intel's HD4400 IGP (NUC) with Intel's driver, but
> does not leak at all on AMD Radeon, nor on NVidia. As I said, in all
> three cases valgrind uselessly reports miles of possibly and
> definitely lost blocks, which makes all three different cases
> completely indistinguishable.

There are of course types of leaks that valgrind cannot track down (or
won't track by default). No one is suggesting valgrind is the panacea
for all debugging purposes or memory leaks, but it's generally really
useful once you have figured out how to use it effectively (just like
any other debugging tool).

The added difficulty in the case of GStreamer is that memory is often
allocated in one place and then ownership is transferred through the
pipeline, and memory tools can't track this ownership transfer. But
figuring out what exactly is being leaked in the first place is usually
the hardest part.

 Cheers
  -Tim

-- 
Tim Müller, Centricular Ltd - http://www.centricular.com