[Libva] Huge memory leak in SandyBridge while decoding h264 (big buck bunny)

Josep Torra n770galaxy at gmail.com
Mon Jul 2 05:58:04 PDT 2012


Dear libva,

For a long time I have been aware of a certain ambiguity in libva
regarding how VA buffers have to be handled.

According to documentation:

/*
 * After this call, the buffer is deleted and this buffer_id is no longer valid
 * Only call this if the buffer is not going to be passed to vaRenderBuffer
 */
VAStatus vaDestroyBuffer (
    VADisplay dpy,
    VABufferID buffer_id
);

This makes sense, because there is no way for the VA-API client to know
when a buffer is no longer in use once it has been handed to
vaRenderPicture for processing. To clarify, I understand the API
behaviour as follows: vaRenderPicture takes ownership of the buffer (in
refcounting terms) and the driver is responsible for releasing it when
it is no longer needed.

While the design is correct in my understanding, the unfortunate thing
is that the IEGD/EMGD drivers were not implementing this behaviour
properly, and huge memory leaks were exhibited when those drivers were
used. I also want to remark that the PSB driver did it properly and
was not leaking the last time I checked.

I think that the i965 driver used to implement the correct behaviour
prior to 1.0.15, or maybe I just did not notice the leaks when I tried.

In decoder implementations like mplayer and gstreamer-vaapi [1] an
explicit call to vaDestroyBuffer was introduced. This certainly works
around the problem on the bad drivers, but it also hides bugs in the
new drivers (i965) and allows the wrong behaviour to become the rule
instead of the exception (a sketch of what the workaround amounts to
follows the link below). If those are the reference decoder
implementations used to validate the driver, then they should use the
API properly, and if they work around an issue with a specific driver,
the workaround should be conditional on that driver.

[1] http://gitorious.org/vaapi/gstreamer-vaapi/blobs/master/gst-libs/gst/vaapi/gstvaapidecoder_objects.c#line263
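As far as I can tell, the workaround amounts to roughly the following
(a simplified sketch with my own names, not the actual gstreamer-vaapi
code):

    #include <va/va.h>

    /* Workaround flow: destroy the buffer explicitly right after
     * vaRenderPicture, regardless of which driver is in use. */
    static VAStatus submit_slice_workaround(VADisplay dpy, VAContextID ctx,
                                            void *slice_data,
                                            unsigned int slice_size)
    {
        VABufferID buf;
        VAStatus status;

        status = vaCreateBuffer(dpy, ctx, VASliceDataBufferType,
                                slice_size, 1, slice_data, &buf);
        if (status != VA_STATUS_SUCCESS)
            return status;

        status = vaRenderPicture(dpy, ctx, &buf, 1);

        /* Contradicts the documented contract: it stops the buggy
         * drivers from leaking, but on a correct driver it hides the
         * driver bug instead of exposing it. */
        vaDestroyBuffer(dpy, buf);
        return status;
    }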

We at Fluendo added that kind of workaround conditionally, only for the
IEGD/EMGD drivers, as we try to respect the API as it is documented
(see the sketch below). Recently we received reports of huge memory
leaks while decoding Big Buck Bunny on a SNB platform (triggering the
OOM killer).
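Conceptually, our conditional check is something like the following
(the exact vendor substrings here are an assumption for illustration);
vaDestroyBuffer is then only called after vaRenderPicture when this
returns non-zero:

    #include <string.h>
    #include <va/va.h>

    /* Apply the explicit-destroy workaround only on drivers known to
     * leak the buffers handed to vaRenderPicture. */
    static int driver_needs_destroy_workaround(VADisplay dpy)
    {
        const char *vendor = vaQueryVendorString(dpy);

        return vendor && (strstr(vendor, "IEGD") || strstr(vendor, "EMGD"));
    }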

The customer had been able to reproduce the leaks at different rates
depending on the system he used:

1) Ubuntu 12.04 libva/intel-vaapi 1.0.15
2) Debian testing with 1.0.16
3) 1.0.17 and 1.0.18 (leaks at a lower rate than 1 and 2)

I've installed a stock Ubuntu 12.04 (32-bit) on a SNB box, and I can
indeed reproduce the memory leak just by running:

gst-launch-0.10 filesrc location=big_buck_bunny_1080p_h264.mov
num-buffers=4000 ! qtdemux ! fluvadec ! fakesink silent=true

From Valgrind I'm getting the following (the big one) and some other
VA-related leaks:

==16071== 1,197,824 bytes in 9,358 blocks are possibly lost in loss
record 1,644 of 1,644
==16071==    at 0x402A5E6: calloc (in
/usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==16071==    by 0x518F8F1: ??? (in
/usr/lib/i386-linux-gnu/libdrm_intel.so.1.0.0)
==16071==    by 0x518AF13: drm_intel_bo_alloc (in
/usr/lib/i386-linux-gnu/libdrm_intel.so.1.0.0)
==16071==    by 0x51008F4: i965_create_buffer_internal.isra.10
(i965_drv_video.c:988)
==16071==    by 0x50AB417: vaCreateBuffer (in
/usr/lib/i386-linux-gnu/libva.so.1.3200.0)
==16071==    by 0x4E645C7: va32CreateBuffer (va32.c:387)
==16071==    by 0x4E5D11F: vaCreateBuffer (gstva_backend.c:234)
==16071==    by 0x4E6859D: fluvaapi_h264_add_slice (fluvaapi_decoder_h264.c:257)

Please clarify the API and its documentation, or fix the reference
decoders, in order to resolve this ambiguous situation.

Certainly, if the current API is correct, then there must be a bug in
the i965 driver (a reference leak?) which stays hidden when mplayer and
gstreamer-vaapi are used for testing. Removing the explicit
vaDestroyBuffer call pointed out above should reveal it, and you should
then be able to fix it.

Thanks in advance,

Josep Torra
FLUENDO, S.A.
