[Bug 98420] Memory leak in drm_atomic.c eventually (few days) consuming all RAM (on at least one system configuration)

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Tue Oct 25 17:06:01 UTC 2016


https://bugs.freedesktop.org/show_bug.cgi?id=98420

Ville Syrjala <ville.syrjala at linux.intel.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #1 from Ville Syrjala <ville.syrjala at linux.intel.com> ---
(In reply to Felix Monninger from comment #0)
> Created attachment 127524
> Patch to drm_atomic.c calling the required drm_property_unreference_blob's
> 
> Problem:
> I noticed that RAM irretrievably gets lost over the course of a few days.
> "cat /proc/meminfo |grep SUnreclaim" frequently (after 2 or 3 days of
> uptime) grew to high amounts (>2GB) of unreclaimable SLABs until
> out-of-memory failure.
> 
> System:
> Linux 4.7.4+; 4.8.* (problem has been in Kernel since at least Feb 16)
> Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
> i915
> 
> Investigation:
> 1. /proc/slabinfo showed that a lot of 4K memory blocks ("kmalloc-4096") had
> been allocated (visible after modifying slab.c to include those of size>4096
> in the output). Example line from /proc/slabinfo:
> kmalloc-4096       11574  11578   4432    7    8 : tunables    0    0    0 : slabdata   1654   1654      0
> 2. /proc/slab_allocators showed (again after modifying slab.c):
> kmalloc-4096: 1954 drm_property_create_blob.part.19+0x27/0xe0 [drm]
> (the numbers are from different times, in any case growing unboundedly until
> reboot)
> 3. ftrace revealed the following call stack being executed precisely
> every 0.5 seconds:
> drm_property_create_blob <- drm_atomic_helper_legacy_gamma_set <- drm_mode_gamma_set_ioctl <- drm_ioctl
> 4. The reference count on the "blob" allocated in
> drm_atomic_helper.c:drm_atomic_helper_legacy_gamma_set has increased by two
> after the line "ret = drm_atomic_crtc_set_property(crtc,
> crtc_state, config->gamma_lut_property, blob->base.id);". (It should only
> be incremented by 1, as ownership of this blob is passed to the crtc_state,
> which then keeps the blob as an updated property value.)
> 5. As the blob is passed by id ("..., blob->base.id)"), in
> drm_atomic.c:drm_atomic_replace_property_blob_from_id the function
> drm_property_lookup_blob(dev, blob_id); is called. Note that the function
> manual says "If successful, this takes an additional reference to the blob
> property. callers need to make sure to eventually unreference the returned
> property again, using @drm_property_unreference_blob.", which is not being
> done in this case.
> 6. As a result, the old state->degamma_lut that is replaced by the updated
> blob is never freed (even after its refcount is properly decremented
> by 1 at the remaining places), since its refcount has been incremented once
> too often. Thus every half second a 4K block is wasted.
> 
> Fix:
> We call drm_property_unreference_blob(new_blob) at the appropriate spots in
> drm_atomic.c:drm_atomic_replace_property_blob_from_id. Please see the
> attached patch.

Looks like a solid fix. Can you redo the patch with your full name and a
Signed-off-by line as per Documentation/SubmittingPatches? Then either send it
directly to dri-devel at lists.freedesktop.org with git send-email, or attach
it here and someone can forward it along.

> 
> I wonder, which hardware is affected by this "legacy" code (i. e.
> drm_atomic_helper_legacy_gamma_set)? Only older intel HD graphics (<4000?)
> devices, or is it actually more widely used despite the legacy naming?

It's "legacy" from the point of view of our userspace API, which actually
means it's used pretty much everywhere, since we have no real non-legacy
userspace. And drm_atomic_replace_property_blob_from_id() would get used for
the non-legacy path as well, so it wouldn't really change anything as far as
this leak is concerned.
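
For illustration, the refcount imbalance described in steps 4 to 6 of the
report can be modelled in plain userspace C. This is only a rough sketch under
stated assumptions, not the kernel code: struct blob, blob_create, blob_get,
blob_put, blob_lookup, replace_blob, replace_blob_from_id and simulate_updates
are all hypothetical stand-ins, and the drop_lookup_ref flag exists only to
compare the behaviour with and without the extra unreference the report
proposes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted property blob (not the DRM struct). */
struct blob {
    int refcount;
};

static int live_blobs; /* blobs allocated and not yet freed */

/* Create a blob; the creator holds the first reference. */
static struct blob *blob_create(void)
{
    struct blob *b = malloc(sizeof(*b));
    if (!b)
        abort();
    b->refcount = 1;
    live_blobs++;
    return b;
}

/* Take an additional reference. */
static struct blob *blob_get(struct blob *b)
{
    if (b)
        b->refcount++;
    return b;
}

/* Drop one reference; free the blob when the last one goes away. */
static void blob_put(struct blob *b)
{
    if (!b)
        return;
    assert(b->refcount > 0);
    if (--b->refcount == 0) {
        free(b);
        live_blobs--;
    }
}

/* Models a lookup by id: on success the caller owns one extra reference. */
static struct blob *blob_lookup(struct blob *b)
{
    return blob_get(b);
}

/* The state slot takes its own reference and releases the old blob. */
static void replace_blob(struct blob **slot, struct blob *new_blob)
{
    blob_get(new_blob);
    blob_put(*slot);
    *slot = new_blob;
}

/* Lookup plus replace; optionally drop the reference the lookup took. */
static void replace_blob_from_id(struct blob **slot, struct blob *b,
                                 int drop_lookup_ref)
{
    struct blob *new_blob = blob_lookup(b); /* +1 for the lookup */

    replace_blob(slot, new_blob);           /* +1 for the slot */
    if (drop_lookup_ref)
        blob_put(new_blob);                 /* the proposed extra unreference */
}

/* Run n updates, tear down the state, and return the blobs still allocated. */
static int simulate_updates(int n, int drop_lookup_ref)
{
    struct blob *slot = NULL;

    live_blobs = 0;
    for (int i = 0; i < n; i++) {
        struct blob *b = blob_create();     /* creator's reference */

        replace_blob_from_id(&slot, b, drop_lookup_ref);
        blob_put(b);                        /* creator drops its reference */
    }
    blob_put(slot);                         /* state goes away */
    return live_blobs;
}
```

In this model, simulate_updates(5, 0) leaves five blobs allocated (one per
update, each stuck at refcount 1), while simulate_updates(5, 1) leaves none.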
