[Bug 98420] New: Memory leak in drm_atomic.c eventually (few days) consuming all RAM (on at least one system configuration)
bugzilla-daemon at freedesktop.org
Mon Oct 24 22:14:17 UTC 2016
https://bugs.freedesktop.org/show_bug.cgi?id=98420
Bug ID: 98420
Summary: Memory leak in drm_atomic.c eventually (few days)
consuming all RAM (on at least one system
configuration)
Product: DRI
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: major
Priority: medium
Component: DRM/Intel
Assignee: intel-gfx-bugs at lists.freedesktop.org
Reporter: felix.monninger at gmail.com
QA Contact: intel-gfx-bugs at lists.freedesktop.org
CC: intel-gfx-bugs at lists.freedesktop.org
Created attachment 127524
--> https://bugs.freedesktop.org/attachment.cgi?id=127524&action=edit
Patch to drm_atomic.c calling the required drm_property_unreference_blob's
Problem:
I noticed that RAM is irretrievably lost over the course of a few days. The
value reported by "cat /proc/meminfo | grep SUnreclaim" regularly grew (after 2
or 3 days of uptime) to more than 2GB of unreclaimable slab memory, ending in an
out-of-memory failure.
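To watch this value over time without running grep by hand, something like the following could be used. It is only a minimal sketch: the function names parse_sunreclaim_kb and read_sunreclaim_kb are made up for this example, and it assumes the usual "SUnreclaim: <n> kB" layout of /proc/meminfo.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the SUnreclaim value in kB from
 * meminfo-style text, or -1 if the field is missing or malformed. */
long parse_sunreclaim_kb(const char *meminfo)
{
    const char *p = strstr(meminfo, "SUnreclaim:");
    long kb;

    if (!p)
        return -1;
    if (sscanf(p, "SUnreclaim: %ld", &kb) != 1)
        return -1;
    return kb;
}

/* Hypothetical helper: read /proc/meminfo and return SUnreclaim in kB,
 * or -1 if the file cannot be read or the field is absent. */
long read_sunreclaim_kb(void)
{
    char buf[8192];
    size_t n;
    FILE *f = fopen("/proc/meminfo", "r");

    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return parse_sunreclaim_kb(buf);
}
```

Calling read_sunreclaim_kb() periodically and logging the result should show a steadily growing number on an affected system.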
System:
Linux 4.7.4+; 4.8.* (problem has been in Kernel since at least Feb 16)
Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
i915
Investigation:
1. /proc/slabinfo showed that a lot of 4K memory blocks ("kmalloc-4096") have
been allocated (visible after modifying slab.c to include those of size>4096
into the output). Example line from /proc/slabinfo:
kmalloc-4096 11574 11578 4432 7 8 : tunables 0 0 0 : slabdata 1654 1654 0
2. /proc/slab_allocators showed (again after modifying slab.c):
kmalloc-4096: 1954 drm_property_create_blob.part.19+0x27/0xe0 [drm]
(the numbers are from different times, in any case growing unboundedly until
reboot)
3. Using ftrace revealed the following call stack being processed precisely
every 0.5 seconds:
drm_property_create_blob <- drm_atomic_helper_legacy_gamma_set <- drm_mode_gamma_set_ioctl <- drm_ioctl
4. The reference count of the "blob" allocated in
drm_atomic_helper.c:drm_atomic_helper_legacy_gamma_set is increased by two by
the line "ret = drm_atomic_crtc_set_property(crtc, crtc_state,
config->gamma_lut_property, blob->base.id);". (It should only be incremented
by 1, as ownership of this blob is passed to the crtc_state, which then keeps
the blob as the updated property value.)
5. As the blob is passed by id ("..., blob->base.id)"), in
drm_atomic.c:drm_atomic_replace_property_blob_from_id the function
drm_property_lookup_blob(dev, blob_id); is called. Note that the function
manual says "If successful, this takes an additional reference to the blob
property. callers need to make sure to eventually unreference the returned
property again, using @drm_property_unreference_blob.", which is not being done
in this case.
6. As a result, the old state->degamma_lut that is replaced by the updated
blob is never freed (even after its refcount is properly decremented by 1 at
the remaining places), since its refcount has been incremented one time too
many. Thus a 4K block is wasted every half second.
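As a rough plausibility check of the numbers above, the leak rate can be estimated as follows. This is a back-of-the-envelope sketch, assuming exactly one 4096-byte allocation is lost per call and one call every 0.5 seconds; slab overhead (the object size of 4432 in the slabinfo line) is ignored, and the function names are made up for this example.

```c
#include <stdio.h>

/* Assumption: one 4096-byte block leaks per ioctl call, and the call
 * happens twice per second, as the ftrace observation suggests. */
unsigned long long leaked_bytes(unsigned long long seconds)
{
    const unsigned long long calls_per_second = 2;
    const unsigned long long bytes_per_call = 4096;

    return seconds * calls_per_second * bytes_per_call;
}

/* Print the estimated leak for one to three days of uptime. */
void print_leak_estimate(void)
{
    const unsigned long long day = 24ULL * 60 * 60;
    unsigned int d;

    for (d = 1; d <= 3; d++)
        printf("%u day(s): %llu MiB\n", d, leaked_bytes(d * day) >> 20);
}
```

Under these assumptions the estimate is 675 MiB per day, or 2025 MiB after three days, which is the same order of magnitude as the >2GB observed after 2 or 3 days.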
Fix:
We call drm_property_unreference_blob(new_blob) at the appropriate spots in
drm_atomic.c:drm_atomic_replace_property_blob_from_id. Please see the attached
patch.
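The effect of the missing unreference can be illustrated with a small userspace model. This is not the attached patch and not kernel code: all names below (blob_create, blob_lookup, replace_blob_from_id, gamma_set, simulate) are hypothetical stand-ins that only mimic the reference counting described in the investigation, with a fixed pool instead of real allocations.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; not the kernel's drm_property_blob API. */
struct blob {
    int refcount;
};

#define POOL_SIZE 1024
static struct blob pool[POOL_SIZE];
static int next_blob;   /* next unused pool entry */
static int live_blobs;  /* blobs whose refcount has not reached zero */

struct blob *blob_create(void)
{
    struct blob *b;

    assert(next_blob < POOL_SIZE);
    b = &pool[next_blob++];
    b->refcount = 1;            /* reference held by the creator */
    live_blobs++;
    return b;
}

void blob_put(struct blob *b)
{
    if (!b)
        return;
    assert(b->refcount > 0);
    if (--b->refcount == 0)
        live_blobs--;           /* stands in for freeing the blob */
}

/* Models a lookup by id: returns the blob with an additional reference. */
struct blob *blob_lookup(struct blob *b)
{
    b->refcount++;
    return b;
}

/* Models replacing the blob held by a state: the slot takes its own
 * reference on the new blob and drops the one on the old blob. */
void replace_blob(struct blob **slot, struct blob *new_blob)
{
    if (*slot == new_blob)
        return;
    if (new_blob)
        new_blob->refcount++;
    blob_put(*slot);
    *slot = new_blob;
}

/* Models the replace-from-id path; drop_lookup_ref selects the fix. */
void replace_blob_from_id(struct blob **slot, struct blob *by_id,
                          int drop_lookup_ref)
{
    struct blob *new_blob = blob_lookup(by_id);  /* +1 from the lookup */

    replace_blob(slot, new_blob);                /* +1 held by the slot */
    if (drop_lookup_ref)
        blob_put(new_blob);                      /* -1: the proposed fix */
}

/* Models one legacy gamma-set call. */
void gamma_set(struct blob **slot, int drop_lookup_ref)
{
    struct blob *b = blob_create();

    replace_blob_from_id(slot, b, drop_lookup_ref);
    blob_put(b);                                 /* creator drops its ref */
}

/* Returns how many blobs are still alive after the given number of calls. */
int simulate(int calls, int drop_lookup_ref)
{
    struct blob *slot = NULL;
    int i;

    assert(calls <= POOL_SIZE);
    next_blob = 0;
    live_blobs = 0;
    for (i = 0; i < calls; i++)
        gamma_set(&slot, drop_lookup_ref);
    return live_blobs;
}
```

In this model, simulate(10, 0) leaves 10 blobs alive (one per call), while simulate(10, 1) leaves only the single blob that the state still holds.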
I wonder which hardware is affected by this "legacy" code (i.e.
drm_atomic_helper_legacy_gamma_set). Only older Intel HD graphics (<4000?)
devices, or is it actually more widely used despite the legacy naming?