<html>
    <head>
      <base href="https://bugs.freedesktop.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Memory leak in drm_atomic.c eventually (few days) consuming all RAM (on at least one system configuration)"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=98420">98420</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Memory leak in drm_atomic.c eventually (few days) consuming all RAM (on at least one system configuration)
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>DRI
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>x86-64 (AMD64)
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux (All)
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>major
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>medium
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>DRM/Intel
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>intel-gfx-bugs@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>felix.monninger@gmail.com
          </td>
        </tr>

        <tr>
          <th>QA Contact</th>
          <td>intel-gfx-bugs@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>intel-gfx-bugs@lists.freedesktop.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Created <span class=""><a href="attachment.cgi?id=127524" name="attach_127524" title="Patch to drm_atomic.c calling the required drm_property_unreference_blob's">attachment 127524</a> <a href="attachment.cgi?id=127524&action=edit" title="Patch to drm_atomic.c calling the required drm_property_unreference_blob's">[details]</a></span> <a href='page.cgi?id=splinter.html&bug=98420&attachment=127524'>[review]</a>
Patch to drm_atomic.c calling the required drm_property_unreference_blob's

Problem:
RAM is irretrievably lost over the course of a few days. The value reported by
"cat /proc/meminfo | grep SUnreclaim" repeatedly grew (after 2 or 3 days of
uptime) to more than 2 GB of unreclaimable slab memory, until the system ran
out of memory.

System:
Linux 4.7.4+; 4.8.* (the problem has been in the kernel since at least Feb 16)
Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
i915

Investigation:
1. /proc/slabinfo showed that a lot of 4K memory blocks ("kmalloc-4096") had
been allocated (visible after modifying slab.c to include those of size>4096 in
the output). Example line from /proc/slabinfo:
kmalloc-4096       11574  11578   4432    7    8 : tunables    0    0    0 : slabdata   1654   1654      0
2. /proc/slab_allocators showed (again after modifying slab.c):
kmalloc-4096: 1954 drm_property_create_blob.part.19+0x27/0xe0 [drm]
(the numbers are from different times, in any case growing unboundedly until
reboot)
3. ftrace revealed the following call stack being executed precisely every
0.5 seconds:
drm_property_create_blob <- drm_atomic_helper_legacy_gamma_set
<- drm_mode_gamma_set_ioctl <- drm_ioctl
4. The reference count of the "blob" allocated in
drm_atomic_helper.c:drm_atomic_helper_legacy_gamma_set is increased by two
after the line "ret = drm_atomic_crtc_set_property(crtc, crtc_state,
config->gamma_lut_property, blob->base.id);". (It should only be incremented
by 1, as ownership of this blob is passed to the crtc_state, which then keeps
the blob as an updated property value.)
5. Because the blob is passed by id ("..., blob->base.id)"),
drm_atomic.c:drm_atomic_replace_property_blob_from_id calls
drm_property_lookup_blob(dev, blob_id). Note that the documentation of that
function says "If successful, this takes an additional reference to the blob
property. callers need to make sure to eventually unreference the returned
property again, using @drm_property_unreference_blob.", which is not being done
in this case.
6. As a result, the old state->gamma_lut that is replaced by the updated blob
is never freed (even after its refcount is properly decremented by 1 at the
remaining places), since its refcount has been incremented once too often.
Thus every half second a 4K block is leaked.
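
A rough plausibility check of this rate against the observed growth, as a
sketch only: it assumes exactly one 4096-byte block is lost per call, one call
every 0.5 seconds, and it ignores any slab or debug overhead. The function name
leaked_bytes is made up for this illustration.

```c
/*
 * Illustrative estimate only (hypothetical helper, not kernel code).
 * Bytes lost after `seconds` of uptime, assuming one block of
 * `block_bytes` leaks every `interval_ms` milliseconds.
 */
unsigned long long leaked_bytes(unsigned long long seconds,
                                unsigned long long interval_ms,
                                unsigned long long block_bytes)
{
    /* Number of leaking calls made during the given uptime. */
    unsigned long long calls = seconds * 1000ULL / interval_ms;

    return calls * block_bytes;
}
```

Under these assumptions, leaked_bytes(86400, 500, 4096) is 707788800 bytes,
i.e. 675 MiB per day, so roughly 2 GiB after three days. That is in line with
the SUnreclaim growth described under "Problem".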

Fix:
The fix is to call drm_property_unreference_blob(new_blob) at the appropriate
spots in drm_atomic.c:drm_atomic_replace_property_blob_from_id. Please see the
attached patch.
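
To illustrate the reference counting described in steps 4 to 6, here is a
small userspace model. It is a sketch, not the kernel code and not the attached
patch: struct blob, struct crtc_state and all function names below are
simplified stand-ins invented for this example, and the flag drop_lookup_ref
only exists to compare the behaviour with and without the extra unreference.

```c
/* Userspace model only; none of these are the kernel's definitions. */
struct blob {
    int refcount;               /* 0 means the blob would be freed */
};

struct crtc_state {
    struct blob *gamma_lut;     /* reference held by the state */
};

/* Models drm_property_create_blob(): the creator holds one reference. */
static void blob_create(struct blob *b) { b->refcount = 1; }

/* Models taking a reference (also what a lookup by id does). */
static struct blob *blob_get(struct blob *b) { b->refcount++; return b; }

/* Models drm_property_unreference_blob(); a null pointer is ignored. */
static void blob_put(struct blob *b) { if (b) b->refcount--; }

/* The state drops its old blob and takes its own reference on the new one. */
static void replace_blob(struct blob **slot, struct blob *new_blob)
{
    if (*slot == new_blob)
        return;
    blob_put(*slot);
    if (new_blob)
        blob_get(new_blob);
    *slot = new_blob;
}

/* Models the replace-from-id path: the lookup takes an extra reference. */
static void replace_from_id(struct crtc_state *state, struct blob *target,
                            int drop_lookup_ref)
{
    struct blob *new_blob = blob_get(target);   /* lookup reference */

    replace_blob(&state->gamma_lut, new_blob);
    if (drop_lookup_ref)
        blob_put(new_blob);     /* the unreference that is missing */
}

/* One legacy gamma call: create a blob, hand it to the state, drop our ref. */
void legacy_gamma_set(struct crtc_state *state, struct blob *fresh,
                      int drop_lookup_ref)
{
    blob_create(fresh);
    replace_from_id(state, fresh, drop_lookup_ref);
    blob_put(fresh);            /* creator drops its reference */
}
```

Calling legacy_gamma_set twice on the same state with drop_lookup_ref set to 0
leaves the first blob with a refcount of 1 after it has been replaced, so in
this model it would never be freed. With drop_lookup_ref set to 1 the first
blob reaches 0 once it is replaced, and only the blob currently held by the
state keeps a refcount of 1.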

I wonder which hardware is affected by this "legacy" code (i.e.
drm_atomic_helper_legacy_gamma_set)? Only older Intel HD Graphics (<4000?)
devices, or is it actually more widely used despite the legacy naming?</pre>
        </div>
      </p>


    </body>
</html>