[Intel-gfx] Enhance our tests with graphics memory issue detection

Shuang He shuang.he at intel.com
Tue Mar 31 04:12:29 CEST 2009


A lot of memory-leak-related issues have shown up recently, so we'd 
like to enhance our tests to fill this gap in our validation. Here are 
some thoughts that came out of a few days' investigation.

In all, graphics-related memory usage issues fall into the following 
categories:
    1. usual user space memory
        a. invalid access
        b. memory leak
    2. graphics memory
        a. invalid graphics memory access
        b. graphics memory leak
        c. over-all graphics memory manager policy

_Analysis:_
    1.[ab]:  ordinary malloc/mmap memory.
    2.a:      valgrind misreports many issues of this kind, since 
graphics memory is mapped through an ioctl, which valgrind can't detect.
    2.b:      valgrind has no way to know whether a graphics memory object 
has been created or destroyed. GEM manages graphics memory objects by 
object handle and reference count.
    2.c:      This may affect the user's experience. For example, if the 
graphics memory manager takes too much system memory, it may hurt system 
performance.

_Solutions:_
    1.[ab]:  covered by valgrind by default.
    2.a:      we can patch libdrm to notify valgrind about the graphics 
memory map/unmap.
    2.b:      we can patch libdrm to notify valgrind about the graphics 
memory creation/destruction.
    2.c:      There's no good way to track whether graphics takes too much 
memory. This still needs to be checked manually: doing a bunch of 
operations and then checking memory usage is feasible.
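
To make 2.a and 2.b more concrete, here is a minimal sketch of the kind of 
hooks meant, using valgrind's client-request macros from memcheck.h. It is 
not the attached patch. The demo_bo type and the demo_bo_* functions are 
hypothetical stand-ins, an anonymous mmap stands in for the GEM ioctls, and 
the HAVE_VALGRIND build flag is an assumption for illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>

/* Assumed build flag: hooks are compiled in only with -DHAVE_VALGRIND=1. */
#if defined(HAVE_VALGRIND) && HAVE_VALGRIND
#include <valgrind/valgrind.h>
#include <valgrind/memcheck.h>
#define VG(x) x
#else
#define VG(x) ((void)0)     /* hooks compile away in normal builds */
#endif

/* Hypothetical stand-in for a GEM buffer object; not the real libdrm type. */
struct demo_bo {
    void  *virt;    /* CPU mapping of the object's pages */
    size_t size;
    int    mapped;  /* non-zero while the caller may touch virt */
};

/* Create: an anonymous mmap stands in for the GEM create/mmap ioctls. */
struct demo_bo *demo_bo_create(size_t size)
{
    struct demo_bo *bo = calloc(1, sizeof(*bo));
    if (bo == NULL)
        return NULL;
    bo->virt = mmap(NULL, size, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (bo->virt == MAP_FAILED) {
        free(bo);
        return NULL;
    }
    bo->size = size;
    /* 2.b: register the object as a heap-like block so that a missing
     * destroy can show up in the leak report. */
    VG(VALGRIND_MALLOCLIKE_BLOCK(bo->virt, size, 0, 1));
    /* 2.a: until the object is mapped, any access is invalid. */
    VG(VALGRIND_MAKE_MEM_NOACCESS(bo->virt, size));
    return bo;
}

/* Map: tell valgrind the range is now addressable and defined. */
void *demo_bo_map(struct demo_bo *bo)
{
    assert(bo != NULL);
    bo->mapped = 1;
    VG(VALGRIND_MAKE_MEM_DEFINED(bo->virt, bo->size));
    return bo->virt;
}

/* Unmap: accesses after this point should be flagged as invalid. */
void demo_bo_unmap(struct demo_bo *bo)
{
    assert(bo != NULL);
    bo->mapped = 0;
    VG(VALGRIND_MAKE_MEM_NOACCESS(bo->virt, bo->size));
}

/* Destroy: unregister the block, then release the stand-in mapping. */
void demo_bo_destroy(struct demo_bo *bo)
{
    assert(bo != NULL);
    VG(VALGRIND_FREELIKE_BLOCK(bo->virt, 0));
    munmap(bo->virt, bo->size);
    free(bo);
}
```

Built with the flag and run under valgrind with leak checking enabled, a 
missing demo_bo_destroy() should appear in the leak report, and a write 
after demo_bo_unmap() should be reported as an invalid access. Without the 
flag the hooks cost nothing, which matters given the performance concern.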

_More considerations:_
    In order to catch these memory leaks, we need to run the X server or 
a 2D/3D application under valgrind while it performs various operations, 
since a memory leak may not show up by itself; it only happens on some 
code paths. In other words, we may not be able to catch all of them, but 
we can try to make sure the usual operations don't have this kind of 
issue. And since X is a client/server model:
        1. We need to detect whether client applications leak graphics 
memory.
        2. We need to detect whether the X server leaks memory: this 
targets operations like VT switch and xrandr operations.
    The libdrm patch has a performance impact, so we can't always apply 
it. Running X or an app under valgrind is also much slower than usual. So 
we have to rebuild libdrm every time we want to detect graphics memory 
issues.

_Tests that could be designed_ (we could patch libdrm, and restore it 
after our daily test is done):
    A daily test could be designed to catch 1.[ab] and 2.[ab] in the X 
server: we can do VT switches, various xrandr operations, and XVideo.
    A daily test could be designed to catch 1.[ab] and 2.[ab] in 
applications: we can run 3D apps.
    A manual test could be done as a P2 test in our RC cycle to catch 2.c.
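
For the manual 2.c check, the "do a bunch of operations, then check memory 
usage" step could be partly scripted. The following is only a rough sketch: 
the helper names are hypothetical, and which /proc/meminfo field best 
reflects graphics memory (and what growth is acceptable) is left to the 
tester.

```c
#include <stdio.h>
#include <string.h>

/* Return the value in kB of `key` (e.g. "MemFree") from text in
 * /proc/meminfo format, or -1 if the key is absent or malformed. */
long meminfo_field_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        /* Match "key:" exactly at the start of a line. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;
            if (sscanf(line + klen + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++;  /* step past the newline to the next line */
    }
    return -1;
}

/* Read /proc/meminfo and return `key` in kB, or -1 on failure. */
long read_meminfo_kb(const char *key)
{
    char buf[8192];
    size_t n;
    FILE *f = fopen("/proc/meminfo", "r");

    if (f == NULL)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return meminfo_field_kb(buf, key);
}

/* Return 1 if usage grew by more than tolerance_kb between two samples. */
int usage_grew(long before_kb, long after_kb, long tolerance_kb)
{
    return (after_kb - before_kb) > tolerance_kb;
}
```

A tester would sample a field with read_meminfo_kb() before and after 
repeating an operation many times, and treat growth beyond a chosen 
tolerance as something to look at by hand.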
   
The libdrm patch for testing is attached. You'll need valgrind-devel 
installed to compile it.

Any comments, or more ideas?


Thanks
    --Shuang

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: drm.valgrind.patch
URL: <http://lists.freedesktop.org/archives/intel-gfx/attachments/20090331/88c67ba2/attachment.ksh>

