[Intel-gfx] LOOKS GOOD: Possible regression in drm/i915 driver: memleak

Thu Dec 22 00:12:24 UTC 2022

On 20. 12. 2022. 20:34, Mirsad Todorovac wrote:
> On 12/20/22 16:52, Tvrtko Ursulin wrote:
> 
>> On 20/12/2022 15:22, srinivas pandruvada wrote:
>>> +Added DRM mailing list and maintainers
>>>
>>> On Tue, 2022-12-20 at 15:33 +0100, Mirsad Todorovac wrote:
>>>> Hi all,
>>>>
>>>> I have been unsuccessful to find any particular Intel i915 maintainer
>>>> emails, so my best bet is to post here, as you will must assuredly
>>>> already know them.
>>
>> For future reference you can use ${kernel_dir}/scripts/get_maintainer.pl -f ...
>>
>>>> The problem is a kernel memory leak that is repeatedly occurring
>>>> triggered during the execution of Chrome browser under the latest
>>>> 6.1.0+
>>>> kernel of this morning and Almalinux 8.6 on a Lenovo desktop box
>>>> with Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz CPU.
>>>>
>>>> The build is with KMEMLEAK, KASAN and MGLRU turned on during the
>>>> build,
>>>> on a vanilla mainline kernel from Mr. Torvalds' tree.
>>>>
>>>> The leaks look like this one:
>>>>
>>>> unreferenced object 0xffff888131754880 (size 64):
>>>>     comm "chrome", pid 13058, jiffies 4298568878 (age 3708.084s)
>>>>     hex dump (first 32 bytes):
>>>>       01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> ................
>>>>       00 00 00 00 00 00 00 00 00 80 1e 3e 83 88 ff ff
>>>> ...........>....
>>>>     backtrace:
>>>>       [<ffffffff9e9b5542>] slab_post_alloc_hook+0xb2/0x340
>>>>       [<ffffffff9e9bbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>>>>       [<ffffffff9e8f767a>] kmalloc_trace+0x2a/0xb0
>>>>       [<ffffffffc08dfde5>] drm_vma_node_allow+0x45/0x150 [drm]
>>>>       [<ffffffffc0b33315>] __assign_mmap_offset_handle+0x615/0x820
>>>> [i915]
>>>>       [<ffffffffc0b34057>] i915_gem_mmap_offset_ioctl+0x77/0x110
>>>> [i915]
>>>>       [<ffffffffc08bc5e1>] drm_ioctl_kernel+0x181/0x280 [drm]
>>>>       [<ffffffffc08bc9cd>] drm_ioctl+0x2dd/0x6a0 [drm]
>>>>       [<ffffffff9ea54744>] __x64_sys_ioctl+0xc4/0x100
>>>>       [<ffffffff9fbc0178>] do_syscall_64+0x58/0x80
>>>>       [<ffffffff9fc000aa>] entry_SYSCALL_64_after_hwframe+0x72/0xdc
>>>>
>>>> The complete list of leaks in attachment, but they seem similar or
>>>> the same.
>>>>
>>>> Please find attached lshw and kernel build config file.
>>>>
>>>> I will probably check the same parms on my laptop at home, which is
>>>> also
>>>> Lenovo, but a different hw config and Ubuntu 22.10.
>>
>> Could you try the below patch?
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> index c3ea243d414d..0b07534c203a 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
>> @@ -679,9 +679,10 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
>>   insert:
>>          mmo = insert_mmo(obj, mmo);
>>          GEM_BUG_ON(lookup_mmo(obj, mmap_type) != mmo);
>> -out:
>> +
>>          if (file)
>>                  drm_vma_node_allow(&mmo->vma_node, file);
>> +out:
>>          return mmo;
>>
>>   err:
>>
>> Maybe it is not the best fix but curious to know if it will make the leak go away.
> 
> Hi,
> 
> After 27 minutes uptime with the patched kernel it looks promising.
> It is much longer than it took for the buggy kernel to leak slabs.
> 
> Here is the output:
> 
> [root at pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
> [root at pc-mtodorov marvin]# cat !$
> cat /sys/kernel/debug/kmemleak
> unreferenced object 0xffff888105028d80 (size 16):
>    comm "kworker/u12:5", pid 359, jiffies 4294902898 (age 1620.144s)
>    hex dump (first 16 bytes):
>      6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00  memstick0.......
>    backtrace:
>      [<ffffffffb6bb5542>] slab_post_alloc_hook+0xb2/0x340
>      [<ffffffffb6bbbf5f>] __kmem_cache_alloc_node+0x1bf/0x2c0
>      [<ffffffffb6af8175>] __kmalloc_node_track_caller+0x55/0x160
>      [<ffffffffb6ae34a6>] kstrdup+0x36/0x60
>      [<ffffffffb6ae3508>] kstrdup_const+0x28/0x30
>      [<ffffffffb70d0757>] kvasprintf_const+0x97/0xd0
>      [<ffffffffb7c9cdf4>] kobject_set_name_vargs+0x34/0xc0
>      [<ffffffffb750289b>] dev_set_name+0x9b/0xd0
>      [<ffffffffc12d9201>] memstick_check+0x181/0x639 [memstick]
>      [<ffffffffb676e1d6>] process_one_work+0x4e6/0x7e0
>      [<ffffffffb676e556>] worker_thread+0x76/0x770
>      [<ffffffffb677b468>] kthread+0x168/0x1a0
>      [<ffffffffb6604c99>] ret_from_fork+0x29/0x50
> [root at pc-mtodorov marvin]# w
>   20:27:35 up 27 min,  2 users,  load average: 0.83, 1.15, 1.19
> USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
> marvin   tty2     tty2             20:01   27:10  10:12   2.09s /opt/google/chrome/chrome --type=utility --utility-sub-type=audio.m
> marvin   pts/1    -                20:01    0.00s  2:00   0.38s sudo bash
> [root at pc-mtodorov marvin]# uname -rms
> Linux 6.1.0-b6bb9676f216-mglru-kmemlk-kasan+ x86_64
> [root at pc-mtodorov marvin]#

As I hear no reply from Tvrtko, and there is already 1d5h uptime with no leaks (but
the kworker with memstick_check nag I couldn't bisect on the only box that reproduced it,
because something in hw was not supported in pre 4.16 kernels on the Lenovo V530S-07ICB.
Or I am doing something wrong.)

However, now I can find the memstick maintainers thanks to Tvrtko's hint.

If you no longer require my service, I would close this on my behalf.

I hope I did not cause too much trouble. The knowledgeable knew that this was not a security
risk, but only a bug. (30 leaks of 64 bytes each were hardly to exhaust memory in any realistic
time.)

However, having some experience with software development, I always preferred bugs reported
and fixed rather than concealed and lying in wait (or worse, found first by a motivated
adversary.) Forgive me this rant, I do not live from writing kernel drivers, this is just a
pet project as of time being ...

Thanks,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
-- 
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union