High memory usage by ModemManager
Aleksander Morgado
aleksander at aleksander.es
Wed Jan 8 09:00:51 UTC 2020
Hey
>
> 1. We are not actively resetting the modem anymore. We found that this
> only made matters worse. Essentially the kernel was not releasing the USB
> interfaces of the modem completely, which results in ModemManager not
> acquiring the right ports.
>
Hummm... did you have custom udev rules acting as MM port type hints? MM
doesn't rely on interface names for its hints, so interface naming shouldn't
matter when grabbing ports.
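If you want to double-check, dumping the udev properties of each port and
comparing them with what MM actually grabbed usually settles it; something
along these lines (the tty path and modem index are just examples):

  # Show any MM-specific udev tags applied to a given port
  udevadm info -q property -n /dev/ttyUSB2 | grep ID_MM_

  # List modems and the ports ModemManager ended up grabbing
  mmcli -L
  mmcli -m 0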
> 2. We are now using mmcli to turn on the GPS. This works without any
> issues.
>
Good!
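(For the archives, the usual sequence is something like the following; the
modem index and the chosen location sources are just examples:)

  # Enable GPS location sources on modem 0
  mmcli -m 0 --location-enable-gps-nmea --location-enable-gps-raw

  # Query the last gathered location info / NMEA traces
  mmcli -m 0 --location-get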
> 3. As you said, it is only possible to have one instance of ModemManager;
> I thought there were 3 separate instances, but this is not the case.
>
Yes.
> 4. We are not using qmicli ourselves to communicate with the modem via the
> QMI interface, all interactions are done by ModemManager.
>
Ok.
> 5. We have updated ModemManager to 1.12.2 and libqmi to 1.24.2
>
> The problem we see with ModemManager is that it is slowly using more and
> more memory while using the QMI interface of our Quectel BG96 modem.
>
I have another report, from a company I work with, showing the same
symptoms. I don't have a clear picture yet of what the issue could be,
because none of my valgrind tests have shown any relevant memory leak. If
you could get a proper valgrind log, we could give it a look.
> We have done some more research and found the following things:
>
> Our system has 512MB of RAM. The OOM killer kills ModemManager after 12 hours,
> because by that time it is using half of the available memory:
>
>
Ouch!!! Really?
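If it's easy for you to reproduce, logging the RSS of the process
periodically would also help confirm this is a steady leak rather than a
one-off spike; a minimal sketch (log path and interval are arbitrary):

  # Append ModemManager's resident set size to a log file once a minute
  while true; do
      echo "$(date +%s) $(grep VmRSS /proc/$(pidof ModemManager)/status)" >> /data/mm-rss.log
      sleep 60
  done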
> [43036.640048] kworker/u2:1 invoked oom-killer: gfp_mask=0x14082c2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
> [43036.697846] kworker/u2:1 cpuset=/ mems_allowed=0
> [43036.701205] CPU: 0 PID: 18834 Comm: kworker/u2:1 Not tainted 4.14.78 #1
> [43036.706521] Hardware name: Freescale i.MX6 UltraLite (Device Tree)
> [43036.711544] Workqueue: brcmf_wq/mmc0:0001:1 brcmf_sdio_dataworker [brcmfmac]
> [43036.717326] [<8010f7b0>] (unwind_backtrace) from [<8010b2bc>] (show_stack+0x10/0x14)
> [43036.723780] [<8010b2bc>] (show_stack) from [<80a2cf5c>] (dump_stack+0x78/0x8c)
> [43036.729713] [<80a2cf5c>] (dump_stack) from [<801ceee8>] (dump_header+0xa0/0x230)
> [43036.735819] [<801ceee8>] (dump_header) from [<801ce1f8>] (oom_kill_process+0xb0/0x514)
> [43036.742444] [<801ce1f8>] (oom_kill_process) from [<801cec2c>] (out_of_memory+0x15c/0x31c)
> [43036.749329] [<801cec2c>] (out_of_memory) from [<801d3884>] (__alloc_pages_nodemask+0xd14/0xf34)
> [43036.756739] [<801d3884>] (__alloc_pages_nodemask) from [<802054a0>] (__vmalloc_node_range+0x114/0x230)
> [43036.764757] [<802054a0>] (__vmalloc_node_range) from [<80205858>] (__vmalloc_node.constprop.11+0x48/0x50)
> [43036.773030] [<80205858>] (__vmalloc_node.constprop.11) from [<802058c0>] (vzalloc+0x2c/0x34)
> [43036.780267] [<802058c0>] (vzalloc) from [<7f0369f0>] (brcmf_sdio_dataworker+0x1cbc/0x216c [brcmfmac])
> [43036.788286] [<7f0369f0>] (brcmf_sdio_dataworker [brcmfmac]) from [<80141298>] (process_one_work+0x1d8/0x410)
> [43036.796825] [<80141298>] (process_one_work) from [<80141f2c>] (worker_thread+0x50/0x598)
> [43036.803623] [<80141f2c>] (worker_thread) from [<80146d90>] (kthread+0x14c/0x154)
> [43036.809728] [<80146d90>] (kthread) from [<80107a48>] (ret_from_fork+0x14/0x2c)
> [43037.167843] Mem-Info:
> [43037.168859] active_anon:98149 inactive_anon:17916 isolated_anon:0
>  active_file:12 inactive_file:25 isolated_file:0
>  unevictable:0 dirty:0 writeback:0 unstable:0
>  slab_reclaimable:1741 slab_unreclaimable:4386
>  mapped:14080 shmem:33723 pagetables:936 bounce:0
>  free:650 free_pcp:45 free_cma:0
> [43037.277821] Node 0 active_anon:392596kB inactive_anon:71664kB active_file:48kB inactive_file:52kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:56320kB dirty:0kB writeback:0kB shmem:134892kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
> [43037.367828] Normal free:2680kB min:2704kB low:3380kB high:4056kB active_anon:392596kB inactive_anon:71664kB active_file:48kB inactive_file:52kB unevictable:0kB writepending:0kB present:524288kB managed:503600kB mlocked:0kB kernel_stack:1696kB pagetables:3744kB bounce:0kB free_pcp:120kB local_pcp:120kB free_cma:0kB
> [43037.477797] lowmem_reserve[]: 0 0 0
> [43037.480024] Normal: 82*4kB (UME) 72*8kB (UME) 31*16kB (UM) 16*32kB (UM) 6*64kB (U) 3*128kB (U) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB 0*32768kB = 2680kB
> [43037.541295] 33751 total pagecache pages
> [43037.543847] 0 pages in swap cache
> [43037.545863] Swap cache stats: add 0, delete 0, find 0/0
> [43037.587944] Free swap  = 0kB
> [43037.589582] Total swap = 0kB
> [43037.591169] 131072 pages RAM
> [43037.592750] 0 pages HighMem/MovableOnly
> [43037.595286] 5172 pages reserved
> [43037.597126] 0 pages cma reserved
> [43037.617826] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
> [43037.657894] [  109]     0   109     7060     2073      50       0        0             0 systemd-journal
> [43037.666094] [  115]     0   115     2058     1069       8       0        0         -1000 systemd-udevd
> [43037.713088] [  132]  1000   132     1695     1139       7       0        0          -900 dbus-daemon
> [43037.743755] [  133]     0   133     1896     1092       8       0        0             0 haveged
> [43037.787950] [  135]     0   135    63780    53513     116       0        0             0 ModemManager
> [43037.795893] [  138]     0   138     1849      856       8       0        0             0 systemd-logind
> [43037.837831] [  160]     0   160      635       74       5       0        0             0 getty
> [43037.845169] [  186]     0   186      859      582       6       0        0             0 modem.sh
> [43037.917936] [  193]  1006   193     1971      845       8       0        0             0 systemd-network
> [43037.926136] [  196]     0   196    17518     3188      23       0        0             0 NetworkManager
> [43037.971235] [  244]  1007   244     2608     1599      10       0        0             0 systemd-resolve
> [43038.004486] [  245]     0   245     1574     1221       6       0        0             0 openvpn
> [43038.042983] [  275]     0   275      826      600       6       0        0             0 led-control.sh
> [43038.077907] [  276]     0   276     3115      469       8       0        0             0 chronyd
> [43038.085472] [  304]     0   304     2073     1085       8       0        0             0 wpa_supplicant
> [43038.126229] [  320]     0   320     6909      899      12       0        0             0 qmi-proxy
> [43038.177919] [  366]     0   366   240369     9498      48       0        0          -500 balena-engine-d
> [43038.186169] [  386]     0   386   238336     7960      45       0        0          -500 balena-engine-c
> [43038.224253] [  398]     0   398     1350      861       7       0        0         -1000 sshd
> [43038.267853] [  578]     0   578   225786     6921      35       0        0          -999 balena-engine-c
> [43038.276058] [  585]     0   585   223993     7224      33       0        0          -999 balena-engine-c
> [43038.333144] [  587]     0   587   226042     7248      38       0        0          -999 balena-engine-c
> [43038.357927] [  602]     0   602   223993     7040      34       0        0          -999 balena-engine-c
> [43038.397439] [  688]     0   688     1290       75       7       0        0             0 enqueue
> [43038.427860] [  696]     0   696    42418     5093     104       0        0             0 node
> [43038.467867] [  702]   999   702     4385      206       9       0        0             0 redis-server
> [43038.475915] [  717]     0   717    32997     6244     106       0        0             0 datahub-stats
> [43038.527834] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
> [43038.532447] brcmfmac: _brcmf_set_multicast_list: Setting allmulti failed, -110
> [43038.538498] [  934]     0   934      826      551       6       0        0             0 app-layer-updat
> [43038.538511] [  935]     0   935      826      588       7       0        0             0 app-layer-healt
> [43038.538520] [11460]     0 11460     1380      185       7       0        0             0 sleep
> [43038.538532] [ 6036]     0  6036   228091     6967      36       0        0          -999 balena-engine-c
> [43038.538542] [ 6063]     0  6063    29799     2932      74       0        0             0 ivh2-msgs-deque
> [43038.538553] [18785]     0 18785     1380      184       7       0        0             0 sleep
> [43038.538563] [18901]     0 18901   213220     4860      24       0        0             0 balena-engine
> [43038.538575] [18902]     0 18902      540       74       4       0        0             0 grep
> [43038.538584] [18907]     0 18907      826      321       5       0        0             0 led-control.sh
> [43038.538639] Out of memory: Kill process 135 (ModemManager) score 413 or sacrifice child
> [43038.538685] Killed process 135 (ModemManager) total-vm:255120kB, anon-rss:204780kB, file-rss:4kB, shmem-rss:9268kB
> [43039.110517] oom_reaper: reaped process 135 (ModemManager), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
>
> After we found out that the OOM killer was killing ModemManager, we ran
> ModemManager in debug mode under Valgrind's Memcheck to try to capture
> the memory leaks. See the log below:
>
> ==2705== Memcheck, a memory error detector
> ==2705== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==2705== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
> ==2705== Command: /usr/sbin/ModemManager --debug --filter-policy=WHITELIST-ONLY
> ==2705== Parent PID: 2598
> ==2705==
> ==2705== Warning: noted but unhandled ioctl 0x540c with no size/direction hints.
> ==2705==    This could cause spurious value errors to appear.
> ==2705==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
> ==2756==
> ==2756== HEAP SUMMARY:
> ==2756==     in use at exit: 237,156 bytes in 5,555 blocks
> ==2756==   total heap usage: 30,658 allocs, 25,103 frees, 8,699,112 bytes allocated
> ==2756==
> ==2756== 12 bytes in 1 blocks are indirectly lost in loss record 1 of 15
> ==2756==    at 0x4847D74: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2756==
> ==2756== 32 bytes in 4 blocks are possibly lost in loss record 2 of 15
> ==2756==    at 0x4847D74: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2756==
> ==2756== 32 bytes in 4 blocks are possibly lost in loss record 3 of 15
> ==2756==    at 0x4847C88: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2756==
> ==2756== 52 (40 direct, 12 indirect) bytes in 4 blocks are definitely lost in loss record 4 of 15
> ==2756==    at 0x4847D74: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2756==
> ==2756== 376 bytes in 4 blocks are possibly lost in loss record 5 of 15
> ==2756==    at 0x484AA44: realloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2756==
> ==2756== 680 bytes in 16 blocks are possibly lost in loss record 6 of 15
> ==2756==    at 0x484A7C4: calloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2756==
> ==2756== 688 bytes in 43 blocks are still reachable in loss record 7 of 15
> ==2756==    at 0x4DEF9E4: g_closure_new_simple (in /usr/lib/libgobject-2.0.so.0.5600.4)
> ==2756==
> ==2756== 928 bytes in 43 blocks are still reachable in loss record 8 of 15
> ==2756==    at 0x4DEF99C: g_closure_new_simple (in /usr/lib/libgobject-2.0.so.0.5600.4)
> ==2756==
> ==2756== 4,080 bytes in 2 blocks are definitely lost in loss record 9 of 15
> ==2756==    at 0x484A7C4: calloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2756==
> ==2756== 8,684 bytes in 296 blocks are still reachable in loss record 10 of 15
> ==2756==    at 0x4847C88: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2756==
> ==2756== 8,700 bytes in 217 blocks are still reachable in loss record 11 of 15
> ==2756==    at 0x4E0AF58: g_type_create_instance (in /usr/lib/libgobject-2.0.so.0.5600.4)
> ==2756==
> ==2756== 10,512 bytes in 217 blocks are still reachable in loss record 12 of 15
> ==2756==    at 0x4E0AF9C: g_type_create_instance (in /usr/lib/libgobject-2.0.so.0.5600.4)
> ==2756==
> ==2756== 12,067 bytes in 116 blocks are still reachable in loss record 13 of 15
> ==2756==    at 0x484AA44: realloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2756==
> ==2756== 73,835 bytes in 3,023 blocks are still reachable in loss record 14 of 15
> ==2756==    at 0x4847D74: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2756==
> ==2756== 93,754 bytes in 1,305 blocks are still reachable in loss record 15 of 15
> ==2756==    at 0x484A7C4: calloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2756==
> ==2756== LEAK SUMMARY:
> ==2756==    definitely lost: 4,120 bytes in 6 blocks
> ==2756==    indirectly lost: 12 bytes in 1 blocks
> ==2756==      possibly lost: 1,120 bytes in 28 blocks
> ==2756==    still reachable: 209,168 bytes in 5,260 blocks
> ==2756==                       of which reachable via heuristic:
> ==2756==                         newarray           : 1,976 bytes in 75 blocks
> ==2756==         suppressed: 0 bytes in 0 blocks
> ==2756==
> ==2756== For counts of detected and suppressed errors, rerun with: -v
> ==2756== ERROR SUMMARY: 6 errors from 6 contexts (suppressed: 0 from 0)
> ==2705==
> ==2705== HEAP SUMMARY:
> ==2705==     in use at exit: 874,364 bytes in 24,674 blocks
> ==2705==   total heap usage: 2,690,239 allocs, 2,665,565 frees, 812,938,818 bytes allocated
> ==2705==
> ==2705== 32 bytes in 4 blocks are possibly lost in loss record 1 of 14
> ==2705==    at 0x4847C88: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2705==
> ==2705== 480 bytes in 4 blocks are possibly lost in loss record 2 of 14
> ==2705==    at 0x484AA44: realloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2705==
> ==2705== 728 bytes in 10 blocks are possibly lost in loss record 3 of 14
> ==2705==    at 0x4847D74: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2705==
> ==2705== 1,032 bytes in 18 blocks are possibly lost in loss record 4 of 14
> ==2705==    at 0x484A7C4: calloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2705==
> ==2705== 1,744 bytes in 109 blocks are still reachable in loss record 5 of 14
> ==2705==    at 0x4DEF9E4: g_closure_new_simple (in /usr/lib/libgobject-2.0.so.0.5600.4)
> ==2705==
> ==2705== 2,256 bytes in 109 blocks are still reachable in loss record 6 of 14
> ==2705==    at 0x4DEF99C: g_closure_new_simple (in /usr/lib/libgobject-2.0.so.0.5600.4)
> ==2705==
> ==2705== 9,244 bytes in 364 blocks are still reachable in loss record 7 of 14
> ==2705==    at 0x4847C88: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2705==
> ==2705== 15,776 bytes in 470 blocks are still reachable in loss record 8 of 14
> ==2705==    at 0x4E0AF9C: g_type_create_instance (in /usr/lib/libgobject-2.0.so.0.5600.4)
> ==2705==
> ==2705== 17,152 bytes in 229 blocks are still reachable in loss record 9 of 14
> ==2705==    at 0x484AA44: realloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2705==
> ==2705== 23,512 bytes in 470 blocks are still reachable in loss record 10 of 14
> ==2705==    at 0x4E0AF58: g_type_create_instance (in /usr/lib/libgobject-2.0.so.0.5600.4)
> ==2705==
> ==2705== 78,223 bytes in 498 blocks are indirectly lost in loss record 11 of 14
> ==2705==    at 0x4847D74: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2705==
> ==2705== 93,063 bytes in 3,062 blocks are still reachable in loss record 12 of 14
> ==2705==    at 0x4847D74: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2705==
> ==2705== 110,326 bytes in 1,844 blocks are still reachable in loss record 13 of 14
> ==2705==    at 0x484A7C4: calloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2705==
> ==2705== 551,535 (473,312 direct, 78,223 indirect) bytes in 16,904 blocks are definitely lost in loss record 14 of 14
> ==2705==    at 0x4847D74: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so)
> ==2705==
> ==2705== LEAK SUMMARY:
> ==2705==    definitely lost: 473,312 bytes in 16,904 blocks
>
This is a really bad memory leak, indeed.
> ==2705==    indirectly lost: 78,223 bytes in 498 blocks
> ==2705==      possibly lost: 2,272 bytes in 36 blocks
> ==2705==    still reachable: 273,073 bytes in 6,657 blocks
> ==2705==                       of which reachable via heuristic:
> ==2705==                         newarray           : 5,276 bytes in 150 blocks
> ==2705==         suppressed: 0 bytes in 0 blocks
> ==2705==
> ==2705== For counts of detected and suppressed errors, rerun with: -v
> ==2705== ERROR SUMMARY: 5 errors from 5 contexts (suppressed: 0 from 0)
>
> As you can see, we are not able to get the stack traces for these leaks to
> show up in the valgrind output. The command we are currently using is:
>
> systemctl stop ModemManager && G_SLICE=always-malloc valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes --log-file=/data/valgrind.out /usr/bin/ModemManager --debug --filter-policy=WHITELIST-ONLY
>
> Can you help us with getting the stack traces of the memory leaks that
> memcheck finds?
>
You need to build ModemManager (and if possible, glib) with debug symbols
enabled and install it in the system. Would you be able to do this?
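The exact steps depend on your image/build system, but roughly something
like this for a 1.12.x autotools build (prefix, flags and paths are just
examples, adapt them to your setup):

  # Rebuild ModemManager with debug symbols and no optimization,
  # so valgrind can resolve the leak stack traces
  git clone https://gitlab.freedesktop.org/mobile-broadband/ModemManager.git
  cd ModemManager
  git checkout 1.12.2
  ./autogen.sh --prefix=/usr CFLAGS="-ggdb -O0"
  make && make install

  # Then re-run under valgrind, ideally with more stack frames per record
  systemctl stop ModemManager
  G_SLICE=always-malloc G_DEBUG=gc-friendly valgrind --leak-check=full \
      --show-leak-kinds=all --num-callers=50 --log-file=/data/valgrind.out \
      /usr/sbin/ModemManager --debug --filter-policy=WHITELIST-ONLY

With debug symbols in place, the "definitely lost" records should then show
the full ModemManager/libqmi call chains instead of just the malloc frame.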
--
Aleksander
https://aleksander.es