[Beignet] Unrecoverable system lockup when allocating too much memory

Zou, Nanhai nanhai.zou at intel.com
Wed Nov 4 18:50:04 PST 2015


Would this behave better if you turned off memory overcommit via procfs?
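
For reference, a minimal sketch of how the current overcommit policy could be inspected. It assumes the standard Linux file /proc/sys/vm/overcommit_memory; the helper names are hypothetical and only for illustration. Whether strict accounting changes the lockup reported below is not established in this thread.

```python
from pathlib import Path

# Standard Linux procfs location of the overcommit policy.
OVERCOMMIT_PATH = Path("/proc/sys/vm/overcommit_memory")

# Meaning of the three documented modes.
MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting (no overcommit beyond the commit limit)",
}


def describe_overcommit(raw):
    """Translate the raw file content (e.g. '0\n') into a description."""
    mode = int(raw.strip())
    return MODES.get(mode, "unknown mode %d" % mode)


def read_overcommit(path=OVERCOMMIT_PATH):
    """Return a description of the current mode, or None if unreadable."""
    try:
        return describe_overcommit(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print(read_overcommit())
```

Switching to strict accounting requires root, for example `sysctl vm.overcommit_memory=2`, and lasts until reboot unless made persistent.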

Thanks
Zou Nanhai

From: Beignet [mailto:beignet-bounces at lists.freedesktop.org] On Behalf Of Lorenzo Pistone
Sent: Thursday, November 05, 2015 12:22 AM
To: beignet at lists.freedesktop.org
Subject: [Beignet] Unrecoverable system lockup when allocating too much memory

Hello,
In the ArrayFire test suite (https://github.com/arrayfire/arrayfire/) there is a program that tries to allocate one terabyte of OpenCL memory. This is supposed to fail, but on Beignet 1.1.1 it freezes the system completely, to the point that it no longer responds to magic SysRq keys. Here's the dmesg:
[ 1113.589788] Unable to purge GPU memory due lock contention.
[ 1113.592389] Xorg invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[ 1113.592399] Xorg cpuset=/ mems_allowed=0
[ 1113.592408] CPU: 2 PID: 1021 Comm: Xorg Not tainted 4.2.3-200.fc22.x86_64 #1
[ 1113.592411] Hardware name: LENOVO 2306CTO/2306CTO, BIOS G2ETA4WW (2.64 ) 04/09/2015
[ 1113.592414]  0000000000000000 0000000098a7f49e ffff8800c1457968 ffffffff8177220a
[ 1113.592421]  0000000000000000 ffff8800bee20000 ffff8800c14579f8 ffffffff8177174e
[ 1113.592427]  ffffffffc14579d8 fffeefff00000000 0000000000000001 0000000000000003
[ 1113.592433] Call Trace:
[ 1113.592441]  [<ffffffff8177220a>] dump_stack+0x45/0x57
[ 1113.592444]  [<ffffffff8177174e>] dump_header+0x86/0x207
[ 1113.592451]  [<ffffffff811a627b>] oom_kill_process+0x1db/0x3a0
[ 1113.592455]  [<ffffffff811a69ec>] out_of_memory+0x54c/0x5a0
[ 1113.592459]  [<ffffffff811ac9d8>] __alloc_pages_nodemask+0x838/0x980
[ 1113.592464]  [<ffffffff811f4f71>] alloc_pages_current+0x91/0x100
[ 1113.592468]  [<ffffffff811a2a0b>] __page_cache_alloc+0xab/0xc0
[ 1113.592473]  [<ffffffff811a4c34>] filemap_fault+0x154/0x410
[ 1113.592478]  [<ffffffff811d10de>] __do_fault+0x4e/0xf0
[ 1113.592482]  [<ffffffff811d6218>] handle_mm_fault+0xf58/0x17d0
[ 1113.592488]  [<ffffffff810acda4>] ? signal_setup_done+0x74/0xc0
[ 1113.592494]  [<ffffffff81065447>] __do_page_fault+0x197/0x400
[ 1113.592499]  [<ffffffff810656df>] do_page_fault+0x2f/0x80
[ 1113.592503]  [<ffffffff8177ab78>] page_fault+0x28/0x30
[ 1113.592506] Mem-Info:
[ 1113.592512] active_anon:47207 inactive_anon:813455 isolated_anon:0
[ 1113.592512]  active_file:35 inactive_file:49 isolated_file:0
[ 1113.592512]  unevictable:8 dirty:0 writeback:0 unstable:0
[ 1113.592512]  slab_reclaimable:9502 slab_unreclaimable:10161
[ 1113.592512]  mapped:3095 shmem:797814 pagetables:3923 bounce:0
[ 1113.592512]  free:5677 free_pcp:239 free_cma:0
[ 1113.592518] Node 0 DMA free:14128kB min:32kB low:40kB high:48kB active_anon:8kB inactive_anon:600kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15984kB managed:15900kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:308kB slab_reclaimable:140kB slab_unreclaimable:200kB kernel_stack:16kB pagetables:8kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[ 1113.592527] lowmem_reserve[]: 0 3101 3525 3525
[ 1113.592535] Node 0 DMA32 free:7824kB min:6572kB low:8212kB high:9856kB active_anon:167080kB inactive_anon:2878908kB active_file:128kB inactive_file:184kB unevictable:32kB isolated(anon):0kB isolated(file):0kB present:3258408kB managed:3178976kB mlocked:32kB dirty:0kB writeback:0kB mapped:10932kB shmem:2834484kB slab_reclaimable:29864kB slab_unreclaimable:32436kB kernel_stack:2624kB pagetables:13472kB unstable:0kB bounce:0kB free_pcp:732kB local_pcp:248kB free_cma:0kB writeback_tmp:0kB pages_scanned:744764 all_unreclaimable? yes
[ 1113.592543] lowmem_reserve[]: 0 0 424 424
[ 1113.592550] Node 0 Normal free:756kB min:896kB low:1120kB high:1344kB active_anon:21740kB inactive_anon:374312kB active_file:12kB inactive_file:12kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:497664kB managed:434340kB mlocked:0kB dirty:0kB writeback:0kB mapped:1448kB shmem:356464kB slab_reclaimable:8004kB slab_unreclaimable:8008kB kernel_stack:1232kB pagetables:2212kB unstable:0kB bounce:0kB free_pcp:224kB local_pcp:124kB free_cma:0kB writeback_tmp:0kB pages_scanned:70740 all_unreclaimable? yes
[ 1113.592558] lowmem_reserve[]: 0 0 0 0
[ 1113.592565] Node 0 DMA: 14*4kB (UEM) 17*8kB (UEM) 7*16kB (UM) 8*32kB (UM) 2*64kB (UM) 3*128kB (UEM) 3*256kB (UEM) 0*512kB 2*1024kB (EM) 1*2048kB (E) 2*4096kB (M) = 14128kB
[ 1113.592593] Node 0 DMA32: 1606*4kB (UEM) 175*8kB (M) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7824kB
[ 1113.592613] Node 0 Normal: 163*4kB (UM) 13*8kB
[ 1113.592778] [  852]     0   852   128873      538      69       4        0             0 NetworkManager
[ 1113.592784] [  929]     0   929    53368      335      61       3        0             0 cupsd
[ 1113.592789] [  943]     0   943     5950       46      16       3        0             0 atd
[ 1113.592794] [  944]     0   944    31444      155      17       3        0             0 crond
[ 1113.592800] [  948]     0   948    86544      305      39       4        0             0 lightdm
[ 1113.592805] [ 1021]     0  1021    69001     7677     105       3        0             0 Xorg
[ 1113.592810] [ 1028]     0  1028    52244      241      57       3        0             0 abrt-watch-log
[ 1113.592815] [ 1030]     0  1030   104359      310     117       3        0             0 abrt-dump-journ
[ 1113.592821] [ 1038]     0  1038    12870      185      31       3        0             0 wpa_supplicant
[ 1113.592826] [ 1040]   992  1040   101756      517      52       3        0             0 colord
[ 1113.592832] [ 1195]   989  1195    11249      167      27       3        0             0 systemd
[ 1118.642337] Unable to purge GPU memory due lock contention.
[ 1123.656548] Unable to purge GPU memory due lock contention.
[ 1128.663802] Unable to purge GPU memory due lock contention.
[ 1133.669115] Unable to purge GPU memory due lock contention.
[ 1138.685468] Unable to purge GPU memory due lock contention.
[ 1143.903918] Unable to purge GPU memory due lock contention.
[ 1143.951596] sysrq: SysRq : Kill All Tasks
[ 1149.058252] Unable to purge GPU memory due lock contention.
[ 1154.065607] Unable to purge GPU memory due lock contention.
[ 1159.088920] Unable to purge GPU memory due lock contention.
[ 1164.094357] Unable to purge GPU memory due lock contention.
This can be triggered without privileges on the local system, so I guess it's also a DoS.
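
To illustrate why a one-terabyte request is expected to fail cleanly, here is a rough sketch of the kind of size check an OpenCL runtime or a cautious caller could apply before touching memory. The function name and the device limits are hypothetical; real code would obtain the limits from clGetDeviceInfo with CL_DEVICE_MAX_MEM_ALLOC_SIZE and CL_DEVICE_GLOBAL_MEM_SIZE. This is not how Beignet or ArrayFire is known to implement it.

```python
ONE_TIB = 1 << 40  # the request size described in the report


def allocation_is_plausible(requested_bytes, max_alloc_bytes, global_mem_bytes):
    """Return True only if the request fits both device limits.

    The OpenCL specification has clCreateBuffer fail with
    CL_INVALID_BUFFER_SIZE when the size is 0 or exceeds
    CL_DEVICE_MAX_MEM_ALLOC_SIZE; this mirrors that rule and
    additionally compares against the device's global memory size.
    """
    if requested_bytes <= 0:
        return False
    return (requested_bytes <= max_alloc_bytes
            and requested_bytes <= global_mem_bytes)


if __name__ == "__main__":
    # Illustrative limits only (1 GiB max allocation, 2 GiB global memory);
    # these are assumptions, not values measured on the reporter's machine.
    print(allocation_is_plausible(ONE_TIB, 1 << 30, 2 << 30))
```

A check like this only rejects a single oversized request; it does not cover many smaller allocations that together exhaust memory.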

Regards.
Lorenzo Pistone