RaspberryPi4 Panic in net_ns_init()

Maxime Ripard maxime at cerno.tech
Wed Aug 31 14:42:05 UTC 2022


Hi,

Sorry for the fairly broad list of recipients: I'm not entirely sure
where the issue lies exactly, and multiple areas seem to be involved.

Martin reported an issue to me, discovered with the VC4 DRM driver, that
prevents the RaspberryPi4 from booting entirely. At boot, and apparently
before the console initialization, the board just dies.

It first appeared when both DYNAMIC_DEBUG and DRM_VC4 were built-in. We
started to look into what configuration would trigger it.

It looks like a good reproducer is:

ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make -j18 defconfig mod2yesconfig
./scripts/config -e CONFIG_DYNAMIC_DEBUG
ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- make -j18 olddefconfig

If we enable earlycon, we end up with:

[    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd083]
[    0.000000] Linux version 6.0.0-rc3 (max@houat) (aarch64-linux-gnu-gcc (GCC) 12.1.1 20220507 (Red Hat Cross 12.1.1-1), GNU ld version 2.37-7.fc36) #52 SMP PREEMPT Wed Aug 31 14:28:41 CEST 2022
[    0.000000] random: crng init done
[    0.000000] Machine model: Raspberry Pi 4 Model B Rev 1.1
[    0.000000] earlycon: uart8250 at MMIO32 0x00000000fe215040 (options '')
[    0.000000] printk: bootconsole [uart8250] enabled
[    0.000000] efi: UEFI not found.
[    0.000000] Reserved memory: bypass linux,cma node, using cmdline CMA params instead
[    0.000000] OF: reserved mem: node linux,cma compatible matching fail
[    0.000000] NUMA: No NUMA configuration found
[    0.000000] NUMA: Faking a node at [mem 0x0000000000000000-0x00000000fbffffff]
[    0.000000] NUMA: NODE_DATA [mem 0xfb815b40-0xfb817fff]
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000000000-0x000000003fffffff]
[    0.000000]   DMA32    [mem 0x0000000040000000-0x00000000fbffffff]
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000000000-0x000000003b3fffff]
[    0.000000]   node   0: [mem 0x0000000040000000-0x00000000fbffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x00000000fbffffff]
[    0.000000] On node 0, zone DMA32: 19456 pages in unavailable ranges
[    0.000000] On node 0, zone DMA32: 16384 pages in unavailable ranges
[    0.000000] cma: Reserved 512 MiB at 0x000000000ee00000
[    0.000000] percpu: Embedded 21 pages/cpu s48040 r8192 d29784 u86016
[    0.000000] Detected PIPT I-cache on CPU0
[    0.000000] CPU features: detected: Spectre-v2
[    0.000000] CPU features: detected: Spectre-v3a
[    0.000000] CPU features: detected: Spectre-v4
[    0.000000] CPU features: detected: Spectre-BHB
[    0.000000] CPU features: detected: Kernel page table isolation (KPTI)
[    0.000000] CPU features: detected: ARM erratum 1742098
[    0.000000] CPU features: detected: ARM errata 1165522, 1319367, or 1530923
[    0.000000] Fallback order for Node 0: 0
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 996912
[    0.000000] Policy zone: DMA32
[    0.000000] Kernel command line: video=Composite-1:720x480@60i,margin_left=32,margin_right=32,margin_top=32,margin_bottom=32 dma.dmachans=0x37f5 bcm2709.boardrev=0xc03111 bcm2709.serial=0xb7f44626 bcm2709.uart_clock=48000000 bcm2709.disk_led_gpio=42 bcm2709.disk_led_active_low=0 smsc95xx.macaddr=DC:A6:32:0E:F7:01 vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000  root=/dev/nfs nfsroot=192.168.20.10:/srv/nfs/rpi/bullseye64 rw 8250.nr_uarts=1 cma=512M ip=dhcp console=ttyS0,115200 earlycon=uart8250,mmio32,0xfe215040
[    0.000000] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes, linear)
[    0.000000] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes, linear)
[    0.000000] mem auto-init: stack:all(zero), heap alloc:off, heap free:off
[    0.000000] software IO TLB: area num 4.
[    0.000000] software IO TLB: mapped [mem 0x0000000037400000-0x000000003b400000] (64MB)
[    0.000000] Memory: 3312220K/4050944K available (30656K kernel code, 5924K rwdata, 18912K rodata, 11584K init, 672K bss, 214436K reserved, 524288K cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] rcu: Preemptible hierarchical RCU implementation.
[    0.000000] rcu: 	RCU event tracing is enabled.
[    0.000000] rcu: 	RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=4.
[    0.000000] 	Trampoline variant of Tasks RCU enabled.
[    0.000000] 	Tracing variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
[    0.000000] Root IRQ handler: gic_handle_irq
[    0.000000] GIC: Using split EOI/Deactivate mode
[    0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    0.000000] arch_timer: cp15 timer(s) running at 54.00MHz (phys).
[    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xc743ce346, max_idle_ns: 440795203123 ns
[    0.000001] sched_clock: 56 bits at 54MHz, resolution 18ns, wraps every 4398046511102ns
[    0.008648] Console: colour dummy device 80x25
[    0.013237] Calibrating delay loop (skipped), value calculated using timer frequency.. 108.00 BogoMIPS (lpj=216000)
[    0.023803] pid_max: default: 32768 minimum: 301
[    0.028540] LSM: Security Framework initializing
[    0.033252] Kernel panic - not syncing: Could not allocate generic netns
[    0.040026] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.0.0-rc3 #52
[    0.046363] Hardware name: Raspberry Pi 4 Model B Rev 1.1 (DT)
[    0.052255] Call trace:
[    0.054721]  dump_backtrace+0xe4/0x124
[    0.058525]  show_stack+0x1c/0x5c
[    0.061878]  dump_stack_lvl+0x64/0x80
[    0.065582]  dump_stack+0x1c/0x38
[    0.068932]  panic+0x170/0x328
[    0.072020]  net_ns_init+0x88/0x134
[    0.075548]  start_kernel+0x628/0x69c
[    0.079251]  __primary_switched+0xbc/0xc4
[    0.083311] ---[ end Kernel panic - not syncing: Could not allocate generic netns ]---

So it seems that net_alloc_generic() fails, and the only reason I can
see for that is kzalloc() failing, so now I'm super confused.
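
To double-check which source lines the trace entries point at, something
like the following could be used. It is only a sketch: the trace_entries
helper and the boot.log file name are made up for illustration, and
scripts/faddr2line needs a vmlinux built with debug info.

```shell
#!/bin/sh
# Sketch: extract the "symbol+0xoffset/0xsize" entries from a saved boot log
# so they can be handed to scripts/faddr2line.
# Assumptions (not from the report): the helper name and boot.log are hypothetical.

trace_entries() {
    # Keep only lines of the form "[ timestamp ]  symbol+0xoff/0xsize"
    # and print their last field (the symbol+offset/size token).
    grep -E '^\[ *[0-9.]+\] +[A-Za-z_][A-Za-z0-9_.]*\+0x[0-9a-f]+/0x[0-9a-f]+$' |
        awk '{ print $NF }'
}

# Usage, from the top of the kernel tree:
#   trace_entries < boot.log | xargs ./scripts/faddr2line vmlinux
```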

It looks like the board has plenty (~3GB) of RAM available at boot, and
most importantly I don't see the relationship between a DRM driver,
DYNAMIC_DEBUG, and SLAB or the network namespace.

After a few more experiments:

 * ./scripts/config -e CONFIG_DYNAMIC_DEBUG -d CONFIG_DRM_VC4 still has
   that panic, so it looks like VC4 itself isn't involved.

 * ./scripts/config -e CONFIG_DYNAMIC_DEBUG -d CONFIG_DRM works, so DRM
   seems to be involved somehow. It has a number of memory management
   dependencies, so it's probably a side effect of DRM being there.

 * make defconfig mod2yesconfig (so without DYNAMIC_DEBUG, with DRM)
   works too.
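
For anyone wanting to compare these configurations side by side, the
experiments above could be scripted roughly as follows. This is a sketch
under assumptions: the out-* directory names, the DRY_RUN switch and the
run/variant helpers are illustrative and not part of my actual setup. By
default it only prints the commands; set DRY_RUN=0 from the top of a
kernel tree to run them.

```shell
#!/bin/sh
# Sketch: generate the four configurations discussed above into separate
# output directories (via O=) so their .config files can be diffed.
# Assumptions: out-* names, DRY_RUN, run() and variant() are hypothetical.

DRY_RUN="${DRY_RUN:-1}"   # 1: print commands only; 0: execute them
MAKE="make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# variant NAME [scripts/config arguments...]
variant() {
    out="out-$1"
    shift
    # $MAKE is intentionally unquoted so it splits into command + arguments.
    run $MAKE O="$out" defconfig mod2yesconfig
    if [ "$#" -gt 0 ]; then
        run ./scripts/config --file "$out/.config" "$@"
    fi
    run $MAKE O="$out" olddefconfig
}

variant panic     -e CONFIG_DYNAMIC_DEBUG                    # panics
variant no-vc4    -e CONFIG_DYNAMIC_DEBUG -d CONFIG_DRM_VC4  # still panics
variant no-drm    -e CONFIG_DYNAMIC_DEBUG -d CONFIG_DRM      # boots
variant no-dyndbg                                            # boots

# List the symbols that differ between a booting and a panicking config.
run ./scripts/diffconfig out-no-drm/.config out-panic/.config
```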

So it looks to me like there's indeed some interaction between DRM,
DYNAMIC_DEBUG, SLAB and/or the network namespace, but I'm not entirely
sure where to go from there. Any ideas?

Maxime

