Panic with bochs_drm module on qemu-system-sparc64

Fri Jun 30 04:33:52 UTC 2017

Hi all,

I'm one of the QEMU SPARC maintainers and I've been investigating why
enabling the fb console via the bochs_drm module causes a panic on
startup. The reproducer with QEMU 2.9 is easy:

$ ./qemu-system-sparc64 -m 512 -kernel rel-sparc/vmlinux \
    -append 'console=ttyS0' -serial stdio

This gives the following panic on the serial console:

[   14.759388] [drm] Found bochs VGA, ID 0xb0c5.
[   14.760018] [drm] Framebuffer size 16384 kB @ 0x1ff01000000, mmio @ 0x1ff02000000.
[   14.763370] [TTM] Zone  kernel: Available graphics memory: 252808 kiB
[   14.764240] [TTM] Initializing pool allocator
[   14.894178] Unable to handle kernel paging request at virtual address 000001ff01000000
[   14.894247] tsk->{mm,active_mm}->context = 0000000000000000
[   14.894308] tsk->{mm,active_mm}->pgd = fffff80000402000
[   14.894372]               \|/ ____ \|/
[   14.894372]               "@'/ .. \`@"
[   14.894372]               /_| \__/ |_\
[   14.894372]                  \__U_/
[   14.894435] swapper/0(1): Oops [#1]
[   14.895400] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.11.0-rc7+ #45
[   14.895634] task: fffff8001c097800 task.stack: fffff8001c09c000
[   14.895722] TSTATE: 0000000080001607 TPC: 00000000006f3c24 TNPC: 00000000006f3c30 Y: 00000000    Not tainted
[   14.896697] TPC: <sys_imageblit+0x1dc/0x438>
[   14.896916] g0: 0000000000000080 g1: fffff8001c360000 g2: 0000000000000000 g3: 0000000000000007
[   14.896976] g4: fffff8001c097800 g5: fffff8001ecea000 g6: fffff8001c09c000 g7: 000000000090dfc0
[   14.897036] o0: 0000000000000026 o1: 0000000000000001 o2: 000000000000000f o3: 000001ff01000000
[   14.897094] o4: 0000000000000007 o5: 0000000000000008 sp: fffff8001c09e0a1 ret_pc: 0000000000000001
[   14.897551] RPC: <0x1>
[   14.897715] l0: 0000000000000020 l1: 0000000000a43800 l2: 0000000000000080 l3: 0000000000000001
[   14.897773] l4: 0000000000b93800 l5: 0000000000000080 l6: 0000000000000030 l7: 0000000000000000
[   14.897830] i0: fffff8001c32c000 i1: fffff8001c360000 i2: 0000000000000000 i3: 0000000000000000
[   14.897887] i4: 0000000000000001 i5: 000001ff01000000 i6: fffff8001c09e151 i7: 00000000007381d4
[   14.897984] I7: <drm_fb_helper_sys_imageblit+0x14/0x34>
[   14.898166] Call Trace:
[   14.898442]  [00000000007381d4] drm_fb_helper_sys_imageblit+0x14/0x34
[   14.898561]  [00000000006e6c9c] soft_cursor+0x174/0x19c
[   14.898601]  [00000000006e6744] bit_cursor+0x45c/0x490
[   14.898641]  [00000000006e324c] fbcon_cursor+0x16c/0x17c
[   14.898685]  [0000000000710f60] hide_cursor+0x2c/0xa8
[   14.898724]  [00000000007120fc] redraw_screen+0xc4/0x208
[   14.898765]  [00000000006e2328] fbcon_prepare_logo+0x288/0x358
[   14.898803]  [00000000006e27cc] fbcon_init+0x3d4/0x448
[   14.898844]  [00000000007113e4] visual_init+0xa4/0x100
[   14.898884]  [0000000000712ba0] do_bind_con_driver+0x1c8/0x300
[   14.898925]  [000000000071304c] do_take_over_console+0x170/0x198
[   14.898965]  [00000000006e28c4] do_fbcon_takeover+0x84/0xe8
[   14.899017]  [00000000004747dc] notifier_call_chain+0x38/0x74
[   14.899061]  [0000000000474a5c] __blocking_notifier_call_chain+0x28/0x44
[   14.899104]  [00000000006ec214] register_framebuffer+0x2b8/0x2ec
[   14.899147]  [00000000007399f0] drm_fb_helper_initial_config+0x2d0/0x36c
[   14.899294] Disabling lock debugging due to kernel taint
[   14.899551] Caller[00000000007381d4]: drm_fb_helper_sys_imageblit+0x14/0x34
[   14.899656] Caller[00000000006e6c9c]: soft_cursor+0x174/0x19c
[   14.899696] Caller[00000000006e6744]: bit_cursor+0x45c/0x490
[   14.899735] Caller[00000000006e324c]: fbcon_cursor+0x16c/0x17c
[   14.899774] Caller[0000000000710f60]: hide_cursor+0x2c/0xa8
[   14.899812] Caller[00000000007120fc]: redraw_screen+0xc4/0x208
[   14.899852] Caller[00000000006e2328]: fbcon_prepare_logo+0x288/0x358
[   14.899891] Caller[00000000006e27cc]: fbcon_init+0x3d4/0x448
[   14.899930] Caller[00000000007113e4]: visual_init+0xa4/0x100
[   14.899970] Caller[0000000000712ba0]: do_bind_con_driver+0x1c8/0x300
[   14.900018] Caller[000000000071304c]: do_take_over_console+0x170/0x198
[   14.900061] Caller[00000000006e28c4]: do_fbcon_takeover+0x84/0xe8
[   14.900132] Caller[00000000004747dc]: notifier_call_chain+0x38/0x74
[   14.900218] Caller[0000000000474a5c]: __blocking_notifier_call_chain+0x28/0x44
[   14.900263] Caller[00000000006ec214]: register_framebuffer+0x2b8/0x2ec
[   14.900306] Caller[00000000007399f0]: drm_fb_helper_initial_config+0x2d0/0x36c
[   14.900351] Caller[0000000000761e60]: bochs_fbdev_init+0x6c/0xb0
[   14.900389] Caller[0000000000760b2c]: bochs_load+0x84/0xa8
[   14.900439] Caller[0000000000741c88]: drm_dev_register+0x114/0x1e8
[   14.900628] Caller[0000000000742a10]: drm_get_pci_dev+0xa8/0x118
[   14.900672] Caller[00000000006c8364]: pci_device_probe+0x70/0xdc
[   14.900713] Caller[0000000000768cf0]: driver_probe_device+0x148/0x2a4
[   14.900752] Caller[0000000000768ec4]: __driver_attach+0x78/0xa8
[   14.900790] Caller[00000000007674ec]: bus_for_each_dev+0x58/0x7c
[   14.900830] Caller[00000000007683bc]: bus_add_driver+0xd0/0x1fc
[   14.900869] Caller[0000000000769960]: driver_register+0xa8/0x100
[   14.900913] Caller[0000000000426cb0]: do_one_initcall+0x80/0x10c
[   14.900999] Caller[0000000000ad6bdc]: kernel_init_freeable+0x1a8/0x244
[   14.901037] Caller[00000000008be94c]: kernel_init+0x4/0xfc
[   14.901077] Caller[0000000000406064]: ret_from_fork+0x1c/0x2c

Looking at this in more detail, the panic occurs the first time the
framebuffer memory is touched: sys_imageblit(), called via
drm_fb_helper_sys_imageblit(), dereferences a pointer in order to write
to the mapped framebuffer. The faulting address 000001ff01000000 is
exactly the framebuffer base reported by the driver (0x1ff01000000).

The bochs_drm driver itself uses a standard approach to map the
framebuffer like this in drivers/gpu/drm/bochs/bochs_hw.c (shortened for
clarity):

  addr = pci_resource_start(pdev, 0);
  size = pci_resource_len(pdev, 0);
  ...
  ...
  bochs->fb_map = ioremap(addr, size);
  if (bochs->fb_map == NULL) {
      DRM_ERROR("Cannot map framebuffer\n");
      return -ENOMEM;
  }

The issue with SPARC64 systems is that the address returned by ioremap()
is actually a physical address as per this comment in
arch/sparc/include/asm/io_64.h:

  /* On sparc64 we have the whole physical IO address space accessible
   * using physically addressed loads and stores, so this does nothing.
   */
  static inline void __iomem *ioremap(unsigned long offset, unsigned long size)
  {
      return (void __iomem *)offset;
  }

This means that unless accesses to the mapped framebuffer are done using
the standard readb/writeb/readw/writew/readl/writel functions, which
force physical accesses that bypass the MMU, the physical address gets
treated as a virtual one and we end up accessing an invalid, unmapped
address.

And it's evident that the code in drivers/video/fbdev/core/sysimgblt.c
doesn't use these accessor functions at all but dereferences the mapped
framebuffer pointer directly, hence causing the panic.
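
To make the distinction concrete, here is a small userspace model of
the situation. It is NOT kernel code and every name in it (model_ioremap,
model_writeb, model_readb, model_blit, MODEL_FB_BASE, MODEL_FB_SIZE) is
made up for illustration. It only assumes what is described above: the
value returned by ioremap() on sparc64 is an opaque cookie equal to the
physical address, and only the accessor knows how to reach the device
behind it.

```c
/* Userspace model only, NOT kernel code. Every name below is hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_FB_BASE 0x1ff01000000ULL /* framebuffer address from the log */
#define MODEL_FB_SIZE 64u              /* tiny stand-in for the real BAR */

static uint8_t model_vram[MODEL_FB_SIZE]; /* plays the role of device memory */

/* Like sparc64 ioremap(): hand the physical address back unchanged. */
static uint64_t model_ioremap(uint64_t phys)
{
    return phys;
}

/* Like writeb(): the accessor, not the caller, reaches the device. */
static void model_writeb(uint8_t val, uint64_t cookie)
{
    assert(cookie >= MODEL_FB_BASE && cookie - MODEL_FB_BASE < MODEL_FB_SIZE);
    model_vram[cookie - MODEL_FB_BASE] = val;
}

/* Like readb(): same translation, in the other direction. */
static uint8_t model_readb(uint64_t cookie)
{
    assert(cookie >= MODEL_FB_BASE && cookie - MODEL_FB_BASE < MODEL_FB_SIZE);
    return model_vram[cookie - MODEL_FB_BASE];
}

/* A blit that goes through the accessor for every byte it stores. */
static void model_blit(uint64_t dst, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        model_writeb(src[i], dst + i);
}
```

In this model, model_blit(model_ioremap(MODEL_FB_BASE), pixels, n) works
because each store is translated by the accessor. Casting the cookie to
a uint8_t * and storing through it instead would correspond to the
direct dereference in sysimgblt.c: the cookie is not an address the
program has mapped, so that store would fault.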

So I can see two potential issues here:

1) sys_imageblit() shouldn't be accessing ioremap()ped memory by
dereferencing a pointer

2) sys_imageblit() requires a virtual address while
drm_fb_helper_sys_imageblit() incorrectly assumes that any ioremap()ped
address is always virtual and passes it directly through

Note for LKML people: the list is high volume for me and so I'm not
subscribed, so please CC me directly on any reply.

Many thanks,

Mark.