Linux 4.8-rc1 Cirrus QEMU crashes on boot.

Linus Torvalds torvalds at linux-foundation.org
Mon Aug 8 21:19:23 UTC 2016


On Mon, Aug 8, 2016 at 1:24 PM, Eric W. Biederman <ebiederm at xmission.com> wrote:
>
> Booting up v4.8-rc1 in qemu today I ran I ran into this beautiful oops.
>
> I am just about to head out the door on vacation until the end of the
> month so hopefully I am tossing out enough information to the right
> people.
>
> I have confirmed that disabling CONFIG_DRM_CIRRUS_QEMU avoids this
> crash.
>
> [    0.720167] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
> [    0.721348] IP: [<ffffffff8d5ef9ac>] drm_pick_crtcs+0xfc/0x270
> [    0.722249] PGD 0
> [    0.722659] Oops: 0000 [#1] SMP
> [    0.723141] Modules linked in:
> [    0.723643] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc1+ #116
> [    0.724602] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> [    0.725450] task: ffff8800bbf28000 task.stack: ffff8800bbf30000
> [    0.726327] RIP: 0010:[<ffffffff8d5ef9ac>]  [<ffffffff8d5ef9ac>] drm_pick_crtcs+0xfc/0x270
> [    0.727648] RSP: 0000:ffff8800bbf33978  EFLAGS: 00010217
> [    0.728444] RAX: ffffffff8d623a20 RBX: 0000000000000000 RCX: ffff8800baf3f860
> [    0.729498] RDX: 0000000000000000 RSI: 0000000000001000 RDI: ffff8800baf3f800
> [    0.730626] RBP: ffff8800bbf339f8 R08: 0000000000000000 R09: 0000000000000001
> [    0.731704] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8800bbaf3af0
> [    0.732827] R13: ffff8800bbaf3af8 R14: 0000000000000000 R15: ffff8800baf3fc00
> [    0.733863] FS:  0000000000000000(0000) GS:ffff8800bf800000(0000) knlGS:0000000000000000
> [    0.735106] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.735964] CR2: 0000000000000018 CR3: 000000003b219000 CR4: 00000000000006f0
> [    0.737021] Stack:
> [    0.737342]  ffff8800bbf33998 0000000000000246 ffff8800bbaf3b18 ffff8800bbaf3b00
> [    0.738602]  0000100000000001 0000000000000000 ffff8800bbaf3b00 ffff8800baf3f800
> [    0.739807]  0000000300001000 ffff8800bbaf3b18 ffff8800bbf339e8 ffff8800baf3fc00
> [    0.740993] Call Trace:
> [    0.741393]  [<ffffffff8d5efedd>] drm_setup_crtcs+0x3bd/0xb50
> [    0.742252]  [<ffffffff8d0edaa1>] ? mark_held_locks+0x71/0x90
> [    0.743176]  [<ffffffff8db0db6b>] ? __mutex_unlock_slowpath+0xeb/0x1c0
> [    0.744138]  [<ffffffff8d5f08c1>] drm_fb_helper_initial_config+0x81/0x3c0
> [    0.745166]  [<ffffffff8d6110b4>] ? drm_modeset_unlock_all+0x54/0x80
> [    0.746136]  [<ffffffff8d624fe0>] cirrus_fbdev_init+0xa0/0xb0
> [    0.747033]  [<ffffffff8d6245fb>] cirrus_modeset_init+0x18b/0x1e0
> .. snip snip ..
> [    0.771023] Code: e8 ba e1 ff ff 48 8b 55 b8 48 83 f8 01 83 5d c4 ff 48 8b 7d b8 48 8b 82 98 02 00 00 49 8b 57 08 48 8b 40 10 48 8b 92 00 0a 00 00 <48> 83 7a 18 00 74 09 48 85 c0 0f 84 53 01 00 00 ff d0 49 89 c3

This decodes to

   0: e8 ba e1 ff ff       callq  ...
   5: 48 8b 55 b8           mov    -0x48(%rbp),%rdx
   9: 48 83 f8 01           cmp    $0x1,%rax
   d: 83 5d c4 ff           sbbl   $0xffffffff,-0x3c(%rbp)
  11: 48 8b 7d b8           mov    -0x48(%rbp),%rdi
  15: 48 8b 82 98 02 00 00 mov    0x298(%rdx),%rax
  1c: 49 8b 57 08           mov    0x8(%r15),%rdx
  20: 48 8b 40 10           mov    0x10(%rax),%rax
  24: 48 8b 92 00 0a 00 00 mov    0xa00(%rdx),%rdx
  2b:* 48 83 7a 18 00       cmpq   $0x0,0x18(%rdx) <-- trapping instruction
  30: 74 09                 je     0x3b
  32: 48 85 c0             test   %rax,%rax
  35: 0f 84 53 01 00 00     je     0x18e
  3b: ff d0                 callq  *%rax
  3d: 49 89 c3             mov    %rax,%r11


and trying to find matching code I think the oops is from this:

        if (fb_helper->dev->mode_config.funcs->atomic_commit &&
            !connector_funcs->best_encoder)
                encoder = drm_atomic_helper_best_encoder(connector);
        else
                encoder = connector_funcs->best_encoder(connector);

in particular, the trapping instruction is the load of "atomic_commit".

(The "callq *%rax" seems to be the "connector_funcs->best_encoder()" call)

So I *think* we have "fb_helper->dev->mode_config.funcs" being NULL.

But I might have gotten the code matching wrong. I don't have the same
compiler as Eric, so even when using his config my code generation
doesn't really match. But there is only one indirect call in that
function, which is that ->best_encoder() call, so it does seem to
match up.

Adding Boris Brezillon and Daniel Vetter to the cc, since assuming I'm
right, the main suspect is commit c61b93fe51b1 ("drm/atomic: Fix
remaining places where !funcs->best_encoder is valid").

Boris? Daniel?

              Linus


More information about the dri-devel mailing list