[Intel-gfx] 5.9-rc1: graphics regression moved from -next to mainline

Dave Airlie airlied at gmail.com
Wed Aug 19 01:12:50 UTC 2020


On Wed, 19 Aug 2020 at 10:38, Linus Torvalds
<torvalds at linux-foundation.org> wrote:
>
> Ping on this?
>
> The code disassembles to
>
>   24: 8b 85 d0 fd ff ff    mov    -0x230(%ebp),%eax
>   2a:* c7 03 01 00 40 10    movl   $0x10400001,(%ebx) <-- trapping instruction
>   30: 89 43 04              mov    %eax,0x4(%ebx)
>   33: 8b 85 b4 fd ff ff    mov    -0x24c(%ebp),%eax
>   39: 89 43 08              mov    %eax,0x8(%ebx)
>   3c: e9                    jmp ...
>
> which looks like it is one of the cases in __reloc_entry_gpu(). I *think*
> it's this one:
>
>         } else if (gen >= 3 &&
>                    !(IS_I915G(eb->i915) || IS_I915GM(eb->i915))) {
>                 *batch++ = MI_STORE_DWORD_IMM | MI_MEM_VIRTUAL;
>                 *batch++ = addr;
>                 *batch++ = target_addr;
>
> where that "batch" pointer is 0xf8601000, so it looks like it just
> overflowed into the next page that isn't there.
>
> The cleaned-up call trace is
>
>   drm_ioctl+0x1f4/0x38b ->
>     drm_ioctl_kernel+0x87/0xd0 ->
>       i915_gem_execbuffer2_ioctl+0xdd/0x360 ->
>         i915_gem_do_execbuffer+0xaab/0x2780 ->
>           eb_relocate_vma
>
> but there's a lot of inlining going on, so..
>
> The obvious suspect is commit 9e0f9464e2ab ("drm/i915/gem: Async GPU
> relocations only") but that's going purely by "that seems to be the
> main relocation change this merge window".

I think there's been some discussion about reverting that change for
other reasons, but it's quite likely the culprit.

Maybe we can push for a revert sooner (cc'ing more of the i915 team).

Dave.

