<html>
    <head>
      <base href="https://bugs.freedesktop.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - [amdgpu][CIK] cp queue preemption time out, BUG: kernel NULL pointer dereference, address: 0000000000000038"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=111021">111021</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>[amdgpu][CIK] cp queue preemption time out, BUG: kernel NULL pointer dereference, address: 0000000000000038
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>DRI
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>x86-64 (AMD64)
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux (All)
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>medium
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>DRM/AMDgpu
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>dri-devel@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>erhard_f@mailbox.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Created <span class=""><a href="attachment.cgi?id=144678" name="attach_144678" title="kernel .dmesg (5.2-rc6)">attachment 144678</a> <a href="attachment.cgi?id=144678&action=edit" title="kernel .dmesg (5.2-rc6)">[details]</a></span>
kernel .dmesg (5.2-rc6)

[...]
[  440.685185] cp queue preemption time out
[  440.685338] Resetting wave fronts (nocpsch) on dev 00000000feee3825
[  440.685426] BUG: kernel NULL pointer dereference, address: 0000000000000038
[  440.685432] #PF: supervisor read access in kernel mode
[  440.685436] #PF: error_code(0x0000) - not-present page
[  440.685440] PGD 0 P4D 0 
[  440.685448] Oops: 0000 [#1] SMP NOPTI
[  440.685455] CPU: 3 PID: 1026 Comm: xmr-stak Not tainted 5.2.0-rc6 #1
[  440.685459] Hardware name: System manufacturer System Product Name/M5A78L-M LX3, BIOS 1401    05/05/2016
[  440.685610] RIP: 0010:amdgpu_ib_schedule+0x4b/0x520 [amdgpu]
[  440.685616] Code: 89 f5 49 89 ff 48 89 54 24 08 0f b6 87 38 04 00 00 48 85 c9 0f 84 5d 03 00 00 48 8b 91 b0 00 00 00 48 89 54 24 10 48 8b 51 10 <48> 8b 52 38 48 89 14 24 84 c0 0f 84 09 e2 17 00 48 83 7c 24 10 00
[  440.685621] RSP: 0018:ffffac368c2a7ad0 EFLAGS: 00010286
[  440.685626] RAX: 0000000000000001 RBX: ffff97d66533dc00 RCX: ffff97d66533dc00
[  440.685630] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff97d685fe7d48
[  440.685634] RBP: 0000000000000001 R08: ffffac368c2a7b48 R09: 0000000000000001
[  440.685638] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000007
[  440.685642] R13: 0000000000ffd000 R14: ffff97d685fe0000 R15: ffff97d685fe7d48
[  440.685647] FS:  00007f2115109700(0000) GS:ffff97d6a6ac0000(0000) knlGS:0000000000000000
[  440.685651] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  440.685655] CR2: 0000000000000038 CR3: 00000003e4236000 CR4: 00000000000406e0
[  440.685659] Call Trace:
[  440.685669]  ? rcu_read_lock_sched_held+0x50/0x60
[  440.685807]  amdgpu_amdkfd_submit_ib+0xb6/0x170 [amdgpu]
[  440.685949]  deallocate_vmid.isra.12+0xe4/0xf0 [amdgpu]
[  440.686091]  destroy_queue_nocpsch_locked+0x176/0x190 [amdgpu]
[  440.686233]  process_termination_nocpsch+0x5e/0x130 [amdgpu]
[  440.686373]  kfd_process_dequeue_from_all_devices+0x36/0x50 [amdgpu]
[  440.686512]  kfd_process_notifier_release+0xf4/0x180 [amdgpu]
[  440.686519]  __mmu_notifier_release+0x65/0x110
[  440.686527]  exit_mmap+0x3b/0x170
[  440.686534]  mmput+0x45/0x110
[  440.686539]  do_exit+0x27d/0xb90
[  440.686546]  ? find_held_lock+0x2d/0x90
[  440.686551]  ? get_signal+0xcc/0xaa0
[  440.686556]  do_group_exit+0x42/0xb0
[  440.686561]  get_signal+0x119/0xaa0
[  440.686568]  do_signal+0x3e/0x620
[  440.686574]  ? find_held_lock+0x2d/0x90
[  440.686580]  exit_to_usermode_loop+0x4b/0xa0
[  440.686585]  do_syscall_64+0x149/0x1a0
[  440.686591]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  440.686596] RIP: 0033:0x7f212b976f6c
[  440.686604] Code: Bad RIP value.
[  440.686608] RSP: 002b:00007f2115108d30 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[  440.686614] RAX: fffffffffffffe00 RBX: 00007f211d838c48 RCX: 00007f212b976f6c
[  440.686618] RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f211d838c70
[  440.686622] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f2115109700
[  440.686626] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000010
[  440.686630] R13: 00007f211d838c20 R14: 0000000000000000 R15: 00007f211d838c70
[  440.686634] Modules linked in: fuse sha256_ssse3 sha256_generic cfg80211
rfkill dm_crypt nhpoly1305_sse2 nhpoly1305 chacha_x86_64 chacha_generic
adiantum poly1305_generic algif_skcipher af_alg ext4 crc16 mbcache jbd2
input_leds led_class joydev hid_generic usbhid hid crct10dif_pclmul
crc32_generic crc32_pclmul ghash_generic gf128mul gcm xts ctr dm_mod cbc amdgpu
ecb evdev gpu_sched ohci_pci i2c_algo_bit ttm snd_hda_codec_realtek
snd_hda_codec_generic snd_hda_codec_hdmi drm_kms_helper ehci_pci ohci_hcd
cfbfillrect syscopyarea snd_hda_intel cfbimgblt k10temp sysfillrect ehci_hcd
aesni_intel sysimgblt fb_sys_fops snd_hda_codec cfbcopyarea fb snd_hwdep
usbcore aes_x86_64 snd_hda_core fam15h_power hwmon i2c_piix4 usb_common font
glue_helper crypto_simd sr_mod snd_pcm cryptd fbdev cdrom button snd_timer drm
acpi_cpufreq snd alx drm_panel_orientation_quirks soundcore processor backlight
mdio lzo nfsd auth_rpcgss lockd grace zstd sunrpc sg zram zsmalloc
[  440.686714] CR2: 0000000000000038
[  440.686720] ---[ end trace 39cfe5e575b273f7 ]---
[  440.686847] RIP: 0010:amdgpu_ib_schedule+0x4b/0x520 [amdgpu]
[  440.686852] Code: 89 f5 49 89 ff 48 89 54 24 08 0f b6 87 38 04 00 00 48 85 c9 0f 84 5d 03 00 00 48 8b 91 b0 00 00 00 48 89 54 24 10 48 8b 51 10 <48> 8b 52 38 48 89 14 24 84 c0 0f 84 09 e2 17 00 48 83 7c 24 10 00
[  440.686857] RSP: 0018:ffffac368c2a7ad0 EFLAGS: 00010286
[  440.686862] RAX: 0000000000000001 RBX: ffff97d66533dc00 RCX: ffff97d66533dc00
[  440.686866] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff97d685fe7d48
[  440.686869] RBP: 0000000000000001 R08: ffffac368c2a7b48 R09: 0000000000000001
[  440.686873] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000007
[  440.686877] R13: 0000000000ffd000 R14: ffff97d685fe0000 R15: ffff97d685fe7d48
[  440.686882] FS:  00007f2115109700(0000) GS:ffff97d6a6ac0000(0000) knlGS:0000000000000000
[  440.686887] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  440.686890] CR2: 00007f212b976f42 CR3: 00000003e4236000 CR4: 00000000000406e0
[  440.686894] Fixing recursive fault but reboot is needed!

This happens every time xmr-stak 2.10.5 (with ROCm 2.5) tries to compile
shaders for this R9 290X. An ~/.AMD archive is generated, but the compilation
process never finishes. When I close the shell with xmr-stak still running
(CTRL-C does not stop xmr-stak), I get this kernel BUG. I used a 5.2-rc6 debug
kernel, but it happens on 5.1.15 too.

The card is a Sapphire Radeon R9 290X Tri-X OC (11226-18-20G). Additional info
about the system:

Machine:   Type: Desktop Mobo: ASUSTeK model: M5A78L-M LX3 v: Rev X.0x serial:
<root required> BIOS: American Megatrends 
           v: 1401 date: 05/05/2016 
CPU:       6-Core: AMD FX-6300 type: MCP speed: 3817 MHz min/max: 1400/3800 MHz 
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Hawaii XT / Grenada XT
[Radeon R9 290X/390X] driver: amdgpu v: kernel 
           Display: x11 server: X.Org 1.20.4 driver: amdgpu,ati unloaded:
modesetting,radeon resolution: 1920x1080~60Hz 
           OpenGL: renderer: AMD Radeon R9 200 Series (HAWAII DRM 3.30.0
5.1.15-gentoo LLVM 8.0.0) v: 4.5 Mesa 19.0.8</pre>
        </div>
      </p>


    </body>
</html>