Losing completion interrupts with amdgpu on rx460

Christian König christian.koenig at amd.com
Tue Dec 27 20:51:37 UTC 2016


It's a well known problem that the completion interrupts are notorious 
unreliable.

That's why we have a fallback timer in amdgpu_fence.c which kicks an 
extra hardware probe after a certain timeout. Please double check that 
this one is working as expected.

Another possibility is that the memory where the fence is written 
doesn't has the proper attributes (e.g. USWC vs. cached vs. uncached).

Regards,
Christian.

Am 26.12.2016 um 02:54 schrieb Matthew Macy:
> I'm running an rx460 using the amdgpu driver from Linux 4.8 with Mesa 13/LLVM 3.9 and Xorg 1.18 on FreeBSD. It seems to largely perform pretty well.
>
> However, ever since I got Mesa working I will inevitably end up losing completion interrupts after X has been running for a brief period. I can bring the problem on more quickly by running glxgears with vblank_mode=0. It's a safe bet that the problem with the linuxkpi. However, since this bug is manifesting itself in a very hardware specific way I'm coming here for advice on what I can do to dump device state to better understand why it ceases to fire interrupts.
>
> I enabled FENCE_TRACE and added some logging to fence creation and fence_default_wait as well. The last interrupt in this particular excerpt is:
>
> "Dec 22 22:36:22 daleks kernel: fence at 210850477167 f 0#233745: signaled from irq context"
>
> amdgpu_cs_wait goes on to sleep on 411#116530 and never wake up. Any guidance would be much appreciated. Thanks in advance.
>
>
>
>
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] pid=100793, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl]
> Dec 22 22:36:22 daleks kernel: &fence->scheduled at 210850212762 f 86#116944: signaled from irq context
> Dec 22 22:36:22 daleks kernel: pid=100699, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Dec 22 22:36:22 daleks kernel: [drm:amdgpu_ih_process] [drm:drm_ioctl] pid=100793, dev=0xe200, auth=1, AMDGPU_CS
> Dec 22 22:36:22 daleks kernel: amdgpu_ih_process: rptr 864, wptr 880
> Dec 22 22:36:22 daleks kernel: [drm:gfx_v8_0_eop_irq] IH: CP EOP
> Dec 22 22:36:22 daleks kernel: &fence->finished at 210850251259 f 411#116528: signaled from irq context
> Dec 22 22:36:22 daleks kernel: fence at 210850253222 f 0#233742: signaled from irq context
> Dec 22 22:36:22 daleks kernel: [drm:amdgpu_ih_process] amdgpu_ih_process: rptr 880, wptr 880
> Dec 22 22:36:22 daleks kernel: [drm:amdgpu_ih_process] created fence 410#116529 411#116529 @210850271550
> Dec 22 22:36:22 daleks kernel: amdgpu_ih_process: rptr 880, wptr 896
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] [drm:gfx_v8_0_eop_irq] pid=100793, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Dec 22 22:36:22 daleks kernel: IH: CP EOP
> Dec 22 22:36:22 daleks kernel: &fence->finished at 210850308909 f 87#116944: signaled from irq context
> Dec 22 22:36:22 daleks kernel: fence at 210850310670 f 0#233743: signaled from irq context
> Dec 22 22:36:22 daleks kernel: [drm:amdgpu_ih_process] amdgpu_ih_process: rptr 896, wptr 896
> Dec 22 22:36:22 daleks kernel: &fence->scheduled at 210850325151 f 410#116529: signaled from irq context
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] pid=100699, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] pid=100699, dev=0xe200, auth=1, AMDGPU_CS
> Dec 22 22:36:22 daleks kernel: created fence 86#116945 87#116945 @210850375328
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] pid=100793, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl]
> Dec 22 22:36:22 daleks kernel: &fence->scheduled at 210850389385 f 86#116945: signaled from irq context
> Dec 22 22:36:22 daleks kernel: [drm:amdgpu_ih_process] pid=100699, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] amdgpu_ih_process: rptr 896, wptr 912
> Dec 22 22:36:22 daleks kernel: [drm:gfx_v8_0_eop_irq] IH: CP EOP
> Dec 22 22:36:22 daleks kernel: &fence->finished at 210850416620 f 411#116529: signaled from irq context
> Dec 22 22:36:22 daleks kernel: fence at 210850418382 f 0#233744: signaled from irq context
> Dec 22 22:36:22 daleks kernel: pid=100793, dev=0xe200, auth=1, AMDGPU_CS
> Dec 22 22:36:22 daleks kernel: [drm:amdgpu_ih_process] created fence 410#116530 411#116530 @210850440720
> Dec 22 22:36:22 daleks kernel: amdgpu_ih_process: rptr 912, wptr 912
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] [drm:amdgpu_ih_process] amdgpu_ih_process: rptr 912, wptr 928
> Dec 22 22:36:22 daleks kernel: [drm:gfx_v8_0_eop_irq] IH: CP EOP
> Dec 22 22:36:22 daleks kernel: &fence->finished at 210850475397 f 87#116945: signaled from irq context
> Dec 22 22:36:22 daleks kernel: fence at 210850477167 f 0#233745: signaled from irq context
> Dec 22 22:36:22 daleks kernel: pid=100793, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Dec 22 22:36:22 daleks kernel: [drm:amdgpu_ih_process] amdgpu_ih_process: rptr 928, wptr 928
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] pid=100699, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] pid=100699, dev=0xe200, auth=1, AMDGPU_CS
> Dec 22 22:36:22 daleks kernel: created fence 86#116946 87#116946 @210850557790
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] pid=100793, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] pid=100699, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] pid=100793, dev=0xe200, auth=1, AMDGPU_CS
> Dec 22 22:36:22 daleks kernel: created fence 410#116531 411#116531 @210850614023
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] pid=100793, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] pid=100699, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] pid=100699, dev=0xe200, auth=1, AMDGPU_CS
> Dec 22 22:36:22 daleks kernel: created fence 86#116947 87#116947 @210850719230
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] pid=100793, dev=0xe200, auth=1, AMDGPU_WAIT_CS
> Dec 22 22:36:22 daleks kernel: [drm:drm_ioctl] amdgpu_cs_wait on 411#116530
> Dec 22 22:36:22 daleks kernel: pid=100699, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Dec 22 22:36:22 daleks kernel: 411#116530 sleeping tid 100793 at 210850747487
>
>
> -M
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx




More information about the amd-gfx mailing list