Regression with kernel 4.20 on armhf

Alex Deucher alexdeucher at gmail.com
Thu Jan 3 14:12:14 UTC 2019


Does this patch help by any chance?

https://cgit.freedesktop.org/~agd5f/linux/commit/?h=amd-staging-drm-next&id=5e01c09ce3b7263d88873105f21a82eda904664b

Alex

On Thu, Jan 3, 2019 at 7:14 AM Luís Mendes <luis.p.mendes at gmail.com> wrote:
>
> Hi Christian, Alex,
>
> I've set the kernel command line with drm.debug=0xf, and I see what
> could be a race condition that triggers the failure. From what I
> see, the critical path is well after the ring tests. This happens on
> ARM, but it may also be what is affecting my TYAN S7002 and S7025, as
> the failure symptom seems similar, except that it fails every time on
> the TYANs, while on an ASRock Rack EP2C602 with Xeon E5 v2 it works
> fine.
>
> Below follow the two log excerpts, the first from a working
> initialization attempt, and the second from a failed initialization
> attempt. Both attempts were made with vanilla kernel 4.20.0 on the
> same armhf system. Full dmesg logs attached. Please ignore the EDID
> errors, as I'm having a problem with this particular CROWN TV. The
> EDID gets overwritten at every boot when connected to any Radeon RX
> card that I have tried, while with a Radeon R7 240 the EDID is not
> corrupted on boot, but that's another story.
>
> Meanwhile I will try to find the actual race condition. It is
> noticeable that for some reason the kernel thread
> [drm:amdgpu_ih_process [amdgpu]] doesn't receive updates due to the
> gpu hang and only one EOP irq is received on the bad boot attempt,
> while on the good attempt 3 EOP irqs are triggered.
>
> Good attempt (critical log excerpt from kern_good.log):
> Jan  3 11:28:03 picolo kernel: [   39.845747] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16032, wptr 16048
> Jan  3 11:28:03 picolo kernel: [   39.845987]
> [drm:drm_calc_vbltimestamp_from_scanoutpos [drm]] crtc 0: Noisy
> timestamp 26 us > 20 us [3 reps].
> Jan  3 11:28:03 picolo kernel: [   39.850430] [drm:drm_ioctl [drm]]
> pid=627, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Jan  3 11:28:03 picolo kernel: [   39.850489] [drm:drm_ioctl [drm]]
> pid=627, dev=0xe200, auth=1, AMDGPU_CS
> Jan  3 11:28:03 picolo kernel: [   39.850697] [drm:drm_ioctl [drm]]
> pid=627, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Jan  3 11:28:03 picolo kernel: [   39.850943] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16048, wptr 16080
> Jan  3 11:28:03 picolo kernel: [   39.850973] [drm:drm_ioctl [drm]]
> pid=627, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Jan  3 11:28:03 picolo kernel: [   39.851133]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:28:03 picolo kernel: [   39.851159] [drm:drm_ioctl [drm]]
> pid=627, dev=0xe200, auth=1, AMDGPU_CS
> Jan  3 11:28:03 picolo kernel: [   39.851333]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:28:03 picolo kernel: [   39.851360] [drm:drm_ioctl [drm]]
> pid=627, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Jan  3 11:28:03 picolo kernel: [   39.851513] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16080, wptr 16096
> Jan  3 11:28:03 picolo kernel: [   39.851657]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:28:03 picolo kernel: [   39.851810] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16096, wptr 16096
> Jan  3 11:28:03 picolo kernel: [   39.851950] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16096, wptr 16128
> Jan  3 11:28:03 picolo kernel: [   39.852091]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:28:03 picolo kernel: [   39.852239]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:28:03 picolo kernel: [   39.852265] [drm:drm_ioctl [drm]]
> pid=605, dev=0xe200, auth=1, AMDGPU_WAIT_CS
> Jan  3 11:28:03 picolo kernel: [   39.852411] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16128, wptr 16128
> Jan  3 11:28:03 picolo kernel: [   39.852605] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16128, wptr 16144
> Jan  3 11:28:03 picolo kernel: [   39.852754]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:28:03 picolo kernel: [   39.852905] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16144, wptr 16160
> Jan  3 11:28:03 picolo kernel: [   39.853049]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:28:03 picolo kernel: [   39.853210] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16160, wptr 16160
> Jan  3 11:28:03 picolo kernel: [   39.853418] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16160, wptr 16176
> Jan  3 11:28:03 picolo kernel: [   39.853582] [drm:gfx_v8_0_eop_irq
> [amdgpu]] IH: CP EOP
> Jan  3 11:28:03 picolo kernel: [   39.853752] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16176, wptr 16208
> Jan  3 11:28:03 picolo kernel: [   39.853901] [drm:gfx_v8_0_eop_irq
> [amdgpu]] IH: CP EOP
> Jan  3 11:28:03 picolo kernel: [   39.854044] [drm:gfx_v8_0_eop_irq
> [amdgpu]] IH: CP EOP
> Jan  3 11:28:03 picolo kernel: [   39.854205] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16208, wptr 16208
> Jan  3 11:28:03 picolo kernel: [   39.857057] [drm:drm_ioctl [drm]]
> pid=605, dev=0xe200, auth=1, DRM_IOCTL_MODE_SETCRTC
> Jan  3 11:28:03 picolo kernel: [   39.857089] [drm:drm_mode_setcrtc
> [drm]] [CRTC:45:crtc-1]
> Jan  3 11:28:03 picolo kernel: [   39.857341]
> [drm:dm_plane_helper_prepare_fb [amdgpu]] No FB bound
> Jan  3 11:28:03 picolo kernel: [   39.857508]
> [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] amdgpu_crtc id:1
> crtc_state_flags: enable:0, active:0, planes_changed:0,
> mode_changed:0,active_changed:0,connectors_changed:0
> Jan  3 11:28:03 picolo kernel: [   39.857559] [drm:drm_ioctl [drm]]
> pid=605, dev=0xe200, auth=1, DRM_IOCTL_MODE_SETCRTC
> Jan  3 11:28:03 picolo kernel: [   39.857587] [drm:drm_mode_setcrtc
> [drm]] [CRTC:47:crtc-2]
> Jan  3 11:28:03 picolo kernel: [   39.857769]
> [drm:dm_plane_helper_prepare_fb [amdgpu]] No FB bound
> Jan  3 11:28:03 picolo kernel: [   39.857944]
> [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] amdgpu_crtc id:2
> crtc_state_flags: enable:0, active:0, planes_changed:0,
> mode_changed:0,active_changed:0,connectors_changed:0
> Jan  3 11:28:03 picolo kernel: [   39.857992] [drm:drm_ioctl [drm]]
> pid=605, dev=0xe200, auth=1, DRM_IOCTL_MODE_SETCRTC
> Jan  3 11:28:03 picolo kernel: [   39.858020] [drm:drm_mode_setcrtc
> [drm]] [CRTC:49:crtc-3]
>
> BAD attempt (critical log excerpt from kern_bad.log):
> Jan  3 11:39:23 picolo kernel: [   39.599313] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14240, wptr 14256
> Jan  3 11:39:23 picolo kernel: [   39.599496]
> [drm:drm_calc_vbltimestamp_from_scanoutpos [drm]] crtc 0: Noisy
> timestamp 26 us > 20 us [3 reps].
> Jan  3 11:39:23 picolo kernel: [   39.599599] [drm:drm_ioctl [drm]]
> pid=663, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Jan  3 11:39:23 picolo kernel: [   39.599640] [drm:drm_ioctl [drm]]
> pid=663, dev=0xe200, auth=1, AMDGPU_CS
> Jan  3 11:39:23 picolo kernel: [   39.599992] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14256, wptr 14272
> Jan  3 11:39:23 picolo kernel: [   39.600142]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:39:23 picolo kernel: [   39.600297] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14272, wptr 14304
> Jan  3 11:39:23 picolo kernel: [   39.600439]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:39:23 picolo kernel: [   39.600580]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:39:23 picolo kernel: [   39.600725] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14304, wptr 14304
> Jan  3 11:39:23 picolo kernel: [   39.600795] [drm:drm_ioctl [drm]]
> pid=663, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Jan  3 11:39:23 picolo kernel: [   39.600846] [drm:drm_ioctl [drm]]
> pid=663, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Jan  3 11:39:23 picolo kernel: [   39.600881] [drm:drm_ioctl [drm]]
> pid=663, dev=0xe200, auth=1, AMDGPU_CS
> Jan  3 11:39:23 picolo kernel: [   39.601019] [drm:drm_ioctl [drm]]
> pid=663, dev=0xe200, auth=1, AMDGPU_BO_LIST
> Jan  3 11:39:23 picolo kernel: [   39.601074] [drm:drm_ioctl [drm]]
> pid=630, dev=0xe200, auth=1, AMDGPU_WAIT_CS
> Jan  3 11:39:23 picolo kernel: [   39.601269] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14304, wptr 14320
> Jan  3 11:39:23 picolo kernel: [   39.601416]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:39:23 picolo kernel: [   39.601569] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14320, wptr 14384
> Jan  3 11:39:23 picolo kernel: [   39.601595] [drm:drm_ioctl [drm]]
> pid=630, dev=0xe200, auth=1, AMDGPU_WAIT_CS
> Jan  3 11:39:23 picolo kernel: [   39.601738]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:39:23 picolo kernel: [   39.601880]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:39:23 picolo kernel: [   39.602029]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:39:23 picolo kernel: [   39.602171]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:39:23 picolo kernel: [   39.602313] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14384, wptr 14384
> Jan  3 11:39:23 picolo kernel: [   39.602500] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14384, wptr 14400
> Jan  3 11:39:23 picolo kernel: [   39.602649]
> [drm:sdma_v3_0_process_trap_irq [amdgpu]] IH: SDMA trap
> Jan  3 11:39:23 picolo kernel: [   39.602887] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14400, wptr 14416
> Jan  3 11:39:23 picolo kernel: [   39.603054] [drm:gfx_v8_0_eop_irq
> [amdgpu]] IH: CP EOP
> Jan  3 11:39:23 picolo kernel: [   39.615864] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14416, wptr 14432
> Jan  3 11:39:23 picolo kernel: [   39.632542] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14432, wptr 14448
> Jan  3 11:39:23 picolo kernel: [   39.649264] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14448, wptr 14464
> Jan  3 11:39:23 picolo kernel: [   39.665943] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14464, wptr 14480
> Jan  3 11:39:23 picolo kernel: [   39.682610] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14480, wptr 14496
> Jan  3 11:39:23 picolo kernel: [   39.699285] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14496, wptr 14512
> Jan  3 11:39:23 picolo kernel: [   39.715955] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14512, wptr 14528
> Jan  3 11:39:23 picolo kernel: [   39.732629] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14528, wptr 14544
> Jan  3 11:39:23 picolo kernel: [   39.749313] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14544, wptr 14560
> Jan  3 11:39:23 picolo kernel: [   39.765995] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14560, wptr 14576
> Jan  3 11:39:23 picolo kernel: [   39.782667] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14576, wptr 14592
> Jan  3 11:39:23 picolo kernel: [   39.799363] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14592, wptr 14608
> Jan  3 11:39:23 picolo kernel: [   39.816043] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14608, wptr 14624
> Jan  3 11:39:23 picolo kernel: [   39.832734] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14624, wptr 14640
> Jan  3 11:39:23 picolo kernel: [   39.849426] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14640, wptr 14656
> Jan  3 11:39:23 picolo kernel: [   39.866081] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14656, wptr 14672
> Jan  3 11:39:23 picolo kernel: [   39.882822] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14672, wptr 14688
> Jan  3 11:39:23 picolo kernel: [   39.899455] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14688, wptr 14704
> Jan  3 11:39:23 picolo kernel: [   39.916190] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14704, wptr 14720
> Jan  3 11:39:23 picolo kernel: [   39.932885] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14720, wptr 14736
> Jan  3 11:39:23 picolo kernel: [   39.949589] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14736, wptr 14752
> Jan  3 11:39:23 picolo kernel: [   39.966238] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14752, wptr 14768
> Jan  3 11:39:23 picolo kernel: [   39.982869] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14768, wptr 14784
> Jan  3 11:39:23 picolo kernel: [   39.999609] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14784, wptr 14800
> Jan  3 11:39:23 picolo kernel: [   40.016286] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14800, wptr 14816
> Jan  3 11:39:23 picolo kernel: [   40.033045] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14816, wptr 14832
> Jan  3 11:39:23 picolo kernel: [   40.049716] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14832, wptr 14848
> Jan  3 11:39:23 picolo kernel: [   40.066446] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14848, wptr 14864
> Jan  3 11:39:23 picolo kernel: [   40.083031] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14864, wptr 14880
> Jan  3 11:39:23 picolo kernel: [   40.099765] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14880, wptr 14896
> Jan  3 11:39:23 picolo kernel: [   40.116394] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14896, wptr 14912
> Jan  3 11:39:23 picolo kernel: [   40.133133] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14912, wptr 14928
> Jan  3 11:39:23 picolo kernel: [   40.149743] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14928, wptr 14944
> Jan  3 11:39:23 picolo kernel: [   40.166426] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14944, wptr 14960
> Jan  3 11:39:23 picolo kernel: [   40.183178] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14960, wptr 14976
> Jan  3 11:39:23 picolo kernel: [   40.199788] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14976, wptr 14992
> Jan  3 11:39:23 picolo kernel: [   40.216507] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 14992, wptr 15008
> Jan  3 11:39:23 picolo kernel: [   40.233150] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15008, wptr 15024
> Jan  3 11:39:23 picolo kernel: [   40.249815] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15024, wptr 15040
> Jan  3 11:39:23 picolo kernel: [   40.266454] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15040, wptr 15056
> Jan  3 11:39:23 picolo kernel: [   40.283123] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15056, wptr 15072
> Jan  3 11:39:23 picolo kernel: [   40.299804] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15072, wptr 15088
> Jan  3 11:39:23 picolo kernel: [   40.316483] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15088, wptr 15104
> Jan  3 11:39:23 picolo kernel: [   40.333164] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15104, wptr 15120
> Jan  3 11:39:23 picolo kernel: [   40.349843] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15120, wptr 15136
> Jan  3 11:39:23 picolo kernel: [   40.366523] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15136, wptr 15152
> Jan  3 11:39:23 picolo kernel: [   40.383200] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15152, wptr 15168
> Jan  3 11:39:23 picolo kernel: [   40.399878] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15168, wptr 15184
> Jan  3 11:39:23 picolo kernel: [   40.416561] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15184, wptr 15200
> Jan  3 11:39:23 picolo kernel: [   40.433245] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15200, wptr 15216
> Jan  3 11:39:23 picolo kernel: [   40.449925] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15216, wptr 15232
> Jan  3 11:39:23 picolo kernel: [   40.466613] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15232, wptr 15248
> Jan  3 11:39:24 picolo kernel: [   40.483291] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15248, wptr 15264
> Jan  3 11:39:24 picolo kernel: [   40.499971] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15264, wptr 15280
> Jan  3 11:39:24 picolo kernel: [   40.516652] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15280, wptr 15296
> Jan  3 11:39:24 picolo kernel: [   40.533336] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15296, wptr 15312
> Jan  3 11:39:24 picolo kernel: [   40.550016] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15312, wptr 15328
> Jan  3 11:39:24 picolo kernel: [   40.566715] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15328, wptr 15344
> Jan  3 11:39:24 picolo kernel: [   40.583390] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15344, wptr 15360
> Jan  3 11:39:24 picolo kernel: [   40.600065] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15360, wptr 15376
> Jan  3 11:39:24 picolo kernel: [   40.616745] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15376, wptr 15392
> Jan  3 11:39:24 picolo kernel: [   40.633432] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15392, wptr 15408
> Jan  3 11:39:24 picolo kernel: [   40.650113] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15408, wptr 15424
> Jan  3 11:39:24 picolo kernel: [   40.666790] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15424, wptr 15440
> Jan  3 11:39:24 picolo kernel: [   40.683477] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15440, wptr 15456
> Jan  3 11:39:24 picolo kernel: [   40.700157] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15456, wptr 15472
> Jan  3 11:39:24 picolo kernel: [   40.716836] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15472, wptr 15488
> Jan  3 11:39:24 picolo kernel: [   40.733522] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15488, wptr 15504
> Jan  3 11:39:24 picolo kernel: [   40.750203] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15504, wptr 15520
> Jan  3 11:39:24 picolo kernel: [   40.766882] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15520, wptr 15536
> Jan  3 11:39:24 picolo kernel: [   40.783563] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15536, wptr 15552
> Jan  3 11:39:24 picolo kernel: [   40.800247] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15552, wptr 15568
> Jan  3 11:39:24 picolo kernel: [   40.816929] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15568, wptr 15584
> Jan  3 11:39:24 picolo kernel: [   40.833633] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15584, wptr 15600
> Jan  3 11:39:24 picolo kernel: [   40.850305] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15600, wptr 15616
> Jan  3 11:39:24 picolo kernel: [   40.867011] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15616, wptr 15632
> Jan  3 11:39:24 picolo kernel: [   40.883676] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15632, wptr 15648
> Jan  3 11:39:24 picolo kernel: [   40.900346] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15648, wptr 15664
> Jan  3 11:39:24 picolo kernel: [   40.917026] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15664, wptr 15680
> Jan  3 11:39:24 picolo kernel: [   40.933716] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15680, wptr 15696
> Jan  3 11:39:24 picolo kernel: [   40.950390] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15696, wptr 15712
> Jan  3 11:39:24 picolo kernel: [   40.967070] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15712, wptr 15728
> Jan  3 11:39:24 picolo kernel: [   40.983757] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15728, wptr 15744
> Jan  3 11:39:24 picolo kernel: [   41.000438] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15744, wptr 15760
> Jan  3 11:39:24 picolo kernel: [   41.017115] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15760, wptr 15776
> Jan  3 11:39:24 picolo kernel: [   41.033812] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15776, wptr 15792
> Jan  3 11:39:24 picolo kernel: [   41.050485] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15792, wptr 15808
> Jan  3 11:39:24 picolo kernel: [   41.067162] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15808, wptr 15824
> Jan  3 11:39:24 picolo kernel: [   41.083845] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15824, wptr 15840
> Jan  3 11:39:24 picolo kernel: [   41.100523] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15840, wptr 15856
> Jan  3 11:39:24 picolo kernel: [   41.117205] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15856, wptr 15872
> Jan  3 11:39:24 picolo kernel: [   41.133904] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15872, wptr 15888
> Jan  3 11:39:24 picolo kernel: [   41.150579] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15888, wptr 15904
> Jan  3 11:39:24 picolo kernel: [   41.167255] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15904, wptr 15920
> Jan  3 11:39:24 picolo kernel: [   41.183933] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15920, wptr 15936
> Jan  3 11:39:24 picolo kernel: [   41.200614] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15936, wptr 15952
> Jan  3 11:39:24 picolo kernel: [   41.217295] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15952, wptr 15968
> Jan  3 11:39:24 picolo kernel: [   41.233984] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15968, wptr 15984
> Jan  3 11:39:24 picolo kernel: [   41.250663] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 15984, wptr 16000
> Jan  3 11:39:24 picolo kernel: [   41.267347] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16000, wptr 16016
> Jan  3 11:39:24 picolo kernel: [   41.284027] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16016, wptr 16032
> Jan  3 11:39:24 picolo kernel: [   41.300706] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16032, wptr 16048
> Jan  3 11:39:24 picolo kernel: [   41.317388] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16048, wptr 16064
> Jan  3 11:39:24 picolo kernel: [   41.334071] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16064, wptr 16080
> Jan  3 11:39:24 picolo kernel: [   41.350752] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16080, wptr 16096
> Jan  3 11:39:24 picolo kernel: [   41.367442] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16096, wptr 16112
> Jan  3 11:39:24 picolo kernel: [   41.384122] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16112, wptr 16128
> Jan  3 11:39:24 picolo kernel: [   41.400801] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16128, wptr 16144
> Jan  3 11:39:24 picolo kernel: [   41.417480] [drm:amdgpu_ih_process
> [amdgpu]] amdgpu_ih_process: rptr 16144, wptr 16160
> Jan  3 11:39:24 picolo kernel: [   41.432501] [drm:vblank_disable_fn
> [drm]] disabling vblank on crtc 0
> Jan  3 11:41:22 picolo kernel: [   49.762715] [drm:amdgpu_job_timedout
> [amdgpu]] *ERROR* ring gfx timeout, signaled seq=2, emitted seq=3
> Jan  3 11:41:22 picolo kernel: [   49.772047] [drm] GPU recovery disabled.
>
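A rough way to compare the two excerpts above is to count how often each
drm handler shows up and to look at the spacing of the amdgpu_ih_process
entries that follow the last CP EOP. The sketch below is only an
illustration: the function name summarize() and the regular expression
are assumptions, not anything from the attached logs or the driver, and
it expects unwrapped kern.log lines (the excerpts in this mail are
line-wrapped by the mail client). On the four post-EOP entries used as
sample data, the gaps come out close to 16.7 ms, which might simply be
one interrupt per refresh on a 60 Hz mode, but that is a guess.

```python
import re
from collections import Counter
from statistics import median

# Matches e.g. "[   39.603054] [drm:gfx_v8_0_eop_irq" in an unwrapped line.
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+\[drm:(\w+)")


def summarize(lines):
    """Return (handler counts, median gap in ms between amdgpu_ih_process
    entries logged after the last CP EOP), or None for the gap if there
    are fewer than two such entries."""
    counts = Counter()
    ih_times = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        ts, func = float(m.group(1)), m.group(2)
        counts[func] += 1
        if func == "gfx_v8_0_eop_irq":
            # Only keep IH entries seen after the most recent EOP.
            ih_times.clear()
        elif func == "amdgpu_ih_process":
            ih_times.append(ts)
    gaps = [(b - a) * 1000.0 for a, b in zip(ih_times, ih_times[1:])]
    return counts, (median(gaps) if gaps else None)


# A few lines from the bad attempt, re-joined into single lines.
SAMPLE = [
    "Jan  3 11:39:23 picolo kernel: [   39.602887] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 14400, wptr 14416",
    "Jan  3 11:39:23 picolo kernel: [   39.603054] [drm:gfx_v8_0_eop_irq [amdgpu]] IH: CP EOP",
    "Jan  3 11:39:23 picolo kernel: [   39.615864] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 14416, wptr 14432",
    "Jan  3 11:39:23 picolo kernel: [   39.632542] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 14432, wptr 14448",
    "Jan  3 11:39:23 picolo kernel: [   39.649264] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 14448, wptr 14464",
    "Jan  3 11:39:23 picolo kernel: [   39.665943] [drm:amdgpu_ih_process [amdgpu]] amdgpu_ih_process: rptr 14464, wptr 14480",
]

if __name__ == "__main__":
    counts, gap_ms = summarize(SAMPLE)
    print(dict(counts), gap_ms)
```

To run it on a full log, something like summarize(open("kern_bad.log"))
should work, and the same call on kern_good.log gives the numbers to
compare against.
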
> Regards,
> Luís
>
> On Wed, Jan 2, 2019 at 12:05 PM Christian König
> <ckoenig.leichtzumerken at gmail.com> wrote:
> >
> > Hi Luis,
> >
> > mhm, sounds like a timing issue. We have probably made something faster
> > during bootup in 4.20 and because of this you now see this issue more often.
> >
> > If the bisection doesn't show any result, can you try adding some
> > msleep(10) calls at critical places in the driver code to narrow this down?
> >
> > Officially we don't test/support ARM with the driver code, but in this
> > particular case we should probably investigate since it sounds like it
> > just doesn't happen on x86 because of different timing.
> >
> > Thanks,
> > Christian.
> >
> > Am 28.12.18 um 15:05 schrieb Luís Mendes:
> > > Hi Alex,
> > >
> > > First of all... Have nice holidays! Happy new year!!
> > >
> > > - Okay, so it looks like sometimes the driver is able to enter
> > > graphical mode with the Polaris card, but most of the time it fails
> > > before with:
> > > [   49.762704] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx
> > > timeout, signaled seq=2, emitted seq=3
> > >
> > > - This is something that happens sporadically, but less often,
> > > in the 4.17, 4.18 and 4.19 kernels, so this is actually not
> > > a regression, but rather an existing issue, which maybe the patch
> > > "drm/amdgpu/gfx_v8_0: Reorder the gfx, kiq and kcq ring tests
> > > sequence" solves. I tried to backport it to 4.20, but saw no
> > > improvement. I need to try with the git version, or rc1.
> > >
> > > - This hang happens after the console is displayed on the screen, but
> > > before switching to graphical mode with X.
> > >
> > > - However if X is entered then the driver is stable and can be used
> > > for long periods.
> > >
> > > Regards,
> > > Luís Mendes
> > >
> > > On Tue, Dec 18, 2018 at 11:16 PM Luís Mendes <luis.p.mendes at gmail.com> wrote:
> > >> Hi Alex,
> > >>
> > >> I am already using drm_arch_can_wc_memory() set to false.
> > >> I will try to bisect...
> > >>
> > >> Regards,
> > >> Luís
> > >>
> > >> On Tue, Dec 18, 2018 at 7:03 PM Alex Deucher <alexdeucher at gmail.com> wrote:
> > >>> On Tue, Dec 18, 2018 at 8:58 AM Luís Mendes <luis.p.mendes at gmail.com> wrote:
> > >>>> Hi Christian,
> > >>>>
> > > >>>> I've been using a Sapphire RX 550 and a Sapphire RX 460 on a custom
> > > >>>> armhf board that runs well with Linux 4.19.9 at least, but now,
> > > >>>> starting with Linux kernel 4.20, I'm getting a gpu hang right after
> > > >>>> the console is displayed, but before entering graphical mode
> > > >>>> when starting an X session.
> > > >>>> I'm only reporting this now because there was a PCI commit for mvebu
> > > >>>> that also went into linux-4.20 and caused a kernel oops during the
> > > >>>> pci_map_rom call in the amdgpu initialization code. I've reverted that
> > > >>>> patch, but now amdgpu is hanging.
> > >>> It would be useful if you could bisect.  This is the first I've heard
> > >>> of amdgpu working on an ARM board without write combining (WC)
> > >>> disabled.  You might check to see if disabling WC helps.  Return false
> > >>> in drm_arch_can_wc_memory().
> > >>>
> > >>> Alex
> > >>>
> > >>>>
> > >>>> [   24.801861] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx
> > >>>> timeout, signaled seq=2, emitted seq=3
> > >>>>
> > >>>> 02:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
> > >>>> [AMD/ATI] Baffin [Polaris11] (rev ff) (prog-if 00 [VGA controller])
> > >>>>      Subsystem: Sapphire Technology Limited Baffin [Radeon RX 560]
> > >>>>      Flags: bus master, fast devsel, latency 0, IRQ 51
> > >>>>      Memory at d0000000 (64-bit, prefetchable) [size=256M]
> > >>>>      Memory at e0000000 (64-bit, prefetchable) [size=2M]
> > >>>>      I/O ports at 10000 [size=256]
> > >>>>      Memory at e0200000 (32-bit, non-prefetchable) [size=256K]
> > >>>>      Expansion ROM at e0240000 [disabled] [size=128K]
> > >>>>      Capabilities: <access denied>
> > >>>>      Kernel driver in use: amdgpu
> > >>>>      Kernel modules: amdgpu
> > >>>>
> > >>>> dmesg follows in attachment.
> > >>>>
> > >>>> Regards,
> > >>>> Luís
> > >>>> _______________________________________________
> > >>>> amd-gfx mailing list
> > >>>> amd-gfx at lists.freedesktop.org
> > >>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >

