imxdrm issue on SABRE Lite
Thierry Reding
thierry.reding at gmail.com
Mon Feb 13 08:05:33 UTC 2017
On Sun, Feb 12, 2017 at 12:15:46AM +0000, Russell King - ARM Linux wrote:
> On Sat, Feb 11, 2017 at 09:09:34PM +0000, Dan MacDonald wrote:
> > [ 43.997066] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> > [CRTC:24:crtc-0] flip_done timed out
> > [ 55.517063] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR*
> > [CRTC:24:crtc-0] flip_done timed out
>
> This seems to lay the foundation for the kernel to Oops itself later.
> The problem seems to be this:
>
> drm_atomic_helper_commit(state->dev, state, false)
> - drm_atomic_helper_setup_commit(state, false)
> - foreach crtc in state
> - commit->event = kzalloc()
> - crtc_state->event = commit->event
> - crtc_state->event->base.completion = &commit->flip_done
> ...
> - commit_tail(state)
> - funcs->atomic_commit_tail(state)
> ...
> - drm_atomic_helper_commit_planes(dev, state,
> DRM_PLANE_COMMIT_ACTIVE_ONLY |
> DRM_PLANE_COMMIT_NO_DISABLE_AFTER_MODES$ - foreach active crtc in state
> - funcs->atomic_begin(crtc, old_crtc_state)
> - ipu_crtc_atomic_begin()
> - drm_crtc_vblank_on()
> - if crtc->state->event
> - drm_crtc_arm_vblank_event(crtc, crtc->state->event)
> - crtc->state->event = NULL
>
> At this point, the "commit->flip_done" completion is queued with the
> event onto the vblank list.
> ...
> - drm_atomic_helper_commit_cleanup_done(state)
> - foreach crtc in state
> - try_wait_for_completion(&commit->hw_done)
> - wait_for_completion_timeout(&commit->flip_done, 10sec)
>
> This is where we get the timeout message.
>
> - drm_atomic_state_free(state)
>
> This "clears" the commit state (calling drm_crtc_commit_put() on it)
> which has the effect of kfree()'ing the structure containing the
> flip_done, but which is still on the vblank list.
>
> The next time we try to set a mode, the result is that a call to
> drm_crtc_vblank_off() causes all queued events to be sent, including
> the now kfree()'d flip_done completion, resulting in the reported
> kernel oops.
>
> It seems others are also suffering similar issues when the flip_done
> completion times out with other drivers:
>
> https://lkml.org/lkml/2016/12/1/171
> https://bugs.freedesktop.org/show_bug.cgi?id=96781
> https://lists.opensuse.org/opensuse-bugs/2016-10/msg03011.html
> https://patchwork.kernel.org/patch/9280223/ (which is me...)
>
> This is likely the same, although the timeout line was not captured:
> https://bugzilla.redhat.com/show_bug.cgi?id=1415180
> https://bodhi.fedoraproject.org/updates/kernel-4.8.7-200.fc24
>
> So, can we please avoid killing the kernel when the hardware doesn't
> quite behave as we want it to?
>
> Right now, when we oops the kernel, we're leaking all the memory
> associated with the atomic modeset, so if we stop oopsing the kernel
> but still leak the memory, surely that would be an improvement?
> Maybe something like the untested patch at the bottom of this mail?
>
> It would give the opportunity to poke about on a failed system to
> work out what happened and maybe why the hardware misbehaved.
>
> The real answer is for the hardware to behave, but we can't always
> have our cake.
>
> Note - I can't reproduce Dan's problem here on 4.10-rc7 as I suspect
> it needs multiple CRTCs/outputs running in the IPU to trigger it,
> which Sabre lite has. I'll try enabling the (disconnected) LVDS
> output tomorrow (I have Fabio's LVDS patch knocking about), but I
> suspect those with a deeper knowledge of the IPU need to investigate
> what's going on.
>
> > [ 214.765689] imx-ipuv3 2400000.ipu: DC stop timeout after 50 ms
> > [ 214.825688] Unable to handle kernel NULL pointer dereference at
> > virtual address 00000000
> > [ 214.833783] pgd = ed1b8000
> > [ 214.836491] [00000000] *pgd=4c974831
> > [ 214.840084] Internal error: Oops: 17 [#1] SMP ARM
> > [ 214.844789] Modules linked in: mousedev snd_soc_sgtl5000
> > snd_soc_fsl_ssi snd_soc_imx_sgtl5000 imx_pcm_fiq imx_pcm_dma
> > snd_soc_fsl_asrc snd_soc_fsl_asoc_card snd_soc_core dw_hdmi_ahb_audio
> > snd_pcm_dmaengine caam_jr imx_ipuv3_crtc snd_ac97_codec coda
> > v4l2_mem2mem videobuf2_dma_contig ac97_bus imx_ipu_v3
> > snd_soc_imx_audmux snd_pcm videobuf2_vmalloc videobuf2_memops
> > videobuf2_v4l2 videobuf2_core dw_hdmi_imx caam imx2_wdt ofpart spi_imx
> > evdev dw_hdmi etnaviv imx_ldb pwm_imx snd_timer parallel_display
> > uio_pdrv_genirq uio imxdrm sch_fq_codel ip_tables x_tables
> > [ 214.894338] CPU: 2 PID: 316 Comm: Xorg Tainted: G W
> > 4.9.8-1-ARCH #1
> > [ 214.901735] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> > [ 214.908264] task: ed2c4d00 task.stack: ed2a6000
> > [ 214.912803] PC is at __wake_up_common+0x1c/0x80
> > [ 214.917337] LR is at __wake_up_locked+0x14/0x1c
> > [ 214.921871] pc : [<c0186568>] lr : [<c018662c>] psr: a0070093
> > [ 214.921871] sp : ed2a7c68 ip : 00000000 fp : c0fa2a70
> > [ 214.933348] r10: c0f37384 r9 : 00000003 r8 : 00000000
> > [ 214.938574] r7 : 00000000 r6 : edbf3410 r5 : edbf3408 r4 : edbf340c
> > [ 214.945101] r3 : 00000000 r2 : 00000000 r1 : 00000000 r0 : edbf340c
> > [ 214.951630] Flags: NzCv IRQs off FIQs on Mode SVC_32 ISA ARM
> > Segment none
> > [ 214.958854] Control: 10c5387d Table: 3d1b804a DAC: 00000051
> > [ 214.964600] Process Xorg (pid: 316, stack limit = 0xed2a6220)
> > [ 214.970347] Stack: (0xed2a7c68 to 0xed2a8000)
> > [ 214.974708] 7c60: 00000000 edbf340c edbf3408
> > a0070093 edb46114 edb46000
> > [ 214.982889] 7c80: 0000039f c0f37384 c0fa2a70 c018662c 00000000
> > c0fa1144 bf21c4fc c018704c
> > [ 214.991069] 7ca0: edbf3380 edbf3380 00000000 c06ccac0 edfa1400
> > edbf3380 00000000 c06d023c
> > [ 214.999250] 7cc0: 0000039f 00000000 edab2a00 80070013 edb4611c
> > c12040cc 00000000 00000000
> > [ 215.007430] 7ce0: 00000000 00040908 edb3f410 ed412c40 00000000
> > 00000000 ed26a800 edb47c18
> > [ 215.015610] 7d00: c0f37384 c0fa1144 bf21c4fc c06c3e3c edc67780
> > c06c1c24 edb46000 edb46000
> > [ 215.023791] 7d20: bf01862c ed412c40 edb46000 00000000 edb46000
> > ed410100 ed412240 ed412240
> > [ 215.031971] 7d40: c0c0c0c0 bf0182bc ed412c40 bf018ca4 00000000
> > c06c40f4 ed412c40 00000000
> > [ 215.040152] 7d60: 00000000 c06c41ac ed412c40 00000000 ed2a7dd0
> > edb47c18 ed410100 c06c46dc
> > [ 215.048332] 7d80: eda36180 edb47c18 00000001 c12040cc ed410100
> > c06d5f8c ed2a7e4c edb46000
> > [ 215.056513] 7da0: 00000001 c12040cc ed410100 c06d74c4 edaa5980
> > edb46248 edb47c18 eda3618c
> > [ 215.064694] 7dc0: eda36180 ed412240 ed410100 c12040cc eda36180
> > edb47c18 ed410100 00000000
> > [ 215.072874] 7de0: 00000000 ed412240 00000001 00040908 ed2a7e70
> > 00000051 00000068 c12040cc
> > [ 215.081055] 7e00: c0cbf0a0 00000068 edf66800 ed2a7e4c c06864a2
> > c06ce704 0000e201 00000001
> > [ 215.089236] 7e20: c0fa22f0 00040908 f50417fa ed2a7e4c be9d1910
> > 000000a2 edb46000 00000068
> > [ 215.097415] 7e40: 00000062 c06d7020 c1208504 008a1210 00000000
> > 00000001 00000018 00000047
> > [ 215.105596] 7e60: 00000000 00000000 00000000 00000001 00007530
> > 03480320 03a00378 01e00000
> > [ 215.113775] 7e80: 01f001ed 0000020d 00000000 00000000 00000000
> > 00000000 00000000 00000000
> > [ 215.121955] 7ea0: 00000000 00000000 00000000 00000000 00000000
> > c0b0a268 00000000 00040908
> > [ 215.130137] 7ec0: ed2a7edc ed2a7fb0 00000008 00040908 00000000
> > c12040cc be9d1910 edd8cbc8
> > [ 215.138317] 7ee0: edcc9b40 be9d1910 0000000b 00000000 00000000
> > c0288638 004f10a0 c0101308
> > [ 215.146497] 7f00: 00000000 eff6fe48 eff6fe24 eff6fe00 000112da
> > c02bb9bc 00000008 c01b4a08
> > [ 215.154677] 7f20: 00040908 c12040cc 00000000 ed29b1b8 00000003
> > be9d1910 be9d1808 ffffe000
> > [ 215.162857] 7f40: 00000051 ed2a6100 00000100 00000000 ed2a6000
> > c0102fa4 be9d1678 00040908
> > [ 215.171038] 7f60: c12040cc 00000000 edcc9b41 edcc9b40 c06864a2
> > be9d1910 0000000b 00000000
> > [ 215.179218] 7f80: 00000000 c0288f78 b6fc3c90 be9d1910 c06864a2
> > 00000036 c0107e24 ed2a6000
> > [ 215.187398] 7fa0: 00000000 c0107c60 b6fc3c90 be9d1910 0000000b
> > c06864a2 be9d1910 00000001
> > [ 215.195579] 7fc0: b6fc3c90 be9d1910 c06864a2 00000036 008a1210
> > 00000047 00000018 00000000
> > [ 215.203759] 7fe0: b6ddf088 be9d18f4 b6dc7ad4 b6b11adc 40070010
> > 0000000b 3fffd861 3fffdc61
> > [ 215.211946] [<c0186568>] (__wake_up_common) from [<c018662c>]
> > (__wake_up_locked+0x14/0x1c)
> > [ 215.220216] [<c018662c>] (__wake_up_locked) from [<c018704c>]
> > (complete_all+0x34/0x44)
> > [ 215.228141] [<c018704c>] (complete_all) from [<c06ccac0>]
> > (drm_send_event_locked+0x28/0x11c)
> > [ 215.236588] [<c06ccac0>] (drm_send_event_locked) from [<c06d023c>]
> > (drm_vblank_off+0x1b0/0x21c)
> > [ 215.245296] [<c06d023c>] (drm_vblank_off) from [<c06c3e3c>]
> > (drm_atomic_helper_commit_modeset_disables+0x1dc/0x3fc)
> > [ 215.255750] [<c06c3e3c>]
> > (drm_atomic_helper_commit_modeset_disables) from [<bf0182bc>]
> > (imx_drm_atomic_commit_tail+0x18/0x58 [imxdrm])
> > [ 215.267843] [<bf0182bc>] (imx_drm_atomic_commit_tail [imxdrm]) from
> > [<c06c40f4>] (commit_tail+0x40/0x5c)
> > [ 215.277327] [<c06c40f4>] (commit_tail) from [<c06c41ac>]
> > (drm_atomic_helper_commit+0x94/0xd8)
> > [ 215.285858] [<c06c41ac>] (drm_atomic_helper_commit) from
> > [<c06c46dc>] (drm_atomic_helper_set_config+0x78/0x9c)
> > [ 215.295865] [<c06c46dc>] (drm_atomic_helper_set_config) from
> > [<c06d5f8c>] (drm_mode_set_config_internal+0x58/0xdc)
> > [ 215.306219] [<c06d5f8c>] (drm_mode_set_config_internal) from
> > [<c06d74c4>] (drm_mode_setcrtc+0x4a4/0x550)
> > [ 215.315703] [<c06d74c4>] (drm_mode_setcrtc) from [<c06ce704>]
> > (drm_ioctl+0x214/0x44c)
> > [ 215.323540] [<c06ce704>] (drm_ioctl) from [<c0288638>]
> > (do_vfs_ioctl+0xac/0x980)
> > [ 215.330940] [<c0288638>] (do_vfs_ioctl) from [<c0288f78>]
> > (SyS_ioctl+0x6c/0x7c)
> > [ 215.338258] [<c0288f78>] (SyS_ioctl) from [<c0107c60>]
> > (ret_fast_syscall+0x0/0x3c)
> > [ 215.345832] Code: e5b61004 e1a08003 e59d7028 e1560001 (e5913000)
> > [ 215.351929] ---[ end trace e8a77aa320be7e56 ]---
>
> drivers/gpu/drm/drm_atomic_helper.c | 15 +++++++++------
> include/drm/drm_atomic_helper.h | 2 +-
> 2 files changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> index 21f992605541..46668d071d6a 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -1201,9 +1201,8 @@ static void commit_tail(struct drm_atomic_state *state)
> else
> drm_atomic_helper_commit_tail(state);
>
> - drm_atomic_helper_commit_cleanup_done(state);
> -
> - drm_atomic_state_free(state);
> + if (drm_atomic_helper_commit_cleanup_done(state) == 0)
> + drm_atomic_state_free(state);
Chris (Cc'ed) added reference counting to atomic state for v4.10, maybe
that already fixes the issue?
Thierry
> }
>
> static void commit_work(struct work_struct *work)
> @@ -1591,12 +1590,12 @@ EXPORT_SYMBOL(drm_atomic_helper_commit_hw_done);
> * This is part of the atomic helper support for nonblocking commits, see
> * drm_atomic_helper_setup_commit() for an overview.
> */
> -void drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *state)
> +int drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *state)
> {
> struct drm_crtc *crtc;
> struct drm_crtc_state *crtc_state;
> struct drm_crtc_commit *commit;
> - int i;
> + int i, failed = 0;
> long ret;
>
> for_each_crtc_in_state(state, crtc, crtc_state, i) {
> @@ -1621,15 +1620,19 @@ void drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *state)
> * not hold a reference of its own. */
> ret = wait_for_completion_timeout(&commit->flip_done,
> 10*HZ);
> - if (ret == 0)
> - DRM_ERROR("[CRTC:%d:%s] flip_done timed out\n",
> + if (ret == 0) {
> + DRM_ERROR("[CRTC:%d:%s] flip_done timed out, memory leaked\n",
> crtc->base.id, crtc->name);
> + failed = -ETIMEDOUT;
> + }
>
> spin_lock(&crtc->commit_lock);
> del_commit:
> list_del(&commit->commit_entry);
> spin_unlock(&crtc->commit_lock);
> }
> +
> + return failed;
> }
> EXPORT_SYMBOL(drm_atomic_helper_commit_cleanup_done);
>
> diff --git a/include/drm/drm_atomic_helper.h b/include/drm/drm_atomic_helper.h
> index 7ff92b09fd9c..ee3d642c1feb 100644
> --- a/include/drm/drm_atomic_helper.h
> +++ b/include/drm/drm_atomic_helper.h
> @@ -88,7 +88,7 @@ int drm_atomic_helper_setup_commit(struct drm_atomic_state *state,
> bool nonblock);
> void drm_atomic_helper_wait_for_dependencies(struct drm_atomic_state *state);
> void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *state);
> -void drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *state);
> +int drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *state);
>
> /* implementations for legacy interfaces */
> int drm_atomic_helper_update_plane(struct drm_plane *plane,
>
>
> --
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.
> _______________________________________________
> dri-devel mailing list
> dri-devel at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20170213/491c70f8/attachment-0001.sig>
More information about the dri-devel
mailing list