lockdep and ww mutex debug interactions in hdmi tests
Dave Airlie
airlied at gmail.com
Wed Nov 6 20:26:23 UTC 2024
On Fri, 1 Nov 2024 at 23:57, Maxime Ripard <mripard at kernel.org> wrote:
>
> Hi,
>
> On Wed, Oct 30, 2024 at 05:03:50AM +1000, Dave Airlie wrote:
> > Hi,
> >
> > I mentioned this internally, but wanted to get it on the list,
> >
> > I ran the hdmi kunit tests with LOCKDEP and WW_MUTEX_SLOWPATH enabled
> > and hit some issues.
> >
> > With the slowpath we get the occasional injected EDEADLK, which tests
> > that the backoff paths are doing things right. I think you should
> > handle EDEADLK in the tests with a retry loop.
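
For reference, a minimal userspace sketch of the retry pattern meant here. It is not the patch that was sent. The mock_* names and run_step_with_retry() are hypothetical stand-ins so the example builds on its own; in the kernel the backoff step would be drm_modeset_backoff() on the drm_modeset_acquire_ctx.

```c
#include <errno.h>

/* Hypothetical stand-in for a struct drm_modeset_acquire_ctx. */
struct mock_acquire_ctx {
	int injected_deadlocks;	/* how many times locking reports -EDEADLK */
	int backoffs;		/* how many times we backed off */
};

/* Stand-in for a locking or atomic-check step that may return -EDEADLK. */
static int mock_lock_step(struct mock_acquire_ctx *ctx)
{
	if (ctx->injected_deadlocks > 0) {
		ctx->injected_deadlocks--;
		return -EDEADLK;
	}
	return 0;
}

/* Stand-in for the backoff step: drop what we hold before trying again. */
static void mock_backoff(struct mock_acquire_ctx *ctx)
{
	ctx->backoffs++;
}

/* Retry the step until it stops reporting -EDEADLK. */
static int run_step_with_retry(struct mock_acquire_ctx *ctx)
{
	int ret;

retry:
	ret = mock_lock_step(ctx);
	if (ret == -EDEADLK) {
		mock_backoff(ctx);
		goto retry;
	}
	return ret;
}
```

With injected_deadlocks set to 3, run_step_with_retry() backs off three times and then returns 0. Any error other than -EDEADLK is passed straight back to the caller.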
>
> Thanks for the report, I've just sent a patch fixing this.

The patch fixes the EDEADLK, but not the lockdep warning:

[ 50.785446] KTAP version 1
[ 50.785461] 1..2
[ 50.786298] KTAP version 1
[ 50.786305] # Subtest: drm_atomic_helper_connector_hdmi_check
[ 50.786308] # module: drm_hdmi_state_helper_test
[ 50.786312] 1..22
[ 50.788096] ======================================================
[ 50.788101] WARNING: possible circular locking dependency detected
[ 50.788107] 6.12.0-rc6+ #47 Tainted: G N
[ 50.788112] ------------------------------------------------------
[ 50.788117] kunit_try_catch/1500 is trying to acquire lock:
[ 50.788123] ffff9976410cc4f0 (&dev->mode_config.mutex){+.+.}-{3:3}, at: drm_test_check_broadcast_rgb_auto_cea_mode+0xaf/0x4c0 [drm_hdmi_state_helper_test]
[ 50.788141]
but task is already holding lock:
[ 50.788146] ffff9976be5550f0 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_kunit_helper_acquire_ctx_alloc+0x4d/0xc0 [drm_kunit_helpers]
[ 50.788159]
which lock already depends on the new lock.
[ 50.788165]
the existing dependency chain (in reverse order) is:
[ 50.788171]
-> #1 (crtc_ww_class_acquire){+.+.}-{0:0}:
[ 50.788179] drm_modeset_acquire_init+0xd7/0x110 [drm]
[ 50.788235] drm_helper_probe_single_connector_modes+0x4c/0x600 [drm_kms_helper]
[ 50.788266] set_connector_edid.isra.0+0x4f/0xc0 [drm_hdmi_state_helper_test]
[ 50.788275] drm_atomic_helper_connector_hdmi_init+0x240/0x400 [drm_hdmi_state_helper_test]
[ 50.788285] drm_test_check_broadcast_rgb_auto_cea_mode+0x27/0x4c0 [drm_hdmi_state_helper_test]
[ 50.788296] kunit_try_run_case+0x62/0xd0 [kunit]
[ 50.788304] kunit_generic_run_threadfn_adapter+0x1e/0x40 [kunit]
[ 50.788313] kthread+0xef/0x120
[ 50.788318] ret_from_fork+0x31/0x50
[ 50.788324] ret_from_fork_asm+0x1a/0x30
[ 50.788329]
-> #0 (&dev->mode_config.mutex){+.+.}-{3:3}:
[ 50.788337] __lock_acquire+0x1391/0x2190
[ 50.788343] lock_acquire+0xcc/0x2d0
[ 50.788348] __mutex_lock+0x8d/0xbf0
[ 50.788353] drm_test_check_broadcast_rgb_auto_cea_mode+0xaf/0x4c0 [drm_hdmi_state_helper_test]
[ 50.788363] kunit_try_run_case+0x62/0xd0 [kunit]
[ 50.788371] kunit_generic_run_threadfn_adapter+0x1e/0x40 [kunit]
[ 50.788380] kthread+0xef/0x120
[ 50.788384] ret_from_fork+0x31/0x50
[ 50.788388] ret_from_fork_asm+0x1a/0x30
[ 50.788393]
other info that might help us debug this:
[ 50.788400] Possible unsafe locking scenario:
[ 50.788405]        CPU0                    CPU1
[ 50.788409]        ----                    ----
[ 50.788413]   lock(crtc_ww_class_acquire);
[ 50.788418]                                lock(&dev->mode_config.mutex);
[ 50.788424]                                lock(crtc_ww_class_acquire);
[ 50.788431]   lock(&dev->mode_config.mutex);
[ 50.788435]
*** DEADLOCK ***
[ 50.788441] 1 lock held by kunit_try_catch/1500:
[ 50.788445] #0: ffff9976be5550f0 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_kunit_helper_acquire_ctx_alloc+0x4d/0xc0 [drm_kunit_helpers]
[ 50.788459]
stack backtrace:
[ 50.788464] CPU: 5 UID: 0 PID: 1500 Comm: kunit_try_catch Tainted: G N 6.12.0-rc6+ #47
[ 50.788473] Tainted: [N]=TEST
[ 50.788476] Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021
[ 50.788485] Call Trace:
[ 50.788488] <TASK>
[ 50.788492] dump_stack_lvl+0x6c/0xa0
[ 50.788498] print_circular_bug.cold+0x178/0x1be
[ 50.788506] check_noncircular+0x10f/0x120
[ 50.788511] ? stack_trace_save+0x3e/0x50
[ 50.788520] __lock_acquire+0x1391/0x2190
[ 50.788528] lock_acquire+0xcc/0x2d0
[ 50.788533] ? drm_test_check_broadcast_rgb_auto_cea_mode+0xaf/0x4c0 [drm_hdmi_state_helper_test]
[ 50.788544] ? lock_is_held_type+0xd9/0x130
[ 50.788552] __mutex_lock+0x8d/0xbf0
[ 50.788556] ? drm_test_check_broadcast_rgb_auto_cea_mode+0xaf/0x4c0 [drm_hdmi_state_helper_test]
[ 50.788566] ? _raw_spin_unlock_irqrestore+0x39/0x70
[ 50.788573] ? kunit_add_action+0xd1/0x140 [kunit]
[ 50.788581] ? drm_test_check_broadcast_rgb_auto_cea_mode+0xaf/0x4c0 [drm_hdmi_state_helper_test]
[ 50.788592] ? __pfx_action_drm_release_context+0x10/0x10 [drm_kunit_helpers]
[ 50.788599] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [kunit]
[ 50.788608] ? kunit_add_action_or_reset+0x18/0x40 [kunit]
[ 50.788618] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [kunit]
[ 50.788627] ? drm_test_check_broadcast_rgb_auto_cea_mode+0xaf/0x4c0 [drm_hdmi_state_helper_test]
[ 50.788637] drm_test_check_broadcast_rgb_auto_cea_mode+0xaf/0x4c0 [drm_hdmi_state_helper_test]
[ 50.788647] ? lockdep_hardirqs_on+0x7c/0x100
[ 50.788654] kunit_try_run_case+0x62/0xd0 [kunit]
[ 50.788662] ? lockdep_hardirqs_on+0x7c/0x100
[ 50.788668] ? _raw_spin_unlock_irqrestore+0x39/0x70
[ 50.788675] kunit_generic_run_threadfn_adapter+0x1e/0x40 [kunit]
[ 50.788684] kthread+0xef/0x120
[ 50.788688] ? __pfx_kthread+0x10/0x10
[ 50.788693] ret_from_fork+0x31/0x50
[ 50.788698] ? __pfx_kthread+0x10/0x10
[ 50.788703] ret_from_fork_asm+0x1a/0x30
[ 50.788711] </TASK>
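
To spell out what lockdep is objecting to: per the report, crtc_ww_class_acquire was first taken while mode_config.mutex was held (chain #1, under set_connector_edid), and the test then takes mode_config.mutex while already holding the acquire context (chain #0). A toy model of that ordering check follows. It is only a sketch and not how lockdep is implemented; the enum and function names are made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical lock classes, named after the two in the report. */
enum lock_class { MODE_CONFIG_MUTEX, CRTC_WW_ACQUIRE, NR_CLASSES };

/* dep[a][b] is true once lock b has been taken while a was held. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(enum lock_class from, enum lock_class to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (dep[from][next] && !seen[next] &&
		    reachable((enum lock_class)next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "acquired was taken while held was held".
 * Returns false, without recording, if the new edge would close a cycle.
 */
static bool record_dependency(enum lock_class held, enum lock_class acquired)
{
	bool seen[NR_CLASSES] = { false };

	if (reachable(acquired, held, seen))
		return false;	/* acquired already comes before held */
	dep[held][acquired] = true;
	return true;
}
```

Recording MODE_CONFIG_MUTEX then CRTC_WW_ACQUIRE succeeds, which corresponds to chain #1. Recording CRTC_WW_ACQUIRE then MODE_CONFIG_MUTEX afterwards is rejected, which corresponds to the acquisition lockdep flags in chain #0.
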
>
> The vc4 tests have the same issue though, and I haven't been able to fix all
> of them yet.
>
> Maxime