Regression on drm-tip

Borah, Chaitanya Kumar chaitanya.kumar.borah at intel.com
Mon Apr 28 06:02:43 UTC 2025


Hello Christopher,

This mail is regarding a regression we are seeing in our CI runs[1] on the drm-tip[2] repository.

`````````````````````````````````````````````````````````````````````````````````
<4>[    7.891028] =============================
<4>[    7.891293] [ BUG: Invalid wait context ]
<4>[    7.891526] 6.15.0-rc3-CI_DRM_16443-gdc80d6a10c1c+ #1 Tainted: G        W          
<4>[    7.891792] -----------------------------
<4>[    7.892070] (udev-worker)/286 is trying to lock:
<4>[    7.892349] ffff88811671bcc8 (&adapter->ptm_lock){....}-{3:3}, at: igc_ptp_reset+0x155/0x320 [igc]
<4>[    7.892660] other info that might help us debug this:
<4>[    7.892943] context-{4:4}
<4>[    7.893226] 2 locks held by (udev-worker)/286:
<4>[    7.893515]  #0: ffff888103bd41b0 (&dev->mutex){....}-{3:3}, at: __driver_attach+0x104/0x220
<4>[    7.893823]  #1: ffff88811671bb70 (&adapter->tmreg_lock){....}-{2:2}, at: igc_ptp_reset+0x53/0x320 [igc]
<4>[    7.894134] stack backtrace:
<4>[    7.894439] CPU: 2 UID: 0 PID: 286 Comm: (udev-worker) Tainted: G        W           6.15.0-rc3-CI_DRM_16443-gdc80d6a10c1c+ #1 PREEMPT(voluntary) 
<4>[    7.894442] Tainted: [W]=WARN
<4>[    7.894443] Hardware name: Intel(R) Client Systems NUC11TNHi3/NUC11TNBi3, BIOS TNTGL357.0067.2022.0718.1742 07/18/2022
`````````````````````````````````````````````````````````````````````````````````
Detailed log can be found in [3].
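
As a reading aid for the splat: the `{N:N}` annotations appear to be lockdep's wait-type classes. The report seems to fire because the lock being acquired (`ptm_lock`, `{3:3}`) has a higher wait type than the innermost lock already held (`tmreg_lock`, `{2:2}`), which would be consistent with taking a sleeping lock while holding a spinlock. The Python sketch below only parses the quoted log lines (timestamps stripped) and applies that comparison. The function names and the simplified rule are illustrative assumptions, not lockdep's actual implementation.

```python
import re

# Matches lock entries such as "(&adapter->ptm_lock){....}-{3:3}".
LOCK_RE = re.compile(r"\((&?[\w>\-]+)\)\{[^}]*\}-\{(\d+):(\d+)\}")
# Held-lock lines are numbered "#0:", "#1:", ...
HELD_RE = re.compile(r"#\d+:\s")

# Lines copied from the report, with the dmesg prefixes removed.
SPLAT = """\
(udev-worker)/286 is trying to lock:
ffff88811671bcc8 (&adapter->ptm_lock){....}-{3:3}, at: igc_ptp_reset+0x155/0x320 [igc]
2 locks held by (udev-worker)/286:
 #0: ffff888103bd41b0 (&dev->mutex){....}-{3:3}, at: __driver_attach+0x104/0x220
 #1: ffff88811671bb70 (&adapter->tmreg_lock){....}-{2:2}, at: igc_ptp_reset+0x53/0x320 [igc]
"""


def parse_splat(text):
    """Return (wanted, held); each entry is (name, outer_type, inner_type)."""
    wanted, held = None, []
    for line in text.splitlines():
        m = LOCK_RE.search(line)
        if not m:
            continue
        entry = (m.group(1), int(m.group(2)), int(m.group(3)))
        if HELD_RE.search(line):
            held.append(entry)
        else:
            wanted = entry
    return wanted, held


def offending_lock(wanted, held):
    """Assumed rule: flag the held lock with the lowest inner wait type
    if the wanted lock's outer wait type exceeds it."""
    if wanted is None or not held:
        return None
    tightest = min(held, key=lambda e: e[2])
    return tightest if wanted[1] > tightest[2] else None


wanted, held = parse_splat(SPLAT)
blocker = offending_lock(wanted, held)
if blocker:
    print(f"{wanted[0]} (wait type {wanted[1]}) is taken while holding "
          f"{blocker[0]} (wait type {blocker[2]})")
```

If this reading is right, the pair to look at in `igc_ptp_reset()` is the `ptm_lock` acquisition made inside the `tmreg_lock` critical section.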

After bisecting the tree, the following patch [4] appears to be the first "bad"
commit:

`````````````````````````````````````````````````````````````````````````````````````````````````````````
commit 1a931c4f5e6862e61a4b130cb76b422e1415f644
Author: Christopher S M Hall <christopher.s.hall at intel.com>
Date:   Tue Apr 1 16:35:34 2025 -0700

    igc: add lock preventing multiple simultaneous PTM transactions
`````````````````````````````````````````````````````````````````````````````````````````````````````````

We also verified that the issue is no longer seen when the patch is reverted.

Could you please check why the patch causes this regression and provide a fix if necessary?

Thank you.

Regards

Chaitanya

[1] https://intel-gfx-ci.01.org/tree/drm-tip/shard-tglu.html
[2] https://cgit.freedesktop.org/drm-tip/tree/
[3] https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_16443/fi-tgl-1115g4/boot0.txt
[4] https://cgit.freedesktop.org/drm-tip/commit/?id=1a931c4f5e6862e61a4b130cb76b422e1415f644


