Re: ✗ Xe.CI.Full: failure for Update "force_reset" code (rev3)
Michal Wajdeczko
michal.wajdeczko at intel.com
Fri May 30 14:27:08 UTC 2025
On 30.05.2025 11:31, Patchwork wrote:
> == Series Details ==
>
> Series: Update "force_reset" code (rev3)
> URL : https://patchwork.freedesktop.org/series/149607/
> State : failure
>
> == Summary ==
>
> CI Bug Log - changes from XEIGT_8384_FULL -> XEIGTPW_13211_FULL
> ====================================================
>
> Summary
> -------
>
> **FAILURE**
>
> Serious unknown changes coming with XEIGTPW_13211_FULL absolutely need to be
> verified manually.
>
> If you think the reported changes have nothing to do with the changes
> introduced in XEIGTPW_13211_FULL, please notify your bug team (I915-ci-infra at lists.freedesktop.org) to allow them
> to document this new failure mode, which will reduce false positives in CI.
>
>
>
> Participating hosts (4 -> 3)
> ------------------------------
>
> Missing (1): shard-adlp
>
> Possible new issues
> -------------------
>
> Here are the unknown changes that may have been introduced in XEIGTPW_13211_FULL:
>
> ### IGT changes ###
>
> #### Possible regressions ####
>
> * igt@xe_exec_reset@parallel-gt-reset:
> - shard-bmg: [PASS][1] -> [DMESG-WARN][2]
> [1]: https://intel-gfx-ci.01.org/tree/intel-xe/IGT_8384/shard-bmg-4/igt@xe_exec_reset@parallel-gt-reset.html
> [2]: https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_13211/shard-bmg-1/igt@xe_exec_reset@parallel-gt-reset.html
hmm, quite unexpected
<7>[ 307.897676] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]]
ASID: 101
VFID: 0
PDATA: 0x0450
Faulted Address: 0x00007ba9d02b7000
FaultType: 0
AccessType: 1
FaultLevel: 3
EngineClass: 1 vcs
EngineInstance: 0
<7>[ 307.898010] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]] Fault
response: Unsuccessful -22
<6>[ 307.898235] xe 0000:03:00.0: [drm] GT1: reset done
<7>[ 307.898301] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]]
ASID: 101
VFID: 0
PDATA: 0x0451
Faulted Address: 0x0000793b1cacc000
FaultType: 0
AccessType: 0
FaultLevel: 3
EngineClass: 1 vcs
EngineInstance: 2
<7>[ 307.898523] xe 0000:03:00.0: [drm:xe_hw_engine_snapshot_capture
[xe]] GT1: Proceeding with manual engine snapshot
<7>[ 307.898598] xe 0000:03:00.0: [drm:pf_queue_work_func [xe]] Fault
response: Unsuccessful -22
<7>[ 307.898620] xe 0000:03:00.0:
[drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT1: Engine memory
cat error: engine_class=vcs, logical_mask: 0x1, guc_id=2
<7>[ 307.899910] xe 0000:03:00.0:
[drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT1: Engine memory
cat error: engine_class=vcs, logical_mask: 0x1, guc_id=3
given that the only difference from this patch is how we trigger
the reset; before it was "show" ops:
<6> [207.735904] [IGT] xe_exec_reset: starting subtest parallel-gt-reset
<6> [207.773630] xe 0000:03:00.0: [drm] GT1: trying reset from
force_reset_show [xe]
now it's "write" ops:
<6> [307.847469] [IGT] xe_exec_reset: starting subtest parallel-gt-reset
<6> [307.873879] xe 0000:03:00.0: [drm] GT1: trying reset from
force_reset_write [xe]
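
To compare how the reset was triggered in the two runs, the caller name can be pulled out of the "trying reset from" lines quoted above. A minimal Python sketch; the regular expression and the helper name `reset_trigger` are only for illustration, and it assumes the message keeps this format and that the wrapped log lines have been joined back together:

```python
import re

# Matches e.g. "GT1: trying reset from force_reset_show [xe]".
# Assumption: the wrapped dmesg line has been rejoined into one line.
RESET_RE = re.compile(r"GT(\d+): trying reset from (\w+)")

def reset_trigger(line):
    """Return (gt_id, caller) for a 'trying reset from' line, else None."""
    m = RESET_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), m.group(2)

# The two lines quoted in this mail, rejoined.
before = ("<6> [207.773630] xe 0000:03:00.0: [drm] GT1: "
          "trying reset from force_reset_show [xe]")
after = ("<6> [307.873879] xe 0000:03:00.0: [drm] GT1: "
         "trying reset from force_reset_write [xe]")

print(reset_trigger(before))
print(reset_trigger(after))
```

Running this over the full dmesg of both runs would show whether anything besides the trigger function differs around the reset.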
>
> * igt@xe_pm@s4-d3hot-basic-exec:
> - shard-bmg: [PASS][3] -> [ABORT][4]
> [3]: https://intel-gfx-ci.01.org/tree/intel-xe/IGT_8384/shard-bmg-6/igt@xe_pm@s4-d3hot-basic-exec.html
> [4]: https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_13211/shard-bmg-6/igt@xe_pm@s4-d3hot-basic-exec.html
unrelated to the xe driver, the lockdep splat comes from r8152
(rtl8152_resume):
<4> [373.000446] ======================================================
<4> [373.000460] WARNING: possible circular locking dependency detected
<4> [373.000475] 6.15.0-xe+ #1 Tainted: G U N
<4> [373.000490] ------------------------------------------------------
<4> [373.000503] kworker/u64:66/5057 is trying to acquire lock:
<4> [373.000518] ffffffff838b45a8 (rtnl_mutex){+.+.}-{3:3}, at:
rtnl_lock+0x17/0x30
<4> [373.000559]
but task is already holding lock:
<4> [373.000572] ffff8881149a3438 (&tp->control){+.+.}-{3:3}, at:
rtl8152_resume+0x26/0xd0 [r8152]
<4> [373.000612]
which lock already depends on the new lock.
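
A quick way to double check which modules show up in a splat like this is to collect the `[module]` tags that follow the `symbol+offset/size` entries. A rough Python sketch; the helper name `modules_in_splat` is made up for illustration, it only looks at the excerpt quoted above (not the full log), and built-in symbols such as rtnl_lock carry no tag so they are not listed:

```python
import re

# "symbol+0xOFF/0xSIZE [module]" as printed in kernel lockdep reports.
SYM_MOD_RE = re.compile(r"\w+\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]")

def modules_in_splat(text):
    """Return the set of module names tagged on symbols in the text."""
    return set(SYM_MOD_RE.findall(text))

# The two lock lines from the excerpt, with the wrapped lines rejoined.
splat = (
    "ffffffff838b45a8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x17/0x30\n"
    "ffff8881149a3438 (&tp->control){+.+.}-{3:3}, at: "
    "rtl8152_resume+0x26/0xd0 [r8152]\n"
)

mods = modules_in_splat(splat)
print(sorted(mods), "xe" in mods)
```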