Re: ✗ Xe.CI.Full: failure for SR-IOV promotions
Michal Wajdeczko
michal.wajdeczko at intel.com
Wed Jul 23 14:38:28 UTC 2025
On 7/23/2025 3:25 AM, Patchwork wrote:
> *Patch Details*
> *Series:* SR-IOV promotions
> *URL:* https://patchwork.freedesktop.org/series/151961/
> *State:* failure
> *Details:* https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-151961v1/index.html
>
>
> CI Bug Log - changes from xe-3458-1bb52fd2f94e5c605adf88a086ab0eadcdd708e3_FULL -> xe-pw-151961v1_FULL
>
>
> Summary
>
> *FAILURE*
>
> Serious unknown changes coming with xe-pw-151961v1_FULL absolutely need to be
> verified manually.
>
> If you think the reported changes have nothing to do with the changes
> introduced in xe-pw-151961v1_FULL, please notify your bug team (I915-ci-infra at lists.freedesktop.org) to allow them
> to document this new failure mode, which will reduce false positives in CI.
>
>
> Participating hosts (4 -> 4)
>
> No changes in participating hosts
>
>
> Possible new issues
>
> Here are the unknown changes that may have been introduced in xe-pw-151961v1_FULL:
>
>
> IGT changes
>
>
> Possible regressions
>
> *
>
> igt@xe_exec_threads@threads-cm-fd-userptr-invalidate:
>
> o shard-adlp: PASS <https://intel-gfx-ci.01.org/tree/intel-xe/xe-3458-1bb52fd2f94e5c605adf88a086ab0eadcdd708e3/shard-adlp-3/igt@xe_exec_threads@threads-cm-fd-userptr-invalidate.html> -> FAIL <https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-151961v1/shard-adlp-1/igt@xe_exec_threads@threads-cm-fd-userptr-invalidate.html> +1 other test fail
Unrelated, this looks like a sporadic failure:
<7> [301.708094] xe 0000:00:02.0: [drm:xe_guc_exec_queue_memory_cat_error_handler [xe]] GT0: Engine memory CAT error: class=vcs, logical_mask: 0x1, guc_id=6
<6> [301.708849] xe 0000:00:02.0: [drm] GT0: Engine reset: engine_class=vcs, logical_mask: 0x1, guc_id=6
<7> [301.708957] xe 0000:00:02.0: [drm:xe_devcoredump [xe]] Multiple hangs are occurring, but only the first snapshot was taken
> *
>
> igt@xe_fault_injection@inject-fault-probe-function-xe_guc_log_init:
>
> o shard-bmg: PASS <https://intel-gfx-ci.01.org/tree/intel-xe/xe-3458-1bb52fd2f94e5c605adf88a086ab0eadcdd708e3/shard-bmg-2/igt@xe_fault_injection@inject-fault-probe-function-xe_guc_log_init.html> -> ABORT <https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-151961v1/shard-bmg-8/igt@xe_fault_injection@inject-fault-probe-function-xe_guc_log_init.html>
>
Unrelated, this looks like a sporadic failure, previously tracked as Xe Bug #3084:
<4> [265.225514] pci 0000:03:00.0: [drm] Assertion `ret` failed!
platform: BATTLEMAGE subplatform: 1
graphics: Xe2_HPG 20.01 step A0
media: Xe2_HPM 13.01 step A1
tile: 0 VRAM 12.0 GiB
GT: 0 type 1
<4> [265.225582] WARNING: CPU: 0 PID: 8409 at drivers/gpu/drm/xe/xe_guc_submit.c:242 guc_submit_fini+0x2b6/0x2d0 [xe]
...
<3> [265.227376] pci 0000:03:00.0: [drm] *ERROR* GT0: GUC ID manager unclean (1/65535)
<6> [265.228262] pci 0000:03:00.0: [drm] GT0: total 65535
<6> [265.228272] pci 0000:03:00.0: [drm] GT0: used 1
<6> [265.228278] pci 0000:03:00.0: [drm] GT0: range 5..5 (1)
...
<3> [265.233717] [drm:drm_mm_takedown] *ERROR* node [0070d000 + 00007000]: inserted at
drm_mm_insert_node_in_range+0x2b4/0x520
__xe_ggtt_insert_bo_at+0x159/0x4b0 [xe]
xe_ggtt_insert_bo+0x17/0x30 [xe]
__xe_bo_create_locked+0x2a5/0x620 [xe]
xe_bo_create_pin_map_at_aligned+0x49/0x1e0 [xe]
xe_bo_create_pin_map+0x1c/0x30 [xe]
xe_lrc_create+0x178/0x1990 [xe]
xe_exec_queue_create+0x339/0x520 [xe]
xe_exec_queue_create_ioctl+0xb2f/0xc90 [xe]
>
> Warnings
>
> * igt@xe_pm@s2idle-d3cold-basic-exec:
> o shard-adlp: SKIP <https://intel-gfx-ci.01.org/tree/intel-xe/xe-3458-1bb52fd2f94e5c605adf88a086ab0eadcdd708e3/shard-adlp-3/igt@xe_pm@s2idle-d3cold-basic-exec.html> (Intel XE#2284 <https://gitlab.freedesktop.org/drm/xe/kernel/issues/2284> / Intel XE#366 <https://gitlab.freedesktop.org/drm/xe/kernel/issues/366>) -> ABORT <https://intel-gfx-ci.01.org/tree/intel-xe/xe-pw-151961v1/shard-adlp-1/igt@xe_pm@s2idle-d3cold-basic-exec.html>
Unrelated, it looks like we lost MMIO access:
<3> [321.997510] xe 0000:00:02.0: [drm] *ERROR* GT0: GuC mmio request 0x5507: no reply 0xffff
<3> [321.998641] xe 0000:00:02.0: [drm] *ERROR* GT0: GuC suspend failed: -ETIMEDOUT
<3> [321.998721] xe 0000:00:02.0: [drm] *ERROR* GT0: suspend failed (-ETIMEDOUT)
<3> [321.999159] xe 0000:00:02.0: can't suspend (xe_pci_runtime_suspend [xe] returned -110)
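
For anyone triaging similar reports, the log excerpts above all follow the "<level> [timestamp] message" layout, so they can be filtered by severity with a few lines of script. The following is only a minimal sketch: the helper names (parse_line, errors_only) are made up for illustration, and it assumes the standard syslog numbering of kernel log levels (3 = err, 4 = warning, 6 = info, 7 = debug), not any CI tooling.

```python
import re

# Standard syslog-style kernel log levels, indexed by their numeric value.
LEVELS = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# Matches lines such as "<3> [321.997510] xe 0000:00:02.0: ...".
LINE_RE = re.compile(r"^<(?P<level>[0-7])>\s*\[\s*(?P<ts>\d+\.\d+)\]\s*(?P<msg>.*)$")


def parse_line(line):
    """Return (level_name, timestamp, message), or None for non-log lines."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return LEVELS[int(m.group("level"))], float(m.group("ts")), m.group("msg")


def errors_only(lines):
    """Keep only entries at level 'err' (3) or more severe."""
    out = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and LEVELS.index(parsed[0]) <= 3:
            out.append(parsed)
    return out
```

Feeding it the excerpts quoted in this mail would keep the "<3>" lines (for example the GuC mmio timeout) and drop the "<6>"/"<7>" lines, the "..." elisions and the backtrace frames.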