✗ CI.xeBAT: failure for runner: Allow dynamically ignore dmesg errors or warnings (rev3)

Kamil Konieczny kamil.konieczny at linux.intel.com
Thu Aug 8 13:25:44 UTC 2024


Hi igt-dev,
On 2024-08-08 at 08:19:33 -0000, Patchwork wrote:
> == Series Details ==
> 
> Series: runner: Allow dynamically ignore dmesg errors or warnings (rev3)
> URL   : https://patchwork.freedesktop.org/series/136494/
> State : failure
> 
> == Summary ==
> 
> CI Bug Log - changes from XEIGT_7962_BAT -> XEIGTPW_11543_BAT
> ====================================================
> 
> Summary
> -------
> 
>   **FAILURE**
> 
>   Serious unknown changes coming with XEIGTPW_11543_BAT absolutely need to be
>   verified manually.
>   
>   If you think the reported changes have nothing to do with the changes
>   introduced in XEIGTPW_11543_BAT, please notify your bug team (I915-ci-infra at lists.freedesktop.org) to allow them
>   to document this new failure mode, which will reduce false positives in CI.
> 
>   
> 
> Participating hosts (7 -> 6)
> ------------------------------
> 
>   Missing    (1): bat-lnl-1 
> 
> Possible new issues
> -------------------
> 
>   Here are the unknown changes that may have been introduced in XEIGTPW_11543_BAT:
> 
> ### IGT changes ###
> 
> #### Possible regressions ####
> 
>   * igt at kms_pipe_crc_basic@compare-crc-sanitycheck-xr24:
>     - bat-adlp-7:         [PASS][1] -> [INCOMPLETE][2] +5 other tests incomplete
>    [1]: https://intel-gfx-ci.01.org/tree/intel-xe/IGT_7962/bat-adlp-7/igt@kms_pipe_crc_basic@compare-crc-sanitycheck-xr24.html
>    [2]: https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_11543/bat-adlp-7/igt@kms_pipe_crc_basic@compare-crc-sanitycheck-xr24.html
> 
>   * igt at kms_pipe_crc_basic@hang-read-crc at pipe-a-edp-1:
>     - bat-adlp-7:         [PASS][3] -> [DMESG-WARN][4] +2 other tests dmesg-warn
>    [3]: https://intel-gfx-ci.01.org/tree/intel-xe/IGT_7962/bat-adlp-7/igt@kms_pipe_crc_basic@hang-read-crc@pipe-a-edp-1.html
>    [4]: https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_11543/bat-adlp-7/igt@kms_pipe_crc_basic@hang-read-crc@pipe-a-edp-1.html
> 
>   * igt at xe_live_ktest@xe_migrate:
>     - bat-pvc-2:          [PASS][5] -> [SKIP][6] +2 other tests skip
>    [5]: https://intel-gfx-ci.01.org/tree/intel-xe/IGT_7962/bat-pvc-2/igt@xe_live_ktest@xe_migrate.html
>    [6]: https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_11543/bat-pvc-2/igt@xe_live_ktest@xe_migrate.html
>     - bat-atsm-2:         [PASS][7] -> [SKIP][8] +2 other tests skip
>    [7]: https://intel-gfx-ci.01.org/tree/intel-xe/IGT_7962/bat-atsm-2/igt@xe_live_ktest@xe_migrate.html
>    [8]: https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_11543/bat-atsm-2/igt@xe_live_ktest@xe_migrate.html
> 
>   * igt at xe_module_load@load:
>     - bat-dg2-oem2:       [PASS][9] -> [FAIL][10]
>    [9]: https://intel-gfx-ci.01.org/tree/intel-xe/IGT_7962/bat-dg2-oem2/igt@xe_module_load@load.html
>    [10]: https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_11543/bat-dg2-oem2/igt@xe_module_load@load.html
> 

The failures above are unrelated to this series.

>   * igt at xe_wedged@basic-wedged:
>     - bat-atsm-2:         NOTRUN -> [DMESG-WARN][11] +1 other test dmesg-warn
>    [11]: https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_11543/bat-atsm-2/igt@xe_wedged@basic-wedged.html


This looks like a bug in the rc2 kernel or in the xe driver:

<3> [765.287417] xe 0000:4d:00.0: [drm] *ERROR* GT0: reset failed (-ECANCELED)
<4> [765.294872] ------------[ cut here ]------------
<4> [765.294881] xe 0000:4d:00.0: [drm] Missing outer runtime PM protection
<4> [765.294945] WARNING: CPU: 18 PID: 11 at drivers/gpu/drm/xe/xe_pm.c:559 xe_pm_runtime_get_noresume+0x48/0x60 [xe]

> 
>   * igt at xe_wedged@wedged-at-any-timeout:
>     - bat-adlp-vf:        NOTRUN -> [ABORT][12]
>    [12]: https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_11543/bat-adlp-vf/igt@xe_wedged@wedged-at-any-timeout.html
> 

This also looks like a bug, this time in
xe_guc_ads_scheduler_policy_toggle_reset():

<1> [810.005232] BUG: kernel NULL pointer dereference, address: 00000000000003c0
<1> [810.005236] #PF: supervisor read access in kernel mode
<1> [810.005239] #PF: error_code(0x0000) - not-present page
<6> [810.005241] PGD 0 P4D 0 
<4> [810.005243] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
<4> [810.005246] CPU: 0 UID: 0 PID: 1739 Comm: xe_wedged Tainted: G     U  W          6.11.0-rc2-xe #1
<4> [810.005250] Tainted: [U]=USER, [W]=WARN
<4> [810.005251] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR5 RVP, BIOS RPLPFWI1.R00.4035.A00.2301200723 01/20/2023
<4> [810.005254] RIP: 0010:xe_guc_ads_scheduler_policy_toggle_reset+0x67/0x1f0 [xe]
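
For anyone reading the excerpts above: the leading <N> is the kernel log
level (0 = emerg ... 7 = debug), so <3> is an error, <4> a warning and <1>
an alert. Below is a rough sketch of how such lines could be sorted by
level. It is only an illustration, not how igt_runner does it; the names
parse_line and flagged and the default threshold of 4 are my own
assumptions.

```python
import re

# Kernel log levels as they appear in the "<N>" prefix (0 = emerg ... 7 = debug).
LEVELS = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# Matches lines such as:
#   <4> [765.294881] xe 0000:4d:00.0: [drm] Missing outer runtime PM protection
LINE_RE = re.compile(r"^<([0-7])>\s*\[\s*(\d+\.\d+)\]\s?(.*)$")


def parse_line(line):
    """Return (level, timestamp, message), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return int(m.group(1)), float(m.group(2)), m.group(3)


def flagged(lines, max_level=4):
    """Collect lines at max_level or more severe (a lower number is more severe).

    The default of 4 (warning) is an assumption made for this sketch.
    """
    hits = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        level, timestamp, message = parsed
        if level <= max_level:
            hits.append((LEVELS[level], timestamp, message))
    return hits
```

With the lines quoted above, the <1>, <3> and <4> entries would be collected
and the <6> "PGD 0 P4D 0" line would be skipped.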

Regards,
Kamil

>   
> Known issues
> ------------
> 
>   Here are the changes found in XEIGTPW_11543_BAT that come from known issues:
> 
> ### IGT changes ###
> 
> #### Issues hit ####
> 
>   * igt at core_hotunplug@unbind-rebind:
>     - bat-dg2-oem2:       [PASS][13] -> [SKIP][14] ([Intel XE#1885])
>    [13]: https://intel-gfx-ci.01.org/tree/intel-xe/IGT_7962/bat-dg2-oem2/igt@core_hotunplug@unbind-rebind.html
>    [14]: https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_11543/bat-dg2-oem2/igt@core_hotunplug@unbind-rebind.html
> 
>   * igt at fbdev@eof:
>     - bat-dg2-oem2:       [PASS][15] -> [SKIP][16] ([Intel XE#2134]) +4 other tests skip
>    [15]: https://intel-gfx-ci.01.org/tree/intel-xe/IGT_7962/bat-dg2-oem2/igt@fbdev@eof.html
>    [16]: https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_11543/bat-dg2-oem2/igt@fbdev@eof.html

...cut...

>   [Intel XE#929]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/929
>   [Intel XE#977]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/977
>   [Intel XE#979]: https://gitlab.freedesktop.org/drm/xe/kernel/issues/979
>   [i915#2575]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/2575
>   [i915#5274]: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/5274
> 
> 
> Build changes
> -------------
> 
>   * IGT: IGT_7962 -> IGTPW_11543
> 
>   IGTPW_11543: 7cdda8ee1d271299366792ab59a3dc72e436fe27 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
>   IGT_7962: 7962
>   xe-1731-4f5d551409fb5562ab0d732120e7ac9b698b5864: 4f5d551409fb5562ab0d732120e7ac9b698b5864
> 
> == Logs ==
> 
> For more details see: https://intel-gfx-ci.01.org/tree/intel-xe/IGTPW_11543/index.html

