[Intel-gfx] ✗ Fi.CI.BAT: failure for Avoid reading OA reports before they land

Umesh Nerlige Ramappa umesh.nerlige.ramappa at intel.com
Wed Jun 7 19:25:03 UTC 2023


On Mon, Jun 05, 2023 at 11:44:21PM +0000, Patchwork wrote:
>   Patch Details
>
>Series:  Avoid reading OA reports before they land
>URL:     https://patchwork.freedesktop.org/series/118886/
>State:   failure
>Details: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118886v1/index.html
>
>          CI Bug Log - changes from CI_DRM_13232 -> Patchwork_118886v1
>
>Summary
>
>   FAILURE
>
>   Serious unknown changes coming with Patchwork_118886v1 absolutely need to
>   be verified manually.
>
>   If you think the reported changes have nothing to do with the changes
>   introduced in Patchwork_118886v1, please notify your bug team to allow
>   them to document this new failure mode, which will reduce false positives
>   in CI.
>
>   External URL:
>   https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_118886v1/index.html
>
>Participating hosts (37 -> 37)
>
>   Additional (1): bat-rpls-2
>   Missing (1): fi-snb-2520m
>
>Possible new issues
>
>   Here are the unknown changes that may have been introduced in
>   Patchwork_118886v1:
>
>  IGT changes
>
>    Possible regressions
>
>igt@i915_selftest@live@gt_timelines:
>
>          * fi-apl-guc: PASS -> DMESG-WARN +2 similar issues

<3> [309.685038] i915 0000:00:02.0: [drm] *ERROR* Failed to probe lspcon

This lspcon probe error is not related to OA or to any code path touched by this patch.

>
>    Warnings
>
>igt@kms_psr@sprite_plane_onoff:
>
>          * bat-rplp-1: SKIP (i915#1072) -> ABORT

+ John

This one is not related to OA either; it is a known lockdep issue:

<4>[  229.036305] ======================================================
<4>[  229.036320] WARNING: possible circular locking dependency detected
<4>[  229.036334] 6.4.0-rc5-Patchwork_118886v1-g450d228e3840+ #1 Not tainted
<4>[  229.036348] ------------------------------------------------------
<4>[  229.036362] kworker/0:0H/8 is trying to acquire lock:
<4>[  229.036374] ffff888117b74f48 (&gt->reset.backoff_srcu){++++}-{0:0}, at: _intel_gt_reset_lock+0x0/0x330 [i915]
<4>[  229.036503] but task is already holding lock:
<4>[  229.036521] ffffc900000d3e60 ((work_completion)(&(&guc->timestamp.work)->work)){+.+.}-{0:0}, at: process_one_work+0x1cc/0x510
<4>[  229.036548] which lock already depends on the new lock.

<4>[  229.036574] the existing dependency chain (in reverse order) is:
<4>[  229.036598] -> #3 ((work_completion)(&(&guc->timestamp.work)->work)){+.+.}-{0:0}:
<4>[  229.036624]        lock_acquire+0xd8/0x2d0
<4>[  229.036636]        __flush_work+0x74/0x530
<4>[  229.036646]        __cancel_work_timer+0x14f/0x1f0
<4>[  229.036658]        intel_guc_submission_reset_prepare+0x81/0x4b0 [i915]
<4>[  229.036799]        intel_uc_reset_prepare+0x9c/0x120 [i915]
<4>[  229.036938]        reset_prepare+0x21/0x60 [i915]
<4>[  229.037054]        intel_gt_reset+0x1dd/0x470 [i915]
<4>[  229.037172]        intel_gt_reset_global+0xfb/0x170 [i915]
<4>[  229.037285]        intel_gt_handle_error+0x368/0x420 [i915]
<4>[  229.037401]        intel_gt_debugfs_reset_store+0x5c/0xc0 [i915]
<4>[  229.037509]        i915_wedged_set+0x29/0x40 [i915]
<4>[  229.037600]        simple_attr_write_xsigned.constprop.0+0xb4/0x110
<4>[  229.037616]        full_proxy_write+0x52/0x80
<4>[  229.037627]        vfs_write+0xc5/0x4f0
<4>[  229.037637]        ksys_write+0x64/0xe0
<4>[  229.037646]        do_syscall_64+0x3c/0x90
<4>[  229.037658]        entry_SYSCALL_64_after_hwframe+0x72/0xdc
<4>[  229.037672] -> #2 (&gt->reset.mutex){+.+.}-{3:3}:
<4>[  229.037694]        lock_acquire+0xd8/0x2d0
<4>[  229.037704]        i915_gem_shrinker_taints_mutex+0x31/0x50 [i915]
<4>[  229.037835]        intel_gt_init_reset+0x65/0x80 [i915]
<4>[  229.037948]        intel_gt_common_init_early+0xe1/0x170 [i915]
<4>[  229.038055]        intel_root_gt_init_early+0x48/0x60 [i915]
<4>[  229.038158]        i915_driver_probe+0x243/0xcd0 [i915]
<4>[  229.038247]        i915_pci_probe+0xdc/0x210 [i915]
<4>[  229.038335]        pci_device_probe+0x95/0x120
<4>[  229.038347]        really_probe+0x164/0x3c0
<4>[  229.038358]        __driver_probe_device+0x73/0x160
<4>[  229.038371]        driver_probe_device+0x19/0xa0
<4>[  229.038384]        __driver_attach+0xb6/0x180
<4>[  229.038395]        bus_for_each_dev+0x77/0xd0
<4>[  229.038405]        bus_add_driver+0x114/0x210
<4>[  229.038415]        driver_register+0x5b/0x110
<4>[  229.038425]        0xffffffffa00fd033
<4>[  229.038439]        do_one_initcall+0x57/0x270
<4>[  229.038450]        do_init_module+0x5f/0x220
<4>[  229.038461]        load_module+0x1ca4/0x1f00
<4>[  229.038472]        __do_sys_finit_module+0xb4/0x130
<4>[  229.038484]        do_syscall_64+0x3c/0x90
<4>[  229.038495]        entry_SYSCALL_64_after_hwframe+0x72/0xdc
<4>[  229.038508] -> #1 (fs_reclaim){+.+.}-{0:0}:
<4>[  229.038528]        lock_acquire+0xd8/0x2d0
<4>[  229.038538]        fs_reclaim_acquire+0xac/0xe0
<4>[  229.038550]        __kmem_cache_alloc_node+0x30/0x1b0
<4>[  229.038563]        kmalloc_trace+0x24/0xb0
<4>[  229.039296]        kernfs_fop_open+0xc0/0x3d0
<4>[  229.040028]        do_dentry_open+0x14a/0x440
<4>[  229.040754]        path_openat+0x663/0x8a0
<4>[  229.041480]        do_filp_open+0xb1/0x120
<4>[  229.042030]        do_sys_openat2+0x250/0x330
<4>[  229.042545]        do_sys_open+0x43/0x80
<4>[  229.043107]        do_syscall_64+0x3c/0x90
<4>[  229.043665]        entry_SYSCALL_64_after_hwframe+0x72/0xdc
<4>[  229.044221] -> #0 (/-1493934552){...+}-{0:0}:
<1>[  229.045307] BUG: kernel NULL pointer dereference, address: 0000000000000014
<1>[  229.045852] #PF: supervisor read access in kernel mode
<1>[  229.046390] #PF: error_code(0x0000) - not-present page
<6>[  229.046922] PGD 0 P4D 0
<4>[  229.047460] Oops: 0000 [#1] PREEMPT SMP NOPTI
<4>[  229.048034] CPU: 0 PID: 8 Comm: kworker/0:0H Not tainted 6.4.0-rc5-Patchwork_118886v1-g450d228e3840+ #1
<4>[  229.048629] Hardware name: Intel Corporation Raptor Lake Client Platform/RaptorLake-P LP5 RVP, BIOS RPLPFWI1.R00.3257.A00.2207020323 07/02/2022
<4>[  229.049233] Workqueue: events_highpri guc_timestamp_ping [i915]
<4>[  229.049965] RIP: 0010:print_circular_bug_entry.isra.0+0x44/0x50
<4>[  229.050571] Code: 53 48 89 f3 89 d6 e8 5b 74 01 00 48 8b 7d 00 e8 d2 f3 ff ff 48 c7 c7 65 21 3c 82 e8 46 74 01 00 48 8b 3b ba 06 00 00 00 5b 5d <8b> 77 14 48 83 c7 18 e9 50 d6 04 00 90 90 90 90 90 90 90 90 90 90
<4>[  229.051206] RSP: 0018:ffffc900000d3b68 EFLAGS: 00010046
<4>[  229.051853] RAX: 0000000000000001 RBX: ffff888100d9b3f0 RCX: 0000000000000000
<4>[  229.052506] RDX: 0000000000000006 RSI: ffffffff823ccb57 RDI: 0000000000000000
<4>[  229.053151] RBP: ffff888100d9b3c8 R08: 0000000000000000 R09: ffffc900000d3a10
<4>[  229.053794] R10: 000000000024fd38 R11: 000000000024fda8 R12: 0000000000000000
<4>[  229.054443] R13: ffffc9000256fd00 R14: ffff888100d9a9c0 R15: ffffffff83f8fd40
<4>[  229.055094] FS:  0000000000000000(0000) GS:ffff8882a7000000(0000) knlGS:0000000000000000
<4>[  229.055753] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  229.056409] CR2: 0000000000000014 CR3: 00000001095b2000 CR4: 0000000000f50ef0
<4>[  229.057069] PKRU: 55555554
<4>[  229.057727] Call Trace:
<4>[  229.058378]  <TASK>
<4>[  229.059023]  ? __die_body+0x1a/0x60
<4>[  229.059671]  ? page_fault_oops+0x156/0x450
<4>[  229.060319]  ? do_user_addr_fault+0x65/0xa10
<4>[  229.060976]  ? exc_page_fault+0x68/0x1a0
<4>[  229.061629]  ? asm_exc_page_fault+0x26/0x30
<4>[  229.062281]  ? print_circular_bug_entry.isra.0+0x44/0x50
<4>[  229.062926]  print_circular_bug.isra.0+0x111/0x3f0
<4>[  229.063536]  check_noncircular+0x131/0x150
<4>[  229.064154]  ? arch_stack_walk+0x87/0xf0
<4>[  229.064759]  check_prev_add+0x90/0xc60
<4>[  229.065363]  __lock_acquire+0x19a3/0x25a0
<4>[  229.065966]  ? startup_64_setup_env+0x184/0xaf0
<4>[  229.066568]  lock_acquire+0xd8/0x2d0
<4>[  229.067173]  ? __pfx__intel_gt_reset_lock+0x10/0x10 [i915]
<4>[  229.067881]  _intel_gt_reset_lock+0x57/0x330 [i915]
<4>[  229.068586]  ? __pfx__intel_gt_reset_lock+0x10/0x10 [i915]
<4>[  229.069288]  guc_timestamp_ping+0x35/0x130 [i915]
<4>[  229.070018]  process_one_work+0x250/0x510
<4>[  229.070629]  worker_thread+0x4f/0x3a0
<4>[  229.071235]  ? __pfx_worker_thread+0x10/0x10
<4>[  229.071845]  kthread+0xff/0x130
<4>[  229.072454]  ? __pfx_kthread+0x10/0x10
<4>[  229.073064]  ret_from_fork+0x29/0x50
<4>[  229.073674]  </TASK>
<4>[  229.074283] Modules linked in: vgem drm_shmem_helper snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm i915 prime_numbers i2c_algo_bit ttm drm_buddy drm_display_helper drm_kms_helper fuse r8153_ecm cdc_ether usbnet x86_pkg_temp_thermal coretemp kvm_intel kvm e1000e mei_pxp mei_hdcp r8152 irqbypass crct10dif_pclmul crc32_pclmul wmi_bmof mii ghash_clmulni_intel mei_me ptp i2c_i801 mei pps_core i2c_smbus video intel_lpss_pci wmi
<4>[  229.075708] CR2: 0000000000000014
<4>[  229.076421] ---[ end trace 0000000000000000 ]---
<4>[  229.373071] RIP: 0010:print_circular_bug_entry.isra.0+0x44/0x50
<4>[  229.373942] Code: 53 48 89 f3 89 d6 e8 5b 74 01 00 48 8b 7d 00 e8 d2 f3 ff ff 48 c7 c7 65 21 3c 82 e8 46 74 01 00 48 8b 3b ba 06 00 00 00 5b 5d <8b> 77 14 48 83 c7 18 e9 50 d6 04 00 90 90 90 90 90 90 90 90 90 90
<4>[  229.374830] RSP: 0018:ffffc900000d3b68 EFLAGS: 00010046
<4>[  229.375578] RAX: 0000000000000001 RBX: ffff888100d9b3f0 RCX: 0000000000000000
<4>[  229.376235] RDX: 0000000000000006 RSI: ffffffff823ccb57 RDI: 0000000000000000
<4>[  229.376927] RBP: ffff888100d9b3c8 R08: 0000000000000000 R09: ffffc900000d3a10
<4>[  229.377649] R10: 000000000024fd38 R11: 000000000024fda8 R12: 0000000000000000
<4>[  229.378373] R13: ffffc9000256fd00 R14: ffff888100d9a9c0 R15: ffffffff83f8fd40
<4>[  229.379100] FS:  0000000000000000(0000) GS:ffff8882a7000000(0000) knlGS:0000000000000000
<4>[  229.379838] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  229.380578] CR2: 0000000000000014 CR3: 00000001095b2000 CR4: 0000000000f50ef0
<4>[  229.381331] PKRU: 55555554
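
For triage, the dependency chain in the splat above can be pulled out of the dmesg mechanically. This is only a minimal sketch, assuming the log is available as plain text; the function name lockdep_chain, the regex, and the SAMPLE string are illustrative and not part of any CI tooling.

```python
import re

# Matches lockdep chain headers such as:
#   <4>[  229.037672] -> #2 (&gt->reset.mutex){+.+.}-{3:3}:
# Group 1 is the chain index, group 2 is the lock class name as printed.
CHAIN_RE = re.compile(r"-> #(\d+) \((.+)\)\{[^}]*\}-\{[^}]*\}:\s*$")

# Chain header lines copied from the log above (stack frames omitted).
SAMPLE = """\
<4>[  229.036598] -> #3 ((work_completion)(&(&guc->timestamp.work)->work)){+.+.}-{0:0}:
<4>[  229.036624]        lock_acquire+0xd8/0x2d0
<4>[  229.037672] -> #2 (&gt->reset.mutex){+.+.}-{3:3}:
<4>[  229.038508] -> #1 (fs_reclaim){+.+.}-{0:0}:
<4>[  229.044221] -> #0 (/-1493934552){...+}-{0:0}:
"""


def lockdep_chain(dmesg_text):
    """Return [(index, lock_name), ...] in the order lockdep printed them."""
    chain = []
    for line in dmesg_text.splitlines():
        match = CHAIN_RE.search(line)
        if match:
            chain.append((int(match.group(1)), match.group(2)))
    return chain


if __name__ == "__main__":
    for index, name in lockdep_chain(SAMPLE):
        print(f"#{index}: {name}")
```

Run against the log above, the #0 entry comes back with the name "/-1493934552" rather than a recognizable lock class, and that entry is immediately followed by the NULL pointer dereference in print_circular_bug_entry.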


>
>Known issues
>

