[PATCH] drm/xe: Use xe_pm_runtime_get() in xe_ggtt_remove_node()
Rodrigo Vivi
rodrigo.vivi at intel.com
Thu Jul 11 20:14:43 UTC 2024
On Thu, Jul 11, 2024 at 01:00:31PM -0700, José Roberto de Souza wrote:
> I don't see a relationship between drm_dev_enter() and pm_runtime.
> A plugged device could still have no one holding a PM refcount.
>
> And this is being triggered from ttm_bo_delayed_delete(), and I can't
> see anyone in the call chain taking a runtime PM reference before
> xe_ggtt_remove_node(), so replace xe_pm_runtime_get_noresume() with
> xe_pm_runtime_get() here.
>
> This change will probably fix the kernel warning below:
It will remove this warning, but it will create a lockdep splat instead.
The right solution is this series:
https://lore.kernel.org/intel-xe/20240711171155.173717-12-rodrigo.vivi@intel.com/T/#u
Please help with review there.
>
> ------------[ cut here ]------------
> xe 0000:4d:00.0: [drm] Missing outer runtime PM protection
> WARNING: CPU: 100 PID: 3524 at drivers/gpu/drm/xe/xe_pm.c:551 xe_pm_runtime_get_noresume+0x48/0x60 [xe]
> Modules linked in: snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore mei_gsc xe drm_gpuvm video drm_ttm_helper ttm gpu_sched drm_suballoc_helper drm_exec drm_display_helper drm_kunit_helpers kunit drm_buddy intel_rapl_msr intel_rapl_common cmdlinepart spi_nor mtd intel_uncore_frequency intel_uncore_frequency_common i10nm_edac nls_iso8859_1 nfit x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel crypto_simd cryptd rndis_host ast cdc_ether i2c_algo_bit dm_multipath dax_hmem i40e ixgbe scsi_dh_rdac drm_shmem_helper usbnet mei_me cxl_acpi scsi_dh_emc rapl scsi_dh_alua mii drm_kms_helper intel_cstate mdio cxl_core e1000e libie efi_pstore i2c_i801 intel_pch_thermal spi_intel_pci mei isst_if_mbox_pci i2c_smbus isst_if_mmio spi_intel isst_if_common intel_th_gth intel_th_pci ipmi_ssif ioatdma intel_vsec intel_th dca wmi ipmi_si
> acpi_power_meter acpi_ipmi ipmi_devintf acpi_pad ipmi_msghandler mac_hid sch_fq_codel msr parport_pc ppdev lp parport drm ip_tables x_tables autofs4
> CPU: 100 PID: 3524 Comm: kworker/u580:4 Not tainted 6.10.0-rc5-xe #1
> Hardware name: Intel Corporation WHITLEY/WHITLEY, BIOS SE5C6200.86B.0027.P15.2205121306 05/12/2022
> Workqueue: ttm ttm_bo_delayed_delete [ttm]
> RIP: 0010:xe_pm_runtime_get_noresume+0x48/0x60 [xe]
> Code: cc cc cc 48 8b 7b 08 4c 8b 67 50 4d 85 e4 75 03 4c 8b 27 e8 aa bd f4 e0 4c 89 e2 48 c7 c7 d8 1a 03 a1 48 89 c6 e8 08 b7 32 e0 <0f> 0b 48 8b 43 08 f0 ff 80 f8 02 00 00 5b 41 5c 5d c3 cc cc cc cc
> RSP: 0000:ffa00000225afc00 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: ff1100014c510000 RCX: 0000000000000027
> RDX: 0000000000000027 RSI: 0000000000000000 RDI: ff1100103fe31a48
> RBP: ffa00000225afc10 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000001 R11: 632da25ec9e647d2 R12: ff1100011d93b710
> R13: ff1100016cf6c448 R14: 0000000000000001 R15: ff1100014c510000
> FS: 0000000000000000(0000) GS:ff1100103fe00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f4d34e18198 CR3: 000000000aa54006 CR4: 0000000000771ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
> <TASK>
> ? show_regs+0x67/0x70
> ? __warn+0x94/0x1b0
> ? xe_pm_runtime_get_noresume+0x48/0x60 [xe]
> ? report_bug+0x1b7/0x1d0
> ? handle_bug+0x46/0x80
> ? exc_invalid_op+0x19/0x70
> ? asm_exc_invalid_op+0x1b/0x20
> ? xe_pm_runtime_get_noresume+0x48/0x60 [xe]
> xe_ggtt_remove_node+0x99/0x110 [xe]
> xe_ggtt_remove_bo+0x59/0x1d0 [xe]
> ? _raw_write_unlock+0x23/0x50
> ? drm_vma_offset_remove+0x66/0x80 [drm]
> xe_ttm_bo_destroy+0x135/0x230 [xe]
> ttm_bo_release+0x6e/0x320 [ttm]
> ttm_bo_delayed_delete+0x82/0xa0 [ttm]
> process_scheduled_works+0x3aa/0x750
> worker_thread+0x14f/0x2f0
> ? __pfx_worker_thread+0x10/0x10
> kthread+0xf5/0x130
> ? __pfx_kthread+0x10/0x10
> ret_from_fork+0x39/0x60
> ? __pfx_kthread+0x10/0x10
> ret_from_fork_asm+0x1a/0x30
> </TASK>
> irq event stamp: 26249
> hardirqs last enabled at (26255): [<ffffffff811b8f51>] vprintk_emit+0x351/0x360
> hardirqs last disabled at (26260): [<ffffffff811b8f33>] vprintk_emit+0x333/0x360
> softirqs last enabled at (25326): [<ffffffff810f04bf>] handle_softirqs+0x30f/0x430
> softirqs last disabled at (25319): [<ffffffff810f0e09>] irq_exit_rcu+0x89/0xb0
> ---[ end trace 0000000000000000 ]---
>
> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> Signed-off-by: José Roberto de Souza <jose.souza at intel.com>
> ---
> drivers/gpu/drm/xe/xe_ggtt.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
> index 0cdbc1296e885..13ce0f51f517a 100644
> --- a/drivers/gpu/drm/xe/xe_ggtt.c
> +++ b/drivers/gpu/drm/xe/xe_ggtt.c
> @@ -489,7 +489,7 @@ void xe_ggtt_remove_node(struct xe_ggtt *ggtt, struct drm_mm_node *node,
>
> 	bound = drm_dev_enter(&xe->drm, &idx);
> 	if (bound)
> -		xe_pm_runtime_get_noresume(xe);
> +		xe_pm_runtime_get(xe);
>
> 	mutex_lock(&ggtt->lock);
> 	if (bound)
> --
> 2.45.2
>