[PATCH 8/8] drm/xe/device: move xe_device_sanitize over to devm
Andrzej Hajda
andrzej.hajda at intel.com
Mon May 6 17:25:36 UTC 2024
On 29.04.2024 14:14, Matthew Auld wrote:
> Disable GuC submission when removing the device.
>
> Signed-off-by: Matthew Auld <matthew.auld at intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda at intel.com>
I have tested the whole patchset with
core_hotunplug@hotrebind
and I still observe attempts to call xe runtime_pm callbacks from the
removed driver(???):
[ 135.059797] xe 0000:00:02.0: [drm:drm_managed_release [drm]] REL
ffff88812b63c240 kmalloc (13 bytes)
[ 135.059844] xe 0000:00:02.0: [drm:drm_managed_release [drm]] REL
ffff88812b63d040 drm_gem_init_release (0 bytes)
[ 135.059887] xe 0000:00:02.0: [drm:drm_managed_release [drm]] REL
ffff88814f73b200 kmalloc (304 bytes)
[ 135.059931] xe 0000:00:02.0: [drm:drm_managed_release [drm]] REL
ffff88812b63c740 drm_minor_alloc_release (8 bytes)
[ 135.059995] xe 0000:00:02.0: [drm:drm_managed_release [drm]] REL
ffff888115620708 kmalloc (40 bytes)
[ 135.060065] xe 0000:00:02.0: [drm:drm_managed_release [drm]] REL
ffff88812b63ca40 drm_minor_alloc_release (8 bytes)
[ 135.060161] xe 0000:00:02.0: [drm:drm_managed_release [drm]] REL
ffff888115620a88 kmalloc (40 bytes)
[ 135.060229] xe 0000:00:02.0: [drm:drm_managed_release [drm]] REL
ffff88812d407e40 drm_dev_init_release (0 bytes)
[ 135.060323] [drm:drm_managed_release [drm]] drmres release end
[ 136.099951] general protection fault, probably for non-canonical
address 0xdffffc00000004a3: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 136.099969] KASAN: probably user-memory-access in range
[0x0000000000002518-0x000000000000251f]
[ 136.099977] CPU: 0 PID: 9 Comm: kworker/0:1 Not tainted 6.9.0-rc2-xe+ #25
[ 136.099985] Hardware name: Intel Corporation Raptor Lake Client
Platform/RPL-S ADP-S DDR5 UDIMM CRB, BIOS
RPLSFWI1.R00.4064.A02.2302091143 02/09/2023
[ 136.099994] Workqueue: pm pm_runtime_work
[ 136.100010] RIP: 0010:xe_pm_d3cold_allowed_toggle+0x2f/0x2d0 [xe]
[ 136.100209] Code: 48 b8 00 00 00 00 00 fc ff df 55 48 89 e5 41 56 41
55 41 54 53 48 89 fb 48 81 c7 18 25 00 00 48 89 fa 48 c1 ea 03 48 83 ec
08 <0f> b6 04 02 48 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 21 02 00 00
[ 136.100228] RSP: 0018:ffffc900001d7be8 EFLAGS: 00010296
[ 136.100237] RAX: dffffc0000000000 RBX: 0000000000000000 RCX:
0000000000000000
[ 136.100246] RDX: 00000000000004a3 RSI: 0000000000000000 RDI:
0000000000002518
[ 136.100255] RBP: ffffc900001d7c10 R08: ffff888117cf43c9 R09:
ffff888117cf4220
[ 136.100263] R10: 0000000000000000 R11: 0000000000000000 R12:
ffff888117cf40c8
[ 136.100271] R13: 0000000000000000 R14: ffffffff82dc8270 R15:
ffff888117cf43c8
[ 136.100280] FS: 0000000000000000(0000) GS:ffff88884dc00000(0000)
knlGS:0000000000000000
[ 136.100290] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 136.100298] CR2: 00005572f1fff388 CR3: 000000012134e000 CR4:
0000000000750ef0
[ 136.100307] PKRU: 55555554
[ 136.100312] Call Trace:
[ 136.100317] <TASK>
[ 136.100322] ? show_regs+0x71/0x90
[ 136.100331] ? die_addr+0x41/0xc0
[ 136.100339] ? exc_general_protection+0x15d/0x260
[ 136.100352] ? asm_exc_general_protection+0x27/0x30
[ 136.100361] ? __pfx_pci_pm_runtime_idle+0x10/0x10
[ 136.100373] ? xe_pm_d3cold_allowed_toggle+0x2f/0x2d0 [xe]
[ 136.100500] ? __pfx_pci_pm_runtime_idle+0x10/0x10
[ 136.100507] xe_pci_runtime_idle+0x31/0x50 [xe]
[ 136.100629] pci_pm_runtime_idle+0xb7/0x100
[ 136.100637] rpm_idle+0x1ec/0x5c0
[ 136.100646] pm_runtime_work+0x10e/0x170
[ 136.100653] process_one_work+0x855/0x1a20
[ 136.100665] ? __pfx_process_one_work+0x10/0x10
[ 136.100673] ? move_linked_works+0x12b/0x2d0
[ 136.100682] ? assign_work+0x16f/0x280
[ 136.100690] worker_thread+0x57c/0xd40
[ 136.100701] kthread+0x2f3/0x3f0
[ 136.100708] ? __pfx_worker_thread+0x10/0x10
[ 136.100716] ? __pfx_kthread+0x10/0x10
[ 136.100723] ret_from_fork+0x43/0x90
[ 136.100730] ? __pfx_kthread+0x10/0x10
[ 136.100737] ret_from_fork_asm+0x1a/0x30
[ 136.100748] </TASK>
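For what it is worth, the register dump looks consistent with a NULL xe
pointer being passed to xe_pm_d3cold_allowed_toggle(): RBX is 0, RDI is
0x2518, and the faulting address is what the KASAN shadow lookup for
0x2518 would produce. The sketch below only redoes that arithmetic in
userspace. The helper names demo_mem_to_shadow() and demo_shadow_to_mem()
are made up for illustration and are not kernel APIs. The offset and shift
are the x86-64 generic KASAN values, and RAX in the dump holds the same
offset. That 0x2518 is a field offset inside struct xe_device is my
assumption, not something the log proves.

```c
#include <stdint.h>

/* x86-64 generic KASAN constants; RAX in the dump above holds the offset. */
#define KASAN_SHADOW_OFFSET      0xdffffc0000000000ULL
#define KASAN_SHADOW_SCALE_SHIFT 3

/* Hypothetical helper: shadow byte address checked before touching addr. */
static uint64_t demo_mem_to_shadow(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Hypothetical helper: start of the 8-byte range a shadow byte covers. */
static uint64_t demo_shadow_to_mem(uint64_t shadow)
{
	return (shadow - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
}
```

demo_mem_to_shadow(0x2518) gives 0xdffffc00000004a3, the non-canonical
address in the GPF line, and demo_shadow_to_mem() on that value gives
0x2518 again, the start of the range KASAN reports. So the access seems
to be NULL + 0x2518, which would fit the runtime idle callback running
after the driver data has already been cleared.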
Regards
Andrzej
> ---
> drivers/gpu/drm/xe/xe_device.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_device.c b/drivers/gpu/drm/xe/xe_device.c
> index ba917e383f8f..8c3415d7635a 100644
> --- a/drivers/gpu/drm/xe/xe_device.c
> +++ b/drivers/gpu/drm/xe/xe_device.c
> @@ -395,7 +395,7 @@ static void xe_driver_flr_fini(void *arg)
>  	xe_driver_flr(xe);
>  }
>
> -static void xe_device_sanitize(struct drm_device *drm, void *arg)
> +static void xe_device_sanitize(void *arg)
>  {
>  	struct xe_device *xe = arg;
>  	struct xe_gt *gt;
> @@ -661,7 +661,7 @@ int xe_device_probe(struct xe_device *xe)
>
>  	xe_hwmon_register(xe);
>
> -	return drmm_add_action_or_reset(&xe->drm, xe_device_sanitize, xe);
> +	return devm_add_action_or_reset(xe->drm.dev, xe_device_sanitize, xe);
>
>  err_fini_display:
>  	xe_display_driver_remove(xe);