[PATCH v4 0/2] Fixes for MI_REPORT_PERF_COUNT

Souza, Jose jose.souza at intel.com
Fri Dec 20 16:16:45 UTC 2024


On Thu, 2024-12-19 at 16:22 -0800, Umesh Nerlige Ramappa wrote:
> The OA programming sequence for query mode or MI_REPORT_PERF_COUNT requires
> modifying some HW registers in the same HW context as the user exec
> queue. The user passes the exec_queue to the OA interface, and the OA
> implementation submits an MI_LOAD_REGISTER_IMM to this queue to modify
> the registers.
> 
> The OA implementation submits a batch mapped in GGTT to the user exec
> queue, so some plumbing has been added to the relevant code to enable
> that (as per suggestions from Matthew Brost).
> 
> v2: review rework
> v3:
> - review rework
> - original patches squashed for porting to stable
> - code cleanup
> 
> v4:
> - review rework/fixes
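
For readers unfamiliar with the command mentioned in the cover letter, the following is a minimal sketch of how an MI_LOAD_REGISTER_IMM packet is laid out in a batch: one header dword followed by (register offset, value) pairs. The header encoding (opcode 0x22 in bits 28:23, length field 2*n - 1) is the standard one for this command; the struct reg_write type, the emit_lri() helper and any register offsets passed to it are hypothetical and are not taken from the xe driver, whose real code also handles batch allocation, submission and termination.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MI_LOAD_REGISTER_IMM header: MI opcode 0x22 in bits 28:23,
 * dword length field of 2*n - 1 for n register writes. */
#define MI_LRI_HEADER(n) ((0x22u << 23) | (2u * (n) - 1u))

/* Hypothetical illustration type: one register write. */
struct reg_write {
	uint32_t offset;	/* MMIO offset of the register */
	uint32_t value;		/* value to load */
};

/* Hypothetical helper: emit one LRI packet for n registers into cs[].
 * Returns the number of dwords written (1 + 2*n). */
static size_t emit_lri(uint32_t *cs, const struct reg_write *regs, uint32_t n)
{
	size_t i = 0;

	assert(n >= 1);
	cs[i++] = MI_LRI_HEADER(n);
	for (uint32_t r = 0; r < n; r++) {
		cs[i++] = regs[r].offset;
		cs[i++] = regs[r].value;
	}
	return i;
}
```

With a single register the header comes out as 0x11000001, the familiar MI_LOAD_REGISTER_IMM(1) value.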

I got the following oops with this version (v4), when the OA stream fd is closed:

[  176.066578] xe 0000:00:02.0: [drm:xe_oa_config_locked [xe]] changed to oa config uuid=4ccd6535-fb9a-440f-b0f5-882879dc4cb0
[  176.068577] xe 0000:00:02.0: [drm:xe_oa_config_locked [xe]] changed to oa config uuid=4ccd6535-fb9a-440f-b0f5-882879dc4cb0
[  176.072629] xe 0000:00:02.0: [drm:xe_oa_config_locked [xe]] changed to oa config uuid=4ccd6535-fb9a-440f-b0f5-882879dc4cb0
[  176.078117] xe 0000:00:02.0: [drm:xe_oa_config_locked [xe]] changed to oa config uuid=4ccd6535-fb9a-440f-b0f5-882879dc4cb0
[  176.081285] xe 0000:00:02.0: [drm:xe_oa_config_locked [xe]] changed to oa config uuid=4ccd6535-fb9a-440f-b0f5-882879dc4cb0
[  176.093564] xe 0000:00:02.0: [drm:xe_oa_config_locked [xe]] changed to oa config uuid=4ccd6535-fb9a-440f-b0f5-882879dc4cb0
[  176.102886] xe 0000:00:02.0: [drm:xe_oa_config_locked [xe]] changed to oa config uuid=4ccd6535-fb9a-440f-b0f5-882879dc4cb0
[  194.119229] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6ba3: 0000 [#1] PREEMPT SMP
[  194.130187] CPU: 3 UID: 1000 PID: 2240 Comm: ReplayManager Not tainted 6.13.0-rc3-zeh-xe+ #1454
[  194.138931] Hardware name: Intel Corporation Lunar Lake Client Platform/LNL-M LP5 RVP1, BIOS LNLMFWI1.R00.3152.D83.2404190622 04/19/2024
[  194.151258] RIP: 0010:xe_sync_entry_add_deps+0x1c/0x60 [xe]
[  194.157013] Code: c7 43 18 f4 ff ff ff e9 9b fe ff ff 66 90 55 53 48 8b 5f 08 48 85 db 75 05 31 c0 5b 5d c3 48 89 f5 48 8d 7b 38 b8 01 00 00 00 <f0> 0f c1 43 38 85 c0 74 20 8d 50 01 09 c2 78 0d 48 89 de 48 89 ef
[  194.175863] RSP: 0018:ffffc90001f93de8 EFLAGS: 00010202
[  194.181136] RAX: 0000000000000001 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000000
[  194.188331] RDX: ffff88815ee8edc0 RSI: ffff88814ebb0840 RDI: 6b6b6b6b6b6b6ba3
[  194.195520] RBP: ffff88814ebb0840 R08: 0000000000000001 R09: 0000000000000000
[  194.202707] R10: 0000000000000001 R11: 0000000000000003 R12: ffff88814ebb0840
[  194.209889] R13: ffff8881457f9900 R14: ffff888173075800 R15: 0000000000000000
[  194.217071] FS:  00007f6c80db9640(0000) GS:ffff88885e580000(0000) knlGS:0000000000000000
[  194.225216] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  194.231014] CR2: 00007f6bdb33a000 CR3: 0000000144f44001 CR4: 0000000000772ef0
[  194.238201] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  194.245386] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[  194.252575] PKRU: 55555554
[  194.255315] Call Trace:
[  194.257794]  <TASK>
[  194.259932]  ? __die_body.cold+0x19/0x21
[  194.263899]  ? die_addr+0x33/0x50
[  194.267256]  ? exc_general_protection+0x19e/0x450
[  194.272002]  ? asm_exc_general_protection+0x22/0x30
[  194.276930]  ? xe_sync_entry_add_deps+0x1c/0x60 [xe]
[  194.282052]  xe_oa_submit_bb.constprop.0+0x9d/0x1c0 [xe]
[  194.287517]  xe_oa_load_with_lri.constprop.0+0xc4/0x130 [xe]
[  194.293313]  xe_oa_configure_oa_context+0x1fd/0x210 [xe]
[  194.298770]  xe_oa_disable_metric_set+0x4b/0xc0 [xe]
[  194.303857]  xe_oa_stream_destroy+0x3a/0x140 [xe]
[  194.308698]  xe_oa_release+0x3a/0xe0 [xe]
[  194.312833]  __fput+0xee/0x2a0
[  194.315934]  __x64_sys_close+0x49/0xb0
[  194.319722]  do_syscall_64+0x64/0x130
[  194.323417]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[  194.328511] RIP: 0033:0x7f6ca8b14f8b
[  194.332130] Code: 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 73 ba f7 ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 89 44 24 0c e8 c1 ba f7 ff 8b 44
[  194.350996] RSP: 002b:00007f6c80db7f10 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
[  194.358628] RAX: ffffffffffffffda RBX: 00007f6c344d2f84 RCX: 00007f6ca8b14f8b
[  194.365810] RDX: 0000000000000000 RSI: 00000000c00864c0 RDI: 0000000000000035
[  194.373008] RBP: 00007f6c80db7f40 R08: 0000000000000000 R09: 0000000000000000
[  194.380193] R10: 000000000000000a R11: 0000000000000202 R12: 00007f6ca43e98c0
[  194.387393] R13: 00007f6c001691b0 R14: 0000000000000710 R15: 0000000000255aa0
[  194.394603]  </TASK>
[  194.396874] Modules linked in: snd_hda_codec_hdmi xe drm_ttm_helper gpu_sched drm_suballoc_helper drm_gpuvm drm_exec i2c_algo_bit drm_buddy drm_display_helper ttm drm_kms_helper mei_gsc_proxy wmi_bmof x86_pkg_temp_thermal coretemp crct10dif_pclmul snd_hda_intel crc32_pclmul snd_intel_dspcfg ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core kvm_intel e1000e mei_me ptp snd_pcm pps_core mei video intel_pmc_core pmt_telemetry pmt_class wmi intel_vsec dm_multipath fuse
[  194.438795] ---[ end trace 0000000000000000 ]---
[  194.518144] RIP: 0010:xe_sync_entry_add_deps+0x1c/0x60 [xe]
[  194.523871] Code: c7 43 18 f4 ff ff ff e9 9b fe ff ff 66 90 55 53 48 8b 5f 08 48 85 db 75 05 31 c0 5b 5d c3 48 89 f5 48 8d 7b 38 b8 01 00 00 00 <f0> 0f c1 43 38 85 c0 74 20 8d 50 01 09 c2 78 0d 48 89 de 48 89 ef
[  194.542724] RSP: 0018:ffffc90001f93de8 EFLAGS: 00010202
[  194.548000] RAX: 0000000000000001 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000000
[  194.555208] RDX: ffff88815ee8edc0 RSI: ffff88814ebb0840 RDI: 6b6b6b6b6b6b6ba3
[  194.562398] RBP: ffff88814ebb0840 R08: 0000000000000001 R09: 0000000000000000
[  194.569597] R10: 0000000000000001 R11: 0000000000000003 R12: ffff88814ebb0840
[  194.576787] R13: ffff8881457f9900 R14: ffff888173075800 R15: 0000000000000000
[  194.583989] FS:  00007f6c80db9640(0000) GS:ffff88885e580000(0000) knlGS:0000000000000000
[  194.592158] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  194.597962] CR2: 00007f6bdb33a000 CR3: 0000000144f44001 CR4: 0000000000772ef0
[  194.605184] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  194.612405] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[  194.619595] PKRU: 55555554


> 
> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
> 
> Umesh Nerlige Ramappa (2):
>   xe/oa: Fix query mode of operation for OAR/OAC
>   xe/oa: Drop the unused logic to parse context image
> 
>  drivers/gpu/drm/xe/xe_oa.c              | 222 +++++-------------------
>  drivers/gpu/drm/xe/xe_oa_types.h        |   3 -
>  drivers/gpu/drm/xe/xe_ring_ops.c        |   5 +-
>  drivers/gpu/drm/xe/xe_sched_job_types.h |   2 +
>  4 files changed, 52 insertions(+), 180 deletions(-)
> 


