[Intel-xe] [PATCH v3 0/7] PAT and cache coherency support
Souza, Jose
jose.souza at intel.com
Tue Sep 26 18:03:17 UTC 2023
On Tue, 2023-09-26 at 09:23 +0100, Matthew Auld wrote:
> On 25/09/2023 20:47, Souza, Jose wrote:
> > On Mon, 2023-09-25 at 14:21 +0100, Matthew Auld wrote:
> > > Branch available here (lightly tested):
> > > https://gitlab.freedesktop.org/mwa/kernel/-/tree/xe-pat-index?ref_type=heads
> > >
> > > Series still needs some more testing. Also note that the series directly depends
> > > on the WIP patch here: https://patchwork.freedesktop.org/series/122708/
> > >
> > > Goal here is to allow userspace to directly control the pat_index when mapping
> > > > memory via the ppGTT, in addition to the CPU caching mode for system memory. This
> > > is very much needed on newer igpu platforms which allow incoherent GT access,
> > > where the choice over the cache level and expected coherency is best left to
> > > > userspace depending on their use case. In the future there may also be other
> > > stuff encoded in the pat_index, so giving userspace direct control will also be
> > > needed there.
> > >
> > > To support this we added new gem_create uAPI for selecting the CPU cache
> > > mode to use for system memory, including the expected GPU coherency mode. There
> > > are various restrictions here for the selected coherency mode and compatible CPU
> > > cache modes. With that in place the actual pat_index can now be provided as
> > > part of vm_bind. The only restriction is that the coherency mode of the
> > > pat_index must be at least as coherent as the gem_create coherency mode. There
> > > are also some special cases like with userptr and dma-buf.
> > >
> > > v2:
> > > - Loads of improvements/tweaks. Main changes are to now allow
> > > gem_create.coh_mode <= coh_mode(pat_index), rather than it needing to match
> > > exactly. This simplifies the dma-buf policy from userspace pov. Also we now
> > > only consider COH_NONE and COH_AT_LEAST_1WAY.
> > > v3:
> > > - Rebase. Split the pte_encode() refactoring, plus various smaller tweaks and
> > > fixes.
> > >
> >
> > > Thanks for the fixes, display is now working on TGL and DG2, but I am getting a new crash on MTL:
>
> > Is the MTL bug present on the same base branch, i.e. if you drop all the
> > patches in this series?
It also happens without your patches.
I found a CI bug with the same signature: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/606
>
> >
> >
> > > [ 259.478814] xe 0000:00:02.0: [drm:skl_compute_wm [xe]] [PLANE:31:plane 1A] blocks 16, 97, 97, 129, 129, 161, 0, 0, 30, 33, 47 -> 62, 93, 93, 123, 123, 154, 0, 0, 137, 62, 137
> > > [ 259.478936] xe 0000:00:02.0: [drm:skl_compute_wm [xe]] [PLANE:31:plane 1A] min_ddb 19, 108, 108, 143, 143, 179, 0, 0, 31, 38, 48 -> 123, 184, 184, 184, 184, 245, 0, 0, 138, 123, 138
> > [ 259.479089] ------------[ cut here ]------------
> > [ 259.479093] WARNING: CPU: 2 PID: 2057 at drivers/gpu/drm/xe/display/xe_fb_pin.c:199 __xe_pin_fb_vma+0x3dc/0x840 [xe]
> > > [ 259.479239] Modules linked in: xe drm_ttm_helper drm_exec gpu_sched drm_suballoc_helper i2c_algo_bit drm_buddy ttm drm_display_helper x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul wmi_bmof pmt_telemetry pmt_class ghash_clmulni_intel snd_hda_intel snd_intel_dspcfg snd_hda_codec kvm_intel snd_hwdep snd_hda_core e1000e mei_me ptp snd_pcm i2c_i801 mei i2c_smbus pps_core intel_vsec video wmi pinctrl_meteorlake fuse
> > [ 259.479327] CPU: 2 PID: 2057 Comm: gnome-shell Tainted: G W 6.5.0-rc7+zeh-xe+ #1109
> > [ 259.479333] Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-M LP5x CONF1 RVP, BIOS MTLMFWI1.R00.3323.D84.2308220916 08/22/2023
> > [ 259.479337] RIP: 0010:__xe_pin_fb_vma+0x3dc/0x840 [xe]
> > > [ 259.479498] Code: 4d 89 f4 48 8b 44 24 08 49 8d b4 24 28 03 00 00 b9 16 00 00 00 4c 89 60 08 48 8d 78 10 f3 48 a5 4c 8b 6c 24 08 e9 2c fd ff ff <0f> 0b 49 c7 c5 ed ff ff ff e9 14 fd ff ff 48 8b 7c 24 28 89 14 24
> > [ 259.479503] RSP: 0018:ffffc9000604bb88 EFLAGS: 00010246
> > [ 259.479509] RAX: ffff888196c9f190 RBX: ffff8881a222dc00 RCX: 0000000000000001
> > [ 259.479513] RDX: 0000000000000000 RSI: ffffffff826a896e RDI: ffffffff826ac710
> > [ 259.479517] RBP: ffff888183823800 R08: 0000000000000128 R09: ffff8881b2eff4d8
> > [ 259.479521] R10: ffffc9000604bac8 R11: 0000000000000002 R12: ffff8881a222dc00
> > [ 259.479526] R13: ffff888102ab0000 R14: 0000000000000000 R15: 0000563e9575fa00
> > [ 259.479530] FS: 00007f4dbdf5f5c0(0000) GS:ffff88846e100000(0000) knlGS:0000000000000000
> > [ 259.479535] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 259.479540] CR2: 00000e17254bc000 CR3: 0000000117bb8005 CR4: 0000000000770ee0
> > [ 259.479545] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 259.479549] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
> > [ 259.479552] PKRU: 55555554
> > [ 259.479556] Call Trace:
> > [ 259.479560] <TASK>
> > [ 259.479564] ? __xe_pin_fb_vma+0x3dc/0x840 [xe]
> > [ 259.479708] ? __warn+0x7c/0x170
> > [ 259.479716] ? __xe_pin_fb_vma+0x3dc/0x840 [xe]
> > [ 259.479855] ? report_bug+0x18d/0x1c0
> > [ 259.479865] ? handle_bug+0x3a/0x70
> > [ 259.479873] ? exc_invalid_op+0x13/0x60
> > [ 259.479880] ? asm_exc_invalid_op+0x16/0x20
> > [ 259.479894] ? __xe_pin_fb_vma+0x3dc/0x840 [xe]
> > [ 259.480030] ? __xe_pin_fb_vma+0x34/0x840 [xe]
> > [ 259.480160] ? lock_acquire+0xd3/0x2d0
> > [ 259.480170] ? find_held_lock+0x2b/0x80
> > [ 259.480179] intel_plane_pin_fb+0x34/0x90 [xe]
> > [ 259.480314] intel_prepare_plane_fb+0x2c/0x70 [xe]
> > [ 259.480469] drm_atomic_helper_prepare_planes+0x6b/0x210
> > [ 259.480481] intel_atomic_commit+0x4d/0x360 [xe]
> > [ 259.480666] drm_mode_atomic_ioctl+0x7c7/0xbd0
> > [ 259.480688] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
> > [ 259.480696] drm_ioctl_kernel+0xc0/0x170
> > [ 259.480705] drm_ioctl+0x212/0x470
> > [ 259.480711] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
> > [ 259.480729] __x64_sys_ioctl+0x8d/0xb0
> > [ 259.480739] do_syscall_64+0x38/0x90
> > [ 259.480746] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> > [ 259.480752] RIP: 0033:0x7f4dc211aaff
> > > [ 259.480756] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
> > [ 259.480761] RSP: 002b:00007ffd6d2b7940 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > [ 259.480767] RAX: ffffffffffffffda RBX: 00007ffd6d2b79e0 RCX: 00007f4dc211aaff
> > [ 259.480771] RDX: 00007ffd6d2b79e0 RSI: 00000000c03864bc RDI: 0000000000000009
> > [ 259.480774] RBP: 00000000c03864bc R08: 0000000000000000 R09: 0000000000000000
> > [ 259.480777] R10: 00007f4dc221a2f0 R11: 0000000000000246 R12: 0000563e988b4590
> > [ 259.480781] R13: 0000000000000009 R14: 0000563e988b4650 R15: 0000563e9814f060
> > [ 259.480794] </TASK>
> > [ 259.480797] irq event stamp: 2049057
> > [ 259.480800] hardirqs last enabled at (2049063): [<ffffffff811e2369>] __up_console_sem+0x59/0x80
> > [ 259.480808] hardirqs last disabled at (2049068): [<ffffffff811e234e>] __up_console_sem+0x3e/0x80
> > [ 259.480815] softirqs last enabled at (2048430): [<ffffffff8114f3aa>] irq_exit_rcu+0x8a/0xe0
> > [ 259.480821] softirqs last disabled at (2048423): [<ffffffff8114f3aa>] irq_exit_rcu+0x8a/0xe0
> > [ 259.480826] ---[ end trace 0000000000000000 ]---
> > [ 259.494838] xe 0000:00:02.0: [drm:drm_mode_addfb2] [FB:219]
> > [ 259.494943] xe 0000:00:02.0: [drm:skl_compute_wm [xe]] [PLAN
> >
> > > That warning comes from this check in __xe_pin_fb_vma():
> > >     if (XE_WARN_ON(view->type == I915_GTT_VIEW_REMAPPED)) {
> >
> >
> >