[RFC 00/20] First attempt to kill mem_access

Matthew Brost matthew.brost at intel.com
Wed Jan 10 05:21:34 UTC 2024


On Wed, Dec 27, 2023 at 09:12:12PM -0500, Rodrigo Vivi wrote:
> At first the mem_access seemed a good idea since it would ensure
> we could map every memory access and apply some workarounds and
> then use that to ensure that the device is awake.
> 
> However it has become a nightmare in locking conflicts with memory
> locking. The only sane way to go is to move the runtime_pm protection
> to the outer bounds and ensure that the device is resumed way
> before memory locking.
> 
> So, this RFC here is the first attempt to kill the mem access and
> have a clean rpm handling on the outer bounds.
> 
> Well, at this time we already know that we need to solve some TLB
> invalidation issues and the last patch in this series needs to
> be split into smaller pieces. But I'd like to at least get
> the discussion started.
> 
> Happy New Year,
> Rodrigo.
> 

Hi Rodrigo - I haven't fully reviewed everything but noticed a few
issues to discuss.

1. LR mode VMs
	- I don't think the PM refs taken for LR jobs work. An LR job's hw
	  fence is signaled immediately after the job is scheduled to the
	  hardware, and once the hw fence is signaled the job can
	  typically be freed.
	- How about we just take a PM reference when an LR VM is opened?
	  (rough sketch after this list)

2. Tearing down exec queues
	- Tearing down exec queues requires a ping-pong with the GuC,
	  which likely needs a PM ref

3. Schedule enable G2H
	- The first job on an exec queue issues a schedule enable H2G,
	  which results in a G2H. That G2H could be received after the
	  job has been freed

4. TLB Invalidations
	- Send an H2G, receive a G2H when done
	- Four cases:
		a) From a (un)bind job
			- The job can be freed before the invalidation is
			  issued / completed
		b) GGTT invalidations
			- BO creation; should be covered by the IOCTL PM ref
		c) Userptr invalidation / BO move on an LR VM
			- Should be covered by #1 if the LR VM takes a PM ref
		d) Page fault handler
			- Should be covered by #1 if the LR VM takes a PM ref

5. SRIOV Relay?
	- Haven't looked into this at all; we might have issues here too
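
To make #1 concrete, here is a rough, untested sketch of what I mean by
tying the PM ref to the LR VM lifetime rather than to individual jobs.
The helper names (xe_lr_vm_pm_get/put) are made up, xe_pm_runtime_{get,put}
are the helpers from this series, and I'm assuming an
xe_vm_in_lr_mode()-style check at VM create / destroy:

        /*
         * Untested sketch: hold a runtime PM ref for the whole lifetime of
         * an LR VM, since LR jobs' hw fences signal immediately and so a
         * per-job ref can't keep the device awake.
         */
        static void xe_lr_vm_pm_get(struct xe_vm *vm)
        {
                /* Call once from xe_vm_create() when XE_VM_FLAG_LR_MODE is set */
                if (xe_vm_in_lr_mode(vm))
                        xe_pm_runtime_get(vm->xe);
        }

        static void xe_lr_vm_pm_put(struct xe_vm *vm)
        {
                /* Call once from the VM destroy path, after all jobs are gone */
                if (xe_vm_in_lr_mode(vm))
                        xe_pm_runtime_put(vm->xe);
        }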

2, 3, and 4a are all cases of an H2G waiting on a G2H. Perhaps it is
simplest to build the PM references into the CT layer? That is a fairly
low layer, but off the top of my head I'm not seeing a better option.

e.g., a CT send that expects a G2H takes a PM ref, with the caveat that we
expect the device to already hold a PM ref at that point. The G2H receive
side can then drop the PM ref, and that put is allowed to transition the
refcount to zero.
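
Something like the below, as a very rough and untested sketch. The
ct_send_with_pm() wrapper and g2h_release_pm() are made-up names for
illustration, xe_guc_ct_send() and ct_to_xe() are the existing CT helpers
(assuming I have the signature right), and xe_pm_runtime_{get,put} are
from this series:

        static int ct_send_with_pm(struct xe_guc_ct *ct, const u32 *action,
                                   u32 len, u32 g2h_len, u32 num_g2h)
        {
                struct xe_device *xe = ct_to_xe(ct);
                int ret;

                /*
                 * Only sends that expect a G2H need to pin the device. The
                 * device must already hold a PM ref at this point (outer
                 * bound from the IOCTL / exec / LR VM), so this get never
                 * wakes the device, it just keeps it awake until the G2H.
                 */
                if (g2h_len)
                        xe_pm_runtime_get(xe);

                ret = xe_guc_ct_send(ct, action, len, g2h_len, num_g2h);
                if (ret < 0 && g2h_len)
                        xe_pm_runtime_put(xe);

                return ret;
        }

        /* Called after the matching G2H has been processed */
        static void g2h_release_pm(struct xe_guc_ct *ct)
        {
                /* This put may be the last ref and trigger runtime suspend */
                xe_pm_runtime_put(ct_to_xe(ct));
        }

That is only meant to show the shape of it; the real plumbing would also
need to cover resends and the g2h_fence path.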

Thoughts?

Matt

> Rodrigo Vivi (20):
>   drm/xe: Document Xe PM component
>   drm/xe: Fix display runtime_pm handling
>   drm/xe: Create a xe_pm_runtime_resume_and_get variant for display
>   drm/xe: Convert xe_pm_runtime_{get,put} to void and protect from
>     recursion
>   drm/xe: Prepare display for D3Cold
>   drm/xe: Convert mem_access assertion towards the runtime_pm state
>   drm/xe: Runtime PM wake on every IOCTL
>   drm/xe: Runtime PM wake on every exec
>   drm/xe: Runtime PM wake on every sysfs call
>   drm/xe: Sort some xe_pm_runtime related functions
>   drm/xe: Ensure device is awake before removing it
>   drm/xe: Remove mem_access from guc_pc calls
>   drm/xe: Runtime PM wake on every debugfs call
>   drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls
>   drm/xe: Allow GuC CT fast path and worker regardless of runtime_pm
>   drm/xe: Remove mem_access calls from migration
>   drm/xe: Removing extra mem_access protection from runtime pm
>   drm/xe: Convert hwmon from mem_access to xe_pm_runtime calls
>   drm/xe: Remove unused runtime pm helper
>   drm/xe: Mega Kill of mem_access
> 
>  .../gpu/drm/xe/compat-i915-headers/i915_drv.h |   8 +-
>  drivers/gpu/drm/xe/display/xe_fb_pin.c        |   7 +-
>  drivers/gpu/drm/xe/tests/xe_bo.c              |   8 -
>  drivers/gpu/drm/xe/tests/xe_migrate.c         |   2 -
>  drivers/gpu/drm/xe/tests/xe_mocs.c            |   4 -
>  drivers/gpu/drm/xe/xe_bo.c                    |   5 -
>  drivers/gpu/drm/xe/xe_debugfs.c               |  10 +-
>  drivers/gpu/drm/xe/xe_device.c                | 129 ++++-------
>  drivers/gpu/drm/xe/xe_device.h                |   9 -
>  drivers/gpu/drm/xe/xe_device_sysfs.c          |   4 +
>  drivers/gpu/drm/xe/xe_device_types.h          |   9 -
>  drivers/gpu/drm/xe/xe_dma_buf.c               |   5 +-
>  drivers/gpu/drm/xe/xe_exec_queue.c            |  18 --
>  drivers/gpu/drm/xe/xe_ggtt.c                  |   6 -
>  drivers/gpu/drm/xe/xe_gsc.c                   |   3 -
>  drivers/gpu/drm/xe/xe_gt.c                    |  17 --
>  drivers/gpu/drm/xe/xe_gt_debugfs.c            |  53 ++++-
>  drivers/gpu/drm/xe/xe_gt_freq.c               |  38 +++-
>  drivers/gpu/drm/xe/xe_gt_idle.c               |  23 +-
>  drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c     |   3 +
>  drivers/gpu/drm/xe/xe_guc_ct.c                |  40 ----
>  drivers/gpu/drm/xe/xe_guc_debugfs.c           |   9 +-
>  drivers/gpu/drm/xe/xe_guc_pc.c                |  62 +----
>  drivers/gpu/drm/xe/xe_huc_debugfs.c           |   5 +-
>  drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c |  58 ++++-
>  drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h |   7 +
>  drivers/gpu/drm/xe/xe_hwmon.c                 |  25 ++-
>  drivers/gpu/drm/xe/xe_pat.c                   |  10 -
>  drivers/gpu/drm/xe/xe_pci.c                   |   2 +-
>  drivers/gpu/drm/xe/xe_pm.c                    | 211 ++++++++++++++----
>  drivers/gpu/drm/xe/xe_pm.h                    |   9 +-
>  drivers/gpu/drm/xe/xe_query.c                 |   4 -
>  drivers/gpu/drm/xe/xe_sched_job.c             |  10 +-
>  drivers/gpu/drm/xe/xe_tile.c                  |  10 +-
>  drivers/gpu/drm/xe/xe_tile_sysfs.c            |   1 +
>  drivers/gpu/drm/xe/xe_ttm_sys_mgr.c           |   5 +-
>  drivers/gpu/drm/xe/xe_vm.c                    |   7 -
>  37 files changed, 445 insertions(+), 391 deletions(-)
> 
> -- 
> 2.43.0
> 

