[RFC 00/20] First attempt to kill mem_access

Vivi, Rodrigo rodrigo.vivi at intel.com
Wed Jan 10 14:08:23 UTC 2024


On Wed, 2024-01-10 at 09:06 -0500, Rodrigo Vivi wrote:
> On Wed, Jan 10, 2024 at 05:21:34AM +0000, Matthew Brost wrote:
> > On Wed, Dec 27, 2023 at 09:12:12PM -0500, Rodrigo Vivi wrote:
> > > At first, mem_access seemed like a good idea since it would let us
> > > map every memory access, apply some workarounds, and use that to
> > > ensure that the device is awake.
> > > 
> > > However, it has become a nightmare of locking conflicts with memory
> > > locking. The only sane way forward is to move the runtime_pm
> > > protection to the outer bounds and ensure that the device is
> > > resumed well before any memory locking.
> > > 
> > > So, this RFC is a first attempt to kill mem_access and have clean
> > > rpm handling on the outer bounds.
> > > 
> > > Well, at this point we already know that we need to solve some TLB
> > > invalidation issues and that the last patch in this series needs to
> > > be split into smaller pieces. But I'd like to at least get the
> > > discussion started.
> > > 
> > > Happy New Year,
> > > Rodrigo.
> > > 
> > 
> > Hi Rodrigo - I haven't fully reviewed everything but noticed a few
> > issues to discuss.
> 
> +Auld, who was also raising very similar concerns.
(actually doing it)

> 
> > 
> > 1. LR mode VMs
> >         - I don't think the PM refs taken for LR jobs work. An LR
> >           job's hw fence is signalled immediately after the job is
> >           scheduled to the hardware. Once the hw fence is signalled,
> >           the job can typically be freed.
> >         - How about we just take a PM reference when an LR VM is
> >           opened?
> 
> I like this idea!
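
Something like this quick sketch is what I have in mind for that LR-VM
big hammer, with the ref simply following the VM lifetime. The
xe_vm_in_lr_mode() check below is just a placeholder for whatever
predicate we end up using to detect LR/compute VMs, and the exact hook
points in xe_vm.c still need to be chosen:

/* Rough sketch only: hold a runtime PM ref for the whole lifetime of
 * an LR VM, so everything running under it can assume the device is
 * awake.
 */
static void xe_vm_lr_pm_get(struct xe_vm *vm)
{
	if (xe_vm_in_lr_mode(vm))	/* placeholder predicate */
		xe_pm_runtime_get(vm->xe);	/* at VM creation */
}

static void xe_vm_lr_pm_put(struct xe_vm *vm)
{
	if (xe_vm_in_lr_mode(vm))
		xe_pm_runtime_put(vm->xe);	/* at VM destruction */
}
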
> 
> > 
> > 2. Tearing down exec queues
> >         - Tearing down exec queues requires a ping-pong with the
> >           GuC, which likely needs a PM ref
> 
> would the idea of taking the ref in the CT layer whenever a G2H is
> expected help here as well?
> (calling it the CT-expecting-G2H-ref from now on)
> 
> > 
> > 3. Schedule enable G2H
> >         - The first job on an exec queue will issue a schedule
> >           enable H2G, which results in a G2H. This G2H could be
> >           received after the job is freed
> 
> for this, the CT-expecting-G2H-ref would be enough, right?
> 
> > 
> > 4. TLB Invalidations
> >         - Send H2G, receive G2H when done
> 
> for this, the CT-expecting-G2H-ref would be enough, right?
> 
> >         - Four cases
> >                 a) From a (un)bind job
> >                         - The job can be freed before the
> >                           invalidation is issued / completed
> 
> hmm... I believe I have faced this at some point.
> would the CT-expecting-G2H-ref help here?
> or do you have any other idea to cover this case?
> 
> >                 b) GGTT invalidations
> >                         - BO creation, should be covered by IOCTL
> >                           PM ref
> 
> this should be okay then.
> 
> >                 c) Userptr invalidation / BO move on LR VM
> >                         - should be covered by #1 if the LR VM
> >                           takes a PM ref
> >                 d) Page fault handler
> >                         - should be covered by #1 if the LR VM
> >                           takes a PM ref
> > 
> 
> these (c and d) would be okay with the LR-VM ref, right?
> 
> > 5. SRIOV Relay?
> >         - Haven't looked into this at all; might have issues here
> >           too?
> 
> would this be covered as well by the CT-expecting-G2H-ref?
> or is some big hammer needed, like blocking rpm anytime we have a VF,
> maybe?
> 
> > 
> > 2, 3, and 4a are all H2G waiting on G2H. Perhaps it is simplest to
> > build the PM references into the CT layer? A lower layer, but off
> > the top of my head I'm not really seeing a better option.
> > 
> > e.g. a CT send that expects a G2H takes a PM ref, with the caveat
> > that we expect the device to already have a PM ref. The receive can
> > drop the PM ref and let it transition to zero.
> 
> That's one extra reason to keep the lockdep checks, but that should be
> okay, I believe. I will try it here.
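
And roughly what I'm imagining for the CT-expecting-G2H-ref itself, as a
sketch only: take the ref when we commit to an H2G that expects a G2H
and drop it when the matching G2H is consumed (or on timeout/cleanup).
I'm using the core pm_runtime helpers directly here for illustration;
ct_to_xe() and the exact hook points around the outstanding-G2H
accounting are assumptions, not the final code:

#include <linux/pm_runtime.h>

/* Sketch only: pair a runtime PM ref with every outstanding G2H, so
 * the device cannot suspend while a response is pending. The device is
 * expected to already be awake at send time (outer-bound ref held by
 * the caller), hence the noresume-style get.
 */
static void ct_g2h_pm_get(struct xe_guc_ct *ct)
{
	/* on an H2G send that expects a G2H */
	pm_runtime_get_noresume(ct_to_xe(ct)->drm.dev);
}

static void ct_g2h_pm_put(struct xe_guc_ct *ct)
{
	/* on G2H receive, or on timeout/cleanup; may reach zero here */
	pm_runtime_put(ct_to_xe(ct)->drm.dev);
}
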
> 
> > 
> > Thoughts?
> 
> Basically it looks like we need to:
> 1. Get the lockdep checks back (rough assertion sketch below)
> 2. Add a big hammer around LR VMs (outer-bound ref taken at VM
> creation and dropped at destruction, if LR)
> 3. Add an inner rpm get around CT messages that expect a G2H back,
> with the put on the G2H response.
> For this last one, do you have any good idea of the right places and
> conditions to keep the get/put balanced, and to ensure that we don't
> keep holding the ref forever in case we never get the response?
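
On the assertion side of item 1 (the mem_access assertion converted
towards the runtime_pm state, patch 6), I'm thinking of something as
simple as the sketch below, sprinkled in the inner layers that touch the
HW. Name and placement are assumptions at this point, and it complements
rather than replaces the lockdep annotations:

/* Sketch only: complain loudly if an inner-layer path that touches the
 * HW runs while the device is not runtime-active, i.e. some caller
 * forgot the outer-bound runtime PM ref.
 */
static void xe_assert_device_awake(struct xe_device *xe)
{
	drm_WARN_ON(&xe->drm, pm_runtime_suspended(xe->drm.dev));
}
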
> 
> Anything else that I might be missing?
> 
> Thank you all for all the great comments and suggestions!
> 
> > 
> > Matt
> > 
> > > Rodrigo Vivi (20):
> > >   drm/xe: Document Xe PM component
> > >   drm/xe: Fix display runtime_pm handling
> > >   drm/xe: Create a xe_pm_runtime_resume_and_get variant for display
> > >   drm/xe: Convert xe_pm_runtime_{get,put} to void and protect from
> > >     recursion
> > >   drm/xe: Prepare display for D3Cold
> > >   drm/xe: Convert mem_access assertion towards the runtime_pm state
> > >   drm/xe: Runtime PM wake on every IOCTL
> > >   drm/xe: Runtime PM wake on every exec
> > >   drm/xe: Runtime PM wake on every sysfs call
> > >   drm/xe: Sort some xe_pm_runtime related functions
> > >   drm/xe: Ensure device is awake before removing it
> > >   drm/xe: Remove mem_access from guc_pc calls
> > >   drm/xe: Runtime PM wake on every debugfs call
> > >   drm/xe: Replace dma_buf mem_access per direct xe_pm_runtime calls
> > >   drm/xe: Allow GuC CT fast path and worker regardless of runtime_pm
> > >   drm/xe: Remove mem_access calls from migration
> > >   drm/xe: Removing extra mem_access protection from runtime pm
> > >   drm/xe: Convert hwmon from mem_access to xe_pm_runtime calls
> > >   drm/xe: Remove unused runtime pm helper
> > >   drm/xe: Mega Kill of mem_access
> > > 
> > >  .../gpu/drm/xe/compat-i915-headers/i915_drv.h |   8 +-
> > >  drivers/gpu/drm/xe/display/xe_fb_pin.c        |   7 +-
> > >  drivers/gpu/drm/xe/tests/xe_bo.c              |   8 -
> > >  drivers/gpu/drm/xe/tests/xe_migrate.c         |   2 -
> > >  drivers/gpu/drm/xe/tests/xe_mocs.c            |   4 -
> > >  drivers/gpu/drm/xe/xe_bo.c                    |   5 -
> > >  drivers/gpu/drm/xe/xe_debugfs.c               |  10 +-
> > >  drivers/gpu/drm/xe/xe_device.c                | 129 ++++-------
> > >  drivers/gpu/drm/xe/xe_device.h                |   9 -
> > >  drivers/gpu/drm/xe/xe_device_sysfs.c          |   4 +
> > >  drivers/gpu/drm/xe/xe_device_types.h          |   9 -
> > >  drivers/gpu/drm/xe/xe_dma_buf.c               |   5 +-
> > >  drivers/gpu/drm/xe/xe_exec_queue.c            |  18 --
> > >  drivers/gpu/drm/xe/xe_ggtt.c                  |   6 -
> > >  drivers/gpu/drm/xe/xe_gsc.c                   |   3 -
> > >  drivers/gpu/drm/xe/xe_gt.c                    |  17 --
> > >  drivers/gpu/drm/xe/xe_gt_debugfs.c            |  53 ++++-
> > >  drivers/gpu/drm/xe/xe_gt_freq.c               |  38 +++-
> > >  drivers/gpu/drm/xe/xe_gt_idle.c               |  23 +-
> > >  drivers/gpu/drm/xe/xe_gt_throttle_sysfs.c     |   3 +
> > >  drivers/gpu/drm/xe/xe_guc_ct.c                |  40 ----
> > >  drivers/gpu/drm/xe/xe_guc_debugfs.c           |   9 +-
> > >  drivers/gpu/drm/xe/xe_guc_pc.c                |  62 +----
> > >  drivers/gpu/drm/xe/xe_huc_debugfs.c           |   5 +-
> > >  drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c |  58 ++++-
> > >  drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h |   7 +
> > >  drivers/gpu/drm/xe/xe_hwmon.c                 |  25 ++-
> > >  drivers/gpu/drm/xe/xe_pat.c                   |  10 -
> > >  drivers/gpu/drm/xe/xe_pci.c                   |   2 +-
> > >  drivers/gpu/drm/xe/xe_pm.c                    | 211 ++++++++++++++----
> > >  drivers/gpu/drm/xe/xe_pm.h                    |   9 +-
> > >  drivers/gpu/drm/xe/xe_query.c                 |   4 -
> > >  drivers/gpu/drm/xe/xe_sched_job.c             |  10 +-
> > >  drivers/gpu/drm/xe/xe_tile.c                  |  10 +-
> > >  drivers/gpu/drm/xe/xe_tile_sysfs.c            |   1 +
> > >  drivers/gpu/drm/xe/xe_ttm_sys_mgr.c           |   5 +-
> > >  drivers/gpu/drm/xe/xe_vm.c                    |   7 -
> > >  37 files changed, 445 insertions(+), 391 deletions(-)
> > > 
> > > -- 
> > > 2.43.0
> > > 


