[PATCH] drm/xe/debugfs: Display gpuva info in debugfs

Matthew Brost matthew.brost at intel.com
Fri Sep 20 18:43:22 UTC 2024


On Fri, Sep 20, 2024 at 09:57:19AM -0600, Zeng, Oak wrote:
> 
> 
> > -----Original Message-----
> > From: Brost, Matthew <matthew.brost at intel.com>
> > Sent: Thursday, September 19, 2024 3:02 PM
> > To: Zeng, Oak <oak.zeng at intel.com>
> > Cc: intel-xe at lists.freedesktop.org
> > Subject: Re: [PATCH] drm/xe/debugfs: Display gpuva info in debugfs
> > 
> > On Wed, Sep 18, 2024 at 12:45:45PM -0400, Oak Zeng wrote:
> > > Add a debugfs entry to display all the gpuvas that are bound to the
> > > device. This is done by walking all the VMs created for each device
> > > process and displaying the VA information of each VM. This is useful
> > > for gpuvm debugging. With this change, the user can display gpuva info
> > > as below:
> > >
> > > root at DUT7279PVC:/home/gta# cat /sys/kernel/debug/dri/0/gpuvas
> > > Process "your process name" GPU VA space
> > > DRM GPU VA space (Xe VM) [0x0000000000000000;0x0200000000000000]
> > > Kernel reserved node [0x0000000000000000;0x0000000000000000]
> > >
> > >  VAs | start              | range              | end                | object             | object offset
> > > -------------------------------------------------------------------------------------------------------------
> > >      | 0x0000000000000000 | 0x00007ffff5db7000 | 0x00007ffff5db7000 | 0x0000000000000000 | 0x0000000000000000
> > >      | 0x00007ffff5db7000 | 0x0000000000001000 | 0x00007ffff5db8000 | 0x0000000000000000 | 0x00007ffff5db7000
> > >      | 0x00007ffff5db8000 | 0x00ff80000a248000 | 0x0100000000000000 | 0x0000000000000000 | 0x0000000000000000
> > >
> > 
> > This looks useful. We may even want to extend wedged mode to switch the
> > VM to a state where this is available after a hang which causes a wedge,
> > as this could be useful. Can be done in a follow-up later.
> > 
> > > Signed-off-by: Oak Zeng <oak.zeng at intel.com>
> > > ---
> > >  drivers/gpu/drm/xe/xe_debugfs.c | 44 +++++++++++++++++++++++++++++++++
> > >  1 file changed, 44 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> > > index 1011e5d281fa..0c7bee1c2a8d 100644
> > > --- a/drivers/gpu/drm/xe/xe_debugfs.c
> > > +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> > > @@ -9,6 +9,7 @@
> > >  #include <linux/string_helpers.h>
> > >
> > >  #include <drm/drm_debugfs.h>
> > > +#include <drm/drm_drv.h>
> > >
> > >  #include "xe_bo.h"
> > >  #include "xe_device.h"
> > > @@ -84,9 +85,52 @@ static int sriov_info(struct seq_file *m, void *data)
> > >  	return 0;
> > >  }
> > >
> > > +static int show_vm_gpuvas(struct xe_vm *vm, struct seq_file *m)
> > > +{
> > > +	int ret;
> > > +
> > > +	mutex_lock(&vm->snap_mutex);
> > > +	ret = drm_debugfs_gpuva_info(m, &vm->gpuvm);
> > > +	mutex_unlock(&vm->snap_mutex);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static int show_each_vm(struct seq_file *m, void *arg)
> > > +{
> > > +	struct drm_info_node *node = (struct drm_info_node *)m->private;
> > > +	struct xe_device *xe = node_to_xe(node);
> > > +	struct drm_device *dev = &xe->drm;
> > > +	struct drm_file *file;
> > > +	struct xe_file *xef;
> > > +	int (*show)(struct xe_vm *, struct seq_file *) = node->info_ent->data;
> > > +	struct xe_vm *vm;
> > > +	unsigned long idx;
> > > +	int ret = 0;
> > > +
> > > +	mutex_lock(&dev->filelist_mutex);
> > > +	list_for_each_entry(file, &dev->filelist, lhead) {
> > > +		xef = (struct xe_file *)file->driver_priv;
> > > +		seq_printf(m, "Process %s GPU VA space\n", xef->process_name);
> > > +		mutex_lock(&xef->vm.lock);
> > > +		xa_for_each(&xef->vm.xa, idx, vm) {
> > 
> > Kinda a nit but I've said this in a few other places - the xef->vm.lock
> > usage isn't right here. This lock protects the xarray lookup and the ref
> > to the VM and nothing else. It is not designed to do anything else -
> > e.g. holding the lock and then calling another function.
> 
> I don't see a problem with calling a function while holding xef->vm.lock. We
> require the VM to be alive while calling show_vm_gpuvas, and xef->vm.lock
> guarantees the VM is alive because it guarantees the VM is on the xarray, and
> as long as it is on the xarray we hold a VM ref.
> 

Your code works; this is about having well-defined locks and not abusing
them.

To repeat - this lock protects the lookup and taking a ref. It is not
intended to be held while doing other things underneath it, and certainly
not intended to be an outer lock of sorts.

> > 
> > With that this loop should look like this.
> > 
> > mutex_lock(&xef->vm.lock);
> > xa_for_each(&xef->vm.xa, idx, vm) {
> 
> I don't think xa_for_each is safe against removing items from the xarray,
> at least that is my understanding.
> 

It is [1]. The lock protects the ref to the VM so that it is not
immediately freed after lookup; it does not protect the lookup itself, as
the xarray has its own locking for that.

[1] https://elixir.bootlin.com/linux/v6.11/source/include/linux/xarray.h#L491

> > 	xe_vm_get(vm);
> > 	mutex_unlock(&xef->vm.lock);
> > 
> > 	/* Do something */
> > 
> > 	mutex_lock(&xef->vm.lock);
> > 	xe_vm_put(vm);
> > }
> > mutex_unlock(&xef->vm.lock);
> > 
> > This avoids unintentionally creating locking chains, in this case
> > xef->vm.lock -> snap_mutex. We should strive to only create well defined
> > chains of locking. Once we start unintentionally creating locking
> > chains it is hard to unwind in a driver.
> 
> Can we annotate the xef->vm.lock -> snap_mutex locking order?
> It is reasonable to me from the locking hierarchy here: xef->vm.lock
> is at the xe file level, and snap_mutex is at the VM level. We can create
> multiple VMs for one xe file, so the locking order seems natural to me.
> 
> The key point is to not be afraid of creating a locking chain, but instead
> to create the locking chain in a reasonable order. Fortunately, the locking
> order in this case is very obvious.
> 

We could annotate it but I'd rather just avoid a chain. In general,
creating chains is a bad idea and a great way to end up with locking that
can't be understood or that hits inversions.

Again, this works, but it is fairly easy to avoid creating this chain, so
let's not do it.

The only place we really have chains in Xe is in memory management code
(VM and dma-resv locks), where they are absolutely required and very well
defined.

> > 
> > Likewise I suspect we really shouldn't be holding dev->filelist_mutex
> > and doing things underneath it. Fixing that would likely require
> > converting &dev->filelist to an xarray or writing an iterator with a
> > hitch, though, so it is a bit of a larger change. Looking at the usage of
> > this lock, I'd say this is kinda abusing this lock too and I don't think
> > anyone in the community would be all that happy about its usage here.
> > I'd lean towards properly fixing this, or alternatively maybe just
> > maintaining a per-device storage of all the VMs.
> 
> Take a look at the drm_client_info function. Holding filelist_mutex and doing

I don't see that function [2].

[2] https://elixir.bootlin.com/linux/v6.11/A/ident/drm_client_info

I see drm_clients_info...

That function does take fpriv->master_lookup_lock in
drm_is_current_master, but IMO that is well defined and within the DRM
layer too. We also shouldn't take a lock from a different layer and make
our own rules about what we can do with it.

I guess AMD and vmwgfx [3] take this lock in kinda similar ways but IMO
that is a bit abusive, in particular this usage [4].

[3] https://elixir.bootlin.com/linux/v6.11/A/ident/filelist_mutex
[4] https://elixir.bootlin.com/linux/v6.11/source/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c#L1771

> things underneath, including holding another lock, seems fine. That code has
> been there for many years without a problem. It is just natural to hold

Again I think it works but that is not my point.

> filelist_mutex to walk the device's file list, as this lock is designed to protect the
> file list. I don't think it is an abuse of this lock.
> 
> So I think it is reasonable to annotate a dev filelist_mutex -> xef vm.lock -> vm snap_mutex
> locking order. This is the same order as creation: device -> device file descriptor for the
> device -> gpuvm for a device file descriptor. Any reverse-order locking is not reasonable
> and is thus a kernel driver bug.
>

I disagree here and think it is easy to have a well-defined xarray + lock
of VMs attached to the device with the same semantics as xef->vm.lock.

Now that I'm looking, we already have that: xe_device.usm.asid*. Maybe
just repurpose that structure / lock to be non-usm specific (e.g. always
assign a VM an ASID and insert it into the xarray). Your loop here would
then look like the one described in my previous reply, taking a ref to the
VM and then dropping the lock for the main part of the implementation.
That seems better, roughly as sketched below.
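
Rough, untested sketch of what I mean - xe->vms.xa / xe->vms.lock are
placeholder names for whatever we end up with (a repurposed usm structure
or a new device-level xarray + lock with the same lookup-and-ref semantics
as xef->vm.lock):

static int show_each_vm(struct seq_file *m, void *arg)
{
	struct drm_info_node *node = m->private;
	struct xe_device *xe = node_to_xe(node);
	int (*show)(struct xe_vm *, struct seq_file *) = node->info_ent->data;
	struct xe_vm *vm;
	unsigned long idx;
	int ret = 0;

	/* Lock only protects the lookup and taking a ref, drop it for the show */
	mutex_lock(&xe->vms.lock);
	xa_for_each(&xe->vms.xa, idx, vm) {
		xe_vm_get(vm);
		mutex_unlock(&xe->vms.lock);

		ret = show(vm, m);
		seq_puts(m, "\n");

		mutex_lock(&xe->vms.lock);
		xe_vm_put(vm);
		if (ret < 0)
			break;
	}
	mutex_unlock(&xe->vms.lock);

	return ret;
}

The per-process header the current patch prints would have to be derived
from the VM (or dropped), but this way snap_mutex never nests under any
other driver lock.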

Matt

> Oak
> 
> > 
> > Matt
> > 
> > > +			ret = show(vm, m);
> > > +			if (ret < 0)
> > > +				break;
> > > +
> > > +			seq_puts(m, "\n");
> > > +		}
> > > +		mutex_unlock(&xef->vm.lock);
> > > +	}
> > > +	mutex_unlock(&dev->filelist_mutex);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > >  static const struct drm_info_list debugfs_list[] = {
> > >  	{"info", info, 0},
> > >  	{ .name = "sriov_info", .show = sriov_info, },
> > > +	DRM_DEBUGFS_GPUVA_INFO(show_each_vm, show_vm_gpuvas),
> > >  };
> > >
> > >  static int forcewake_open(struct inode *inode, struct file *file)
> > > --
> > > 2.26.3
> > >

