[PATCH] drm/xe/debugfs: Display gpuva info in debugfs

Matthew Brost matthew.brost at intel.com
Fri Sep 20 20:45:17 UTC 2024


On Fri, Sep 20, 2024 at 02:03:54PM -0600, Zeng, Oak wrote:
> 
> 
> > -----Original Message-----
> > From: Brost, Matthew <matthew.brost at intel.com>
> > Sent: Friday, September 20, 2024 2:43 PM
> > To: Zeng, Oak <oak.zeng at intel.com>
> > Cc: intel-xe at lists.freedesktop.org
> > Subject: Re: [PATCH] drm/xe/debugfs: Display gpuva info in debugfs
> > 
> > On Fri, Sep 20, 2024 at 09:57:19AM -0600, Zeng, Oak wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Brost, Matthew <matthew.brost at intel.com>
> > > > Sent: Thursday, September 19, 2024 3:02 PM
> > > > To: Zeng, Oak <oak.zeng at intel.com>
> > > > Cc: intel-xe at lists.freedesktop.org
> > > > Subject: Re: [PATCH] drm/xe/debugfs: Display gpuva info in
> > debugfs
> > > >
> > > > On Wed, Sep 18, 2024 at 12:45:45PM -0400, Oak Zeng wrote:
> > > > > Add debugfs entry to display all the gpuva that is bound to the
> > > > > device. This is done by walking all the VMs created for each
> > device
> > > > > process, and display va information of each VM. This is useful for
> > > > > gpuvm debugging. With this change, user can display gpuva info
> > > > > as below:
> > > > >
> > > > > root at DUT7279PVC:/home/gta# cat /sys/kernel/debug/dri/0/gpuvas
> > > > > Process "your process name" GPU VA space
> > > > > DRM GPU VA space (Xe VM) [0x0000000000000000;0x0200000000000000]
> > > > > Kernel reserved node [0x0000000000000000;0x0000000000000000]
> > > > >
> > > > >  VAs | start              | range              | end                | object             | object offset
> > > > > -------------------------------------------------------------------------------------------------------------
> > > > >      | 0x0000000000000000 | 0x00007ffff5db7000 | 0x00007ffff5db7000 | 0x0000000000000000 | 0x0000000000000000
> > > > >      | 0x00007ffff5db7000 | 0x0000000000001000 | 0x00007ffff5db8000 | 0x0000000000000000 | 0x00007ffff5db7000
> > > > >      | 0x00007ffff5db8000 | 0x00ff80000a248000 | 0x0100000000000000 | 0x0000000000000000 | 0x0000000000000000
> > > > >
> > > >
> > > > This looks useful. We may even want to extend wedged mode to
> > > > switch the VM to a state where this is available after a hang which
> > > > causes a wedge, as this could be useful. Can be done in a follow-up
> > > > later.
> > > >
> > > > > Signed-off-by: Oak Zeng <oak.zeng at intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/xe/xe_debugfs.c | 44 +++++++++++++++++++++++++++++++++
> > > > >  1 file changed, 44 insertions(+)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/xe/xe_debugfs.c b/drivers/gpu/drm/xe/xe_debugfs.c
> > > > > index 1011e5d281fa..0c7bee1c2a8d 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_debugfs.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_debugfs.c
> > > > > @@ -9,6 +9,7 @@
> > > > >  #include <linux/string_helpers.h>
> > > > >
> > > > >  #include <drm/drm_debugfs.h>
> > > > > +#include <drm/drm_drv.h>
> > > > >
> > > > >  #include "xe_bo.h"
> > > > >  #include "xe_device.h"
> > > > > @@ -84,9 +85,52 @@ static int sriov_info(struct seq_file *m, void *data)
> > > > >  	return 0;
> > > > >  }
> > > > >
> > > > > +static int show_vm_gpuvas(struct xe_vm *vm, struct seq_file *m)
> > > > > +{
> > > > > +	int ret;
> > > > > +
> > > > > +	mutex_lock(&vm->snap_mutex);
> > > > > +	ret = drm_debugfs_gpuva_info(m, &vm->gpuvm);
> > > > > +	mutex_unlock(&vm->snap_mutex);
> > > > > +
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > > +static int show_each_vm(struct seq_file *m, void *arg)
> > > > > +{
> > > > > +	struct drm_info_node *node = (struct drm_info_node *)m->private;
> > > > > +	struct xe_device *xe = node_to_xe(node);
> > > > > +	struct drm_device *dev = &xe->drm;
> > > > > +	struct drm_file *file;
> > > > > +	struct xe_file *xef;
> > > > > +	int (*show)(struct xe_vm *, struct seq_file *) = node->info_ent->data;
> > > > > +	struct xe_vm *vm;
> > > > > +	unsigned long idx;
> > > > > +	int ret = 0;
> > > > > +
> > > > > +	mutex_lock(&dev->filelist_mutex);
> > > > > +	list_for_each_entry(file, &dev->filelist, lhead) {
> > > > > +		xef = (struct xe_file *)file->driver_priv;
> > > > > +		seq_printf(m, "Process %s GPU VA space\n", xef->process_name);
> > > > > +		mutex_lock(&xef->vm.lock);
> > > > > +		xa_for_each(&xef->vm.xa, idx, vm) {
> > > >
> > > > Kinda a nit but I've said this in a few other places - the
> > > > xef->vm.lock read usage isn't right here. This lock protects the
> > > > xarray lookup and ref to the VM and nothing else. It is not designed
> > > > to do anything else - e.g. holding the lock and then calling another
> > > > function.
> > >
> > > I don't see a problem calling a function holding xef->vm.lock. We
> > > require the VM to be alive while calling show_vm_gpuvas, and
> > > xef->vm.lock guarantees the VM is alive because it guarantees the VM
> > > is on the xarray, and as long as it is on the xarray we hold a VM ref.
> > >
> > 
> > Your code works; this is about having well-defined locks and not
> > abusing them.
> > 
> > To repeat - this lock protects the lookup and taking a ref. It is not
> > intended to be taken and have other things done underneath it, and
> > certainly not intended to be an outer lock of sorts.
> > 
> > > >
> > > > With that, this loop should look like this:
> > > >
> > > > mutex_lock(&xef->vm.lock);
> > > > xa_for_each(&xef->vm.xa, idx, vm) {
> > >
> > > I don't think xa_for_each is safe against removing items from the
> > > xarray, at least that is my understanding.
> > >
> > 
> > It is [1]. The lock protects the ref to the VM so that after lookup it
> > is not immediately freed; it does not protect the lookup itself, the
> > xarray has its own locking for that.
> 
> You are right. Now I agree xa_for_each is safe against xarray modification.
> 
> > 
> > [1] https://elixir.bootlin.com/linux/v6.11/source/include/linux/xarray.h#L491
> > 
> > > > 	xe_vm_get(vm);
> > > > 	mutex_unlock(&xef->vm.lock);
> > > >
> > > > 	/* Do something */
> > > >
> > > > 	mutex_lock(&xef->vm.lock);
> > > > 	xe_vm_put(vm);
> > > > }
> > > > mutex_unlock(&xef->vm.lock);
> > > >
> > > > This avoids unintentionally creating locking chains, in this case
> > > > xef->vm.lock -> snap_mutex. We should strive to only create
> > > > well-defined chains of locking. Once we start unintentionally
> > > > creating locking chains they are hard to unwind in a driver.
> > >
> > > Can we annotate the xef->vm.lock -> snap_mutex locking order?
> > > It is reasonable to me from the locking hierarchy here: xef->vm.lock
> > > is at the xe file level, and snap_mutex is at the VM level. We can
> > > create multiple VMs for one xe file, so the locking order seems
> > > natural to me.
> > >
> > > The key point is not to be afraid of creating a locking chain, but to
> > > create the locking chain in a reasonable order. Fortunately, the
> > > locking order in this case is very obvious.
> > >
> > 
> > We could annotate it but I'd rather just avoid a chain. In general
> > creating chains is a bad idea and a great way to end up with locking
> > that can't be understood or that gets inversions.
> > 
> > Again, this works, but it is fairly easy to avoid creating this chain
> > so let's not do it.
> 
> If you think this is better, I can change the code as you indicated. Not a big deal.
> 

That would be for the best.
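
For clarity, this is roughly what the inner part of show_each_vm would
become - an untested sketch, just applying the xe_vm_get() / xe_vm_put()
pattern quoted above:

	mutex_lock(&xef->vm.lock);
	xa_for_each(&xef->vm.xa, idx, vm) {
		xe_vm_get(vm);
		mutex_unlock(&xef->vm.lock);

		/* snap_mutex is now taken without xef->vm.lock held */
		ret = show(vm, m);
		if (ret >= 0)
			seq_puts(m, "\n");

		mutex_lock(&xef->vm.lock);
		xe_vm_put(vm);

		if (ret < 0)
			break;
	}
	mutex_unlock(&xef->vm.lock);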

> > 
> > The only place we really have chains in Xe is in memory management
> > code (VM and dma-resv locks), which are absolutely required and very
> > well defined.
> > 
> > > >
> > > > Likewise I suspect we really shouldn't be holding
> > > > dev->filelist_mutex and doing things underneath it. Fixing that
> > > > would likely require converting &dev->filelist to an xarray or
> > > > writing an iterator with a hitch though, so it is a bit of a larger
> > > > change. Looking at the usage of this lock, I'd say this is kinda
> > > > abusing this lock too and I don't think anyone in the community
> > > > would be all that happy about its usage here. I'd lean towards
> > > > properly fixing this, or alternatively maybe just maintaining
> > > > per-device storage of all the VMs.
> > >
> > > Take a look at the drm_client_info function. Holding filelist_mutex
> > > and doing
> > 
> > I don't see that function [2].
> > 
> > [2] https://elixir.bootlin.com/linux/v6.11/A/ident/drm_client_info
> > 
> > I see drm_clients_info...
> 
> 
> Sorry it was a typo. I meant drm_clients_info. It is here:
> 
> https://elixir.bootlin.com/linux/v6.11/source/drivers/gpu/drm/drm_debugfs.c#L73
> 
> This function not only takes a master_lookup_lock, but also a rcu_read_lock.
>

That's a generic thing for dereferencing RCU-protected pointers, which is
required to look at the PID I think. The PID is very tied to the file so
it makes sense.
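
For reference, the pattern there is just an RCU read-side section around
the pid dereference - roughly something like this, from memory, not the
exact drm_clients_info code:

	struct task_struct *task;

	rcu_read_lock();	/* file->pid is an RCU-protected pointer */
	task = pid_task(rcu_dereference(file->pid), PIDTYPE_TGID);
	/* task can be NULL if the process has already exited */
	seq_printf(m, "%s\n", task ? task->comm : "<unknown>");
	rcu_read_unlock();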

> And it does a lot of things underneath.
> 

Not really, it prints some things related to the file / PID.

> Same behavior is demonstrated by amdgpu_debugfs_vm_info_show.
> 

Yeah, my previous reply covers that - that looks wrong to me.

> They are all correct usages of the lock. Very natural and straightforward
> to me. Not at all abusive.
> 
> So why do you think differently? By carefully reading your words, I found
> there is a very subtle difference in our understanding of the locks.
> 
> "This lock protects the xarray lookup" - I understand it differently. I
> think the lock protects the xarray itself, not the lookup. Any
> modification or walking of the xarray needs to hold this lock.

No, I covered what the intent of the lock is in my previous reply. FWIW
this is the third time I've said this to people in the group. I likely
should update the documentation to avoid confusion here.
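
Something along these lines on the xef->vm struct would probably do it -
the wording is just a suggestion, based on what I remember the struct
looking like:

	struct {
		/** @vm.xa: xarray of VMs created by this client */
		struct xarray xa;
		/**
		 * @vm.lock: Protects looking a VM up in @vm.xa and taking
		 * a reference on it (plus insert / remove). Not intended
		 * to be held while calling into the VM, i.e. it is not an
		 * outer lock for VM operations.
		 */
		struct mutex lock;
	} vm;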

> 
> Those locks protect variables or data structures, not actions.
> 
> So in my original code, by using those locks in my original way, I am
> basically saying: I am walking all the files in a device and all the VMs
> in a file, so nobody is allowed to modify the file list or the VM xarray.
> 

I understand what the code is doing and it does work. Again, the point is
that this lock is not designed to have things done underneath it. Your
code works and yes it is a very small issue, but if we start doing things
that 'work' all over the driver for convenience we eventually get an
unmaintainable mess. We should strive for doing what is correct over
convenience.

> (Of course you could argue there could be an efficiency issue because you
> hold the lock for a long time. But this is just debugfs, where you usually
> don't care much about efficiency. Of course this is also not the argument
> here.)
> 

We don't care about efficiency here.

> I think we should first agree on this point. So I am not going to reply to
> what you said below.
>

My position is quite clear on this, and as stated this is the third time I
have pushed back on usage of locks like this. I proposed another viable
solution without any new locking chains or reaching into another layer's
locking; I would prefer we went with that.
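
To spell out that alternative, a rough sketch - the field names here are
hypothetical, and as mentioned below this could just as well be a
repurposed xe_device.usm.asid xarray:

	/* in struct xe_device */
	struct {
		/** @vms.xa: all VMs bound to this device */
		struct xarray xa;
		/** @vms.lock: protects lookup in @vms.xa + taking a VM ref */
		struct mutex lock;
	} vms;

	/* debugfs iterator, no dev->filelist_mutex involved */
	mutex_lock(&xe->vms.lock);
	xa_for_each(&xe->vms.xa, idx, vm) {
		xe_vm_get(vm);
		mutex_unlock(&xe->vms.lock);

		ret = show(vm, m);

		mutex_lock(&xe->vms.lock);
		xe_vm_put(vm);
		if (ret < 0)
			break;
	}
	mutex_unlock(&xe->vms.lock);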

Matt
 
> Oak
> 
> > 
> > That function does take fpriv->master_lookup_lock in
> > drm_is_current_master but that IMO is well defined and within the DRM
> > layer too. We also shouldn't take a lock from a different layer and
> > make up rules about what we can do with it.
> > 
> > I guess AMD and vmwgfx [3] take this lock in kinda similar ways but IMO
> > that is a bit abusive, in particular this usage [4].
> > 
> > [3] https://elixir.bootlin.com/linux/v6.11/A/ident/filelist_mutex
> > [4] https://elixir.bootlin.com/linux/v6.11/source/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c#L1771
> > 
> > > things underneath, including holding another lock, seems fine. That
> > > code has been there for many years without a problem. It is just
> > > natural to hold
> > 
> > Again I think it works but that is not my point.
> > 
> > > filelist_mutex to walk the device's file list, as this lock is
> > > designed to protect the file list. I don't think it is an abuse of
> > > this lock.
> > >
> > > So I think it is reasonable to annotate a dev filelist_mutex ->
> > > xef vm.lock -> vm snap_mutex locking order. This is the same order as
> > > creating a device -> a device file descriptor for the device -> a
> > > gpuvm for a device file descriptor. Any reverse-order locking is not
> > > reasonable and thus a kernel driver bug.
> > >
> > 
> > I disagree here and think it is easy to have a well-defined xarray +
> > lock of VMs attached to the device with the same semantics as
> > xef->vm.lock.
> > 
> > Now that I'm looking, we already have that in xe_device.usm.asid*.
> > Maybe just repurpose that structure / lock to be non-USM specific
> > (e.g. always assign a VM an ASID and insert it into it). Your loop here
> > would then look like the one described in my previous reply, taking a
> > ref to the VM and then dropping the lock for the main part of the
> > implementation. That seems better.
> > 
> > Matt
> > 
> > > Oak
> > >
> > > >
> > > > Matt
> > > >
> > > > > +			ret = show(vm, m);
> > > > > +			if (ret < 0)
> > > > > +				break;
> > > > > +
> > > > > +			seq_puts(m, "\n");
> > > > > +		}
> > > > > +		mutex_unlock(&xef->vm.lock);
> > > > > +	}
> > > > > +	mutex_unlock(&dev->filelist_mutex);
> > > > > +
> > > > > +	return ret;
> > > > > +}
> > > > > +
> > > > >  static const struct drm_info_list debugfs_list[] = {
> > > > >  	{"info", info, 0},
> > > > >  	{ .name = "sriov_info", .show = sriov_info, },
> > > > > +	DRM_DEBUGFS_GPUVA_INFO(show_each_vm, show_vm_gpuvas),
> > > > >  };
> > > > >
> > > > >  static int forcewake_open(struct inode *inode, struct file *file)
> > > > > --
> > > > > 2.26.3
> > > > >

