[PATCH 19/21] drm/xe/eudebug: Implement vm_bind_op discovery
Matthew Brost
matthew.brost at intel.com
Sat Jul 27 04:39:46 UTC 2024
On Fri, Jul 26, 2024 at 05:08:16PM +0300, Mika Kuoppala wrote:
> Follow the vm_bind, vm_bind_op event sequence when discovering
> a vm and the vmas it contains. Send events for the ops and attach
> metadata if available.
>
> v2: Fix bad op ref seqno (Christoph)
>
> Signed-off-by: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> ---
> drivers/gpu/drm/xe/xe_eudebug.c | 40 +++++++++++++++++++++++++++++++++
> 1 file changed, 40 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_eudebug.c b/drivers/gpu/drm/xe/xe_eudebug.c
> index f0bec8c8bba1..a80ffb64df15 100644
> --- a/drivers/gpu/drm/xe/xe_eudebug.c
> +++ b/drivers/gpu/drm/xe/xe_eudebug.c
> @@ -3070,6 +3070,42 @@ void xe_eudebug_debug_metadata_destroy(struct xe_file *xef, struct xe_debug_meta
>  	xe_eudebug_event_put(d, debug_metadata_destroy_event(d, xef, m));
> }
>
> +static int vm_discover_binds(struct xe_eudebug *d, struct xe_vm *vm)
> +{
> +	struct drm_gpuva *va;
> +	unsigned int num_ops = 0, send_ops = 0;
> +	u64 ref_seqno = 0;
> +	int err;
> +
> +	/* Currently only vm_bind_ioctl inserts vma's */
> +	drm_gpuvm_for_each_va(va, &vm->gpuvm)
This is not safe without holding at least vm->lock in read mode.

This function is called with xef->vm.lock & xe->files.lock held, and we
likely don't want to create a locking dependency chain between those
locks and vm->lock either.
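For illustration, holding vm->lock in read mode around the walk would
look roughly like this (a sketch only, assuming vm->lock is the
rw_semaphore in struct xe_vm that protects the GPUVM state):

	/* serialize against concurrent bind/unbind updating the gpuvm */
	down_read(&vm->lock);
	drm_gpuvm_for_each_va(va, &vm->gpuvm)
		num_ops++;
	up_read(&vm->lock);

But taking vm->lock here is exactly the dependency chain mentioned
above, so it's probably not the way to go.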
Hmm... How about this: when discovery_work_fn is called, wait for
in-flight IOCTLs to complete and pause new IOCTLs until discovery is
done. We should be able to hook into xe_drm_ioctl to do this. Then we
wouldn't need the xef->vm.lock & xe->files.lock locks either.
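Something along these lines, just to sketch the idea (the
discovery_lock rwsem and where it lives are hypothetical, and the
existing runtime PM handling in xe_drm_ioctl is elided):

	/* IOCTLs enter as readers of a hypothetical device-level rwsem */
	static long xe_drm_ioctl(struct file *file, unsigned int cmd,
				 unsigned long arg)
	{
		struct drm_file *file_priv = file->private_data;
		struct xe_device *xe = to_xe_device(file_priv->minor->dev);
		long ret;

		down_read(&xe->eudebug_discovery_lock);
		ret = drm_ioctl(file, cmd, arg);
		up_read(&xe->eudebug_discovery_lock);

		return ret;
	}

and in discovery_work_fn:

	/* waits for in-flight IOCTLs, blocks new ones until discovery is done */
	down_write(&xe->eudebug_discovery_lock);
	/* ... walk the clients without xef->vm.lock / xe->files.lock ... */
	up_write(&xe->eudebug_discovery_lock);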
Also, to be clear, xef->exec_queue.lock & xef->vm.lock are not designed
to be held while doing anything underneath except looking something up
in the xarray and taking a reference to the object. Any other use is
abusing these locks and asking for trouble.

e.g. we do things like this:
	mutex_lock(&xef->exec_queue.lock);
	q = xa_load(&xef->exec_queue.xa, id);
	if (q)
		xe_exec_queue_get(q);
	mutex_unlock(&xef->exec_queue.lock);
If you need to use xa_for_each, I'd expect it to look like this:
	mutex_lock(&xef->exec_queue.lock);
	xa_for_each(&xef->exec_queue.xa, i, q) {
		xe_exec_queue_get(q);
		mutex_unlock(&xef->exec_queue.lock);

		/* Do something with exec queue */

		xe_exec_queue_put(q);
		mutex_lock(&xef->exec_queue.lock);
	}
	mutex_unlock(&xef->exec_queue.lock);
It is perfectly fine in the above loop for an exec queue to be removed
from the xa; xa_for_each is safe for that. The lock protects the lookup
and the reference, nothing more.
Does this make sense? Feel free to ping me directly to discuss further.
Matt
> +		num_ops++;
> +
> +	if (!num_ops)
> +		return 0;
> +
> +	err = vm_bind_event(d, vm, num_ops, &ref_seqno);
> +	if (err)
> +		return err;
> +
> +	drm_gpuvm_for_each_va(va, &vm->gpuvm) {
> +		struct xe_vma *vma = container_of(va, struct xe_vma, gpuva);
> +
> +		if (send_ops >= num_ops)
> +			break;
> +
> +		err = vm_bind_op(d, vm, DRM_XE_EUDEBUG_EVENT_CREATE, ref_seqno,
> +				 xe_vma_start(vma), xe_vma_size(vma),
> +				 &vma->debug_metadata);
> +		if (err)
> +			return err;
> +
> +		send_ops++;
> +	}
> +
> +	return num_ops == send_ops ? 0 : -EINVAL;
> +}
> +
>  static int discover_client(struct xe_eudebug *d, struct xe_file *xef)
>  {
>  	struct xe_debug_metadata *m;
> @@ -3095,6 +3131,10 @@ static int discover_client(struct xe_eudebug *d, struct xe_file *xef)
>  		err = vm_create_event(d, xef, vm);
>  		if (err)
>  			break;
> +
> +		err = vm_discover_binds(d, vm);
> +		if (err)
> +			break;
>  	}
>  	mutex_unlock(&xef->vm.lock);
>
> --
> 2.34.1
>