[PATCH] drm/xe: Return -ENOBUFS if a kmalloc tied to an array of binds fails

Matthew Brost matthew.brost at intel.com
Sat Jul 20 23:20:37 UTC 2024


On Sat, Jul 20, 2024 at 11:14:36PM +0000, Matthew Brost wrote:
> On Sun, Jul 21, 2024 at 12:34:27AM +0530, Ghimiray, Himal Prasad wrote:
> > 
> > 
> > On 19-07-2024 22:53, Matthew Brost wrote:
> > > The size of an array of binds directly determines the size of several
> > > kmalloc allocations in the KMD, making those allocations more likely to
> > > fail. Return -ENOBUFS when these allocations fail.
> > > 
> > > The expected UMD behavior upon receiving -ENOBUFS is to split the
> > > array of binds into a series of single binds.
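
For illustration, a minimal sketch of that UMD-side fallback follows. This
is my own sketch, not code from Mesa's MR; the helper name and the way the
bind ops array is passed alongside args are assumptions:

    #include <errno.h>
    #include <stdint.h>
    #include <xf86drm.h>   /* drmIoctl() */
    #include "xe_drm.h"    /* struct drm_xe_vm_bind, DRM_IOCTL_XE_VM_BIND */

    /*
     * Hypothetical helper: submit an array of binds; if the KMD answers
     * -ENOBUFS, split the array into a series of single binds and retry.
     * Sync object handling across the split is deliberately elided.
     */
    static int vm_bind_with_split(int fd, struct drm_xe_vm_bind *args,
                                  const struct drm_xe_vm_bind_op *ops)
    {
            uint32_t i;

            /* args->vector_of_binds is expected to point at ops[] here */
            if (!drmIoctl(fd, DRM_IOCTL_XE_VM_BIND, args))
                    return 0;
            if (errno != ENOBUFS || args->num_binds <= 1)
                    return -errno;

            for (i = 0; i < args->num_binds; i++) {
                    struct drm_xe_vm_bind single = *args;

                    single.num_binds = 1;
                    single.bind = ops[i];   /* a single bind is inlined */

                    if (drmIoctl(fd, DRM_IOCTL_XE_VM_BIND, &single))
                            return -errno;
            }

            return 0;
    }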
> > 
> > Would it be appropriate to have some documentation or guidelines, in
> > the form of a drm_err message or kernel doc, regarding the expected UMD
> > behavior if the ioctl returns an -ENOBUFS error?
> > 
> 
> Yes, this is on the todo list as part of the error handling cleanup for
> both the exec and bind IOCTLs. I think the kernel doc should go in
> xe_drm.h with a list of the errnos returned and the expected UMD actions.
> Eventually I'd like to get this in place for all IOCTLs. I was going to
> work on getting the exec and bind IOCTLs fixed up in the next couple of
> weeks (we have an internal doc of the required changes) to have it ready
> for when Thomas is back (2 more weeks).
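
As a strawman, the kernel doc in xe_drm.h could take roughly this shape
(wording is mine, nothing has been agreed on yet):

    /**
     * DOC: XE_VM_BIND errors (strawman)
     *
     *  -ENOBUFS: an allocation whose size scales with the array of binds
     *            failed, or the resulting batch buffer was too large.
     *            Expected UMD action: split the array of binds into a
     *            series of single binds and retry.
     *  -ENOMEM:  an allocation failure not tied to the input size;
     *            splitting the array will not help.
     */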
> 
> I made this change because we already return -ENOBUFS at a different
> failure point for arrays of binds (the BB being too large, see
> xe_migrate.c), so we might as well finish up this error code to make it

Also see xe_sa.c, search for -ENOBUFS.
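
Roughly the shape of that existing path, paraphrased from memory rather
than quoted verbatim (the exact calls and limits live in xe_migrate.c and
xe_sa.c):

    /*
     * Paraphrased: the suballocator backing the bind batch buffer has a
     * fixed pool size, so a BB sized by a large array of binds can fail
     * with -ENOBUFS, telling the UMD to split the array.
     */
    bb = xe_bb_new(gt, batch_size, usm);
    if (IS_ERR(bb))
            return ERR_CAST(bb);   /* -ENOBUFS when batch_size is too big */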

> complete, as it is a fairly simple change. Also, Mesa has an MR [1] in
> flight to handle -ENOBUFS situations.
> 
> Matt
> 
> [1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30276 
> 
> > > 
> > > Cc: Paulo Zanoni <paulo.r.zanoni at intel.com>
> > > Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> > > ---
> > >   drivers/gpu/drm/xe/xe_vm.c | 12 ++++++------
> > >   1 file changed, 6 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
> > > index 3fde2c8292ad..b715883f40d8 100644
> > > --- a/drivers/gpu/drm/xe/xe_vm.c
> > > +++ b/drivers/gpu/drm/xe/xe_vm.c
> > > @@ -718,7 +718,7 @@ int xe_vm_userptr_check_repin(struct xe_vm *vm)
> > >   		list_empty_careful(&vm->userptr.invalidated)) ? 0 : -EAGAIN;
> > >   }
> > > -static int xe_vma_ops_alloc(struct xe_vma_ops *vops)
> > > +static int xe_vma_ops_alloc(struct xe_vma_ops *vops, bool array_of_binds)
> > >   {
> > >   	int i;
> > > @@ -731,7 +731,7 @@ static int xe_vma_ops_alloc(struct xe_vma_ops *vops)
> > >   				      sizeof(*vops->pt_update_ops[i].ops),
> > >   				      GFP_KERNEL);
> > >   		if (!vops->pt_update_ops[i].ops)
> > > -			return -ENOMEM;
> > > +			return array_of_binds ? -ENOBUFS : -ENOMEM;
> > >   	}
> > >   	return 0;
> > > @@ -824,7 +824,7 @@ int xe_vm_rebind(struct xe_vm *vm, bool rebind_worker)
> > >   			goto free_ops;
> > >   	}
> > > -	err = xe_vma_ops_alloc(&vops);
> > > +	err = xe_vma_ops_alloc(&vops, false);
> > >   	if (err)
> > >   		goto free_ops;
> > > @@ -871,7 +871,7 @@ struct dma_fence *xe_vma_rebind(struct xe_vm *vm, struct xe_vma *vma, u8 tile_ma
> > >   	if (err)
> > >   		return ERR_PTR(err);
> > > -	err = xe_vma_ops_alloc(&vops);
> > > +	err = xe_vma_ops_alloc(&vops, false);
> > >   	if (err) {
> > >   		fence = ERR_PTR(err);
> > >   		goto free_ops;
> > > @@ -2765,7 +2765,7 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe,
> > >   					   sizeof(struct drm_xe_vm_bind_op),
> > >   					   GFP_KERNEL | __GFP_ACCOUNT);
> > >   		if (!*bind_ops)
> > > -			return -ENOMEM;
> > > +			return args->num_binds > 1 ? -ENOBUFS : -ENOMEM;
> > >   		err = __copy_from_user(*bind_ops, bind_user,
> > >   				       sizeof(struct drm_xe_vm_bind_op) *
> > > @@ -3104,7 +3104,7 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
> > >   		goto unwind_ops;
> > >   	}
> > > -	err = xe_vma_ops_alloc(&vops);
> > > +	err = xe_vma_ops_alloc(&vops, args->num_binds > 1);
> > >   	if (err)
> > >   		goto unwind_ops;

