[PATCH] drm/xe: Return -ENOBUFS if a kmalloc fails which is tied to an array of binds

Ghimiray, Himal Prasad himal.prasad.ghimiray at intel.com
Mon Jul 22 03:53:13 UTC 2024



On 21-07-2024 04:50, Matthew Brost wrote:
> On Sat, Jul 20, 2024 at 11:14:36PM +0000, Matthew Brost wrote:
>> On Sun, Jul 21, 2024 at 12:34:27AM +0530, Ghimiray, Himal Prasad wrote:
>>>
>>>
>>> On 19-07-2024 22:53, Matthew Brost wrote:
>>>> The size of an array of binds is directly tied to several kmalloc in the
>>>> KMD, thus making these kmalloc more likely to fail. Return -ENOBUFS in
>>>> the case of these failures.
>>>>
>>>> The expected UMD behavior upon returning -ENOBUFS is to split an array
>>>> of binds into a series of single binds.
>>>
>>> Would it be appropriate to have some doc/guidelines in the form of drm_err
>>> or kernel doc regarding expected behavior from UMD if the ioctl returns a
>>> -ENOBUFS error ?
>>>
>>
>> Yes, this on the todo list as part of error handling cleanup for both
>> exec and bind IOCTLs. I think kernel doc should go in xe_drm.h with a
>> list of errno returned and expected UMD actions. Eventually I'd like to
>> get this in place for all IOCTLs. I was going to work on getting exec
>> and bind IOCTLs fixed up in the next couple of weeks (we have an
>> internal doc of required changes) to have it ready for when Thomas is
>> back (2 more weeks).

Sure, that sounds great!

>>
>> I made this change as we already have -ENOBUFS implemented in a
>> different failure point for array of binds (BB being to large, see
>> xe_migrate.c) so might as well just finish up this error code to make it
> 
> Also see xe_sa.c, search for -ENOBUFS.
> 
>> complete as it is fairly simple change. Also Mesa has a MR [1] inflight
>> to handle -ENOBUFs situations.

Makes sense.

Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray at intel.com>

>>
>> Matt
>>
>> [1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30276
>>
>>>>
>>>> Cc: Paulo Zanoni <paulo.r.zanoni at intel.com>
>>>> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
>>>> ---
>>>>    drivers/gpu/drm/xe/xe_vm.c | 12 ++++++------
>>>>    1 file changed, 6 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>>>> index 3fde2c8292ad..b715883f40d8 100644
>>>> --- a/drivers/gpu/drm/xe/xe_vm.c
>>>> +++ b/drivers/gpu/drm/xe/xe_vm.c
>>>> @@ -718,7 +718,7 @@ int xe_vm_userptr_check_repin(struct xe_vm *vm)
>>>>    		list_empty_careful(&vm->userptr.invalidated)) ? 0 : -EAGAIN;
>>>>    }
>>>> -static int xe_vma_ops_alloc(struct xe_vma_ops *vops)
>>>> +static int xe_vma_ops_alloc(struct xe_vma_ops *vops, bool array_of_binds)
>>>>    {
>>>>    	int i;
>>>> @@ -731,7 +731,7 @@ static int xe_vma_ops_alloc(struct xe_vma_ops *vops)
>>>>    				      sizeof(*vops->pt_update_ops[i].ops),
>>>>    				      GFP_KERNEL);
>>>>    		if (!vops->pt_update_ops[i].ops)
>>>> -			return -ENOMEM;
>>>> +			return array_of_binds ? -ENOBUFS : -ENOMEM;
>>>>    	}
>>>>    	return 0;
>>>> @@ -824,7 +824,7 @@ int xe_vm_rebind(struct xe_vm *vm, bool rebind_worker)
>>>>    			goto free_ops;
>>>>    	}
>>>> -	err = xe_vma_ops_alloc(&vops);
>>>> +	err = xe_vma_ops_alloc(&vops, false);
>>>>    	if (err)
>>>>    		goto free_ops;
>>>> @@ -871,7 +871,7 @@ struct dma_fence *xe_vma_rebind(struct xe_vm *vm, struct xe_vma *vma, u8 tile_ma
>>>>    	if (err)
>>>>    		return ERR_PTR(err);
>>>> -	err = xe_vma_ops_alloc(&vops);
>>>> +	err = xe_vma_ops_alloc(&vops, false);
>>>>    	if (err) {
>>>>    		fence = ERR_PTR(err);
>>>>    		goto free_ops;
>>>> @@ -2765,7 +2765,7 @@ static int vm_bind_ioctl_check_args(struct xe_device *xe,
>>>>    					   sizeof(struct drm_xe_vm_bind_op),
>>>>    					   GFP_KERNEL | __GFP_ACCOUNT);
>>>>    		if (!*bind_ops)
>>>> -			return -ENOMEM;
>>>> +			return args->num_binds > 1 ? -ENOBUFS : -ENOMEM;
>>>>    		err = __copy_from_user(*bind_ops, bind_user,
>>>>    				       sizeof(struct drm_xe_vm_bind_op) *
>>>> @@ -3104,7 +3104,7 @@ int xe_vm_bind_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>>>>    		goto unwind_ops;
>>>>    	}
>>>> -	err = xe_vma_ops_alloc(&vops);
>>>> +	err = xe_vma_ops_alloc(&vops, args->num_binds > 1);
>>>>    	if (err)
>>>>    		goto unwind_ops;


More information about the Intel-xe mailing list