[PATCH] drm/xe: Fix UBSAN shift-out-of-bounds failure

Christian König christian.koenig at amd.com
Tue May 7 13:23:10 UTC 2024


On 07.05.24 at 15:18, Lucas De Marchi wrote:
>
> +Thomas, +Christian, +dri-devel
>
> On Tue, May 07, 2024 at 11:42:46AM GMT, Nirmoy Das wrote:
>>
>> On 5/7/2024 11:39 AM, Nirmoy Das wrote:
>>>
>>>
>>> On 5/7/2024 10:04 AM, Shuicheng Lin wrote:
>>>> Here is the failure stack:
>>>> [   12.988209] ------------[ cut here ]------------
>>>> [   12.988216] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
>>>> [   12.988232] shift exponent 64 is too large for 64-bit type 'long unsigned int'
>>>> [   12.988235] CPU: 4 PID: 1310 Comm: gnome-shell Tainted: G     U             6.9.0-rc6+prerelease1158+ #19
>>>> [   12.988237] Hardware name: Intel Corporation Raptor Lake Client Platform/RPL-S ADP-S DDR5 UDIMM CRB, BIOS RPLSFWI1.R00.3301.A02.2208050712 08/05/2022
>>>> [   12.988239] Call Trace:
>>>> [   12.988240]  <TASK>
>>>> [   12.988242]  dump_stack_lvl+0xd7/0xf0
>>>> [   12.988248]  dump_stack+0x10/0x20
>>>> [   12.988250]  ubsan_epilogue+0x9/0x40
>>>> [   12.988253] __ubsan_handle_shift_out_of_bounds+0x10e/0x170
>>>> [   12.988260]  dma_resv_reserve_fences.cold+0x2b/0x48
>>>> [   12.988262]  ? ww_mutex_lock_interruptible+0x3c/0x110
>>>> [   12.988267]  drm_exec_prepare_obj+0x45/0x60 [drm_exec]
>>>> [   12.988271]  ? vm_bind_ioctl_ops_execute+0x5b/0x740 [xe]
>>>> [   12.988345]  vm_bind_ioctl_ops_execute+0x78/0x740 [xe]
>>>>
>>>> It is caused by passing 0 as the num_fences parameter of
>>>> drm_exec_prepare_obj(). In __roundup_pow_of_two(), "0 - 1" then
>>>> yields a shift exponent of 64, which triggers the shift-out-of-bounds.
>>>> num_fences should be at least 1.
>>>>
>>>> Cc: Matthew Brost <matthew.brost at intel.com>
>>>> Signed-off-by: Shuicheng Lin <shuicheng.lin at intel.com>
>>>> ---
>>>>  drivers/gpu/drm/xe/xe_vm.c | 4 ++--
>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>>>> index d17192c8b7de..96cb4d9762a3 100644
>>>> --- a/drivers/gpu/drm/xe/xe_vm.c
>>>> +++ b/drivers/gpu/drm/xe/xe_vm.c
>>>> @@ -2692,7 +2692,7 @@ static int vma_lock_and_validate(struct drm_exec *exec, struct xe_vma *vma,
>>>>      if (bo) {
>>>>          if (!bo->vm)
>>>> -            err = drm_exec_prepare_obj(exec, &bo->ttm.base, 0);
>>>> +            err = drm_exec_prepare_obj(exec, &bo->ttm.base, 1);
>>>
>>> This needs to be fixed in drm_exec_prepare_obj() by checking
>>> num_fences and skipping dma_resv_reserve_fences() when it is 0.
>>>
>> or just call drm_exec_lock_obj() here. ref: 
>> https://patchwork.freedesktop.org/patch/577487/
>
> We have been hit by this again. Couldn't we change drm_exec_prepare_obj()
> to check num_fences and, if it is 0, fall back to just calling
> drm_exec_lock_obj() as "the least amount of work needed in this case"?

No, and that reminds me (again!) that I wanted to add a WARN_ON for this.

If you don't need a fence slot in the first place then you should only 
use drm_exec_lock_obj() instead of drm_exec_prepare_obj().
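
As a rough userspace sketch of that rule: the struct and the model_* helpers below are hypothetical stand-ins, not the drm_exec API, and rejecting a zero count with -EINVAL is an assumption of the sketch rather than current kernel behavior. The caller picks the entry point depending on whether it will actually add a fence:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a GEM object and its reservation state. */
struct model_obj {
	bool locked;
	unsigned int reserved_slots;
};

/* Models drm_exec_lock_obj(): take the lock, reserve nothing. */
static int model_lock_obj(struct model_obj *obj)
{
	obj->locked = true;
	return 0;
}

/*
 * Models drm_exec_prepare_obj(): lock, then reserve fence slots.
 * Assumption of this sketch: a zero count is treated as a caller bug
 * and rejected before anything is locked.
 */
static int model_prepare_obj(struct model_obj *obj, unsigned int num_fences)
{
	int ret;

	if (num_fences == 0)
		return -EINVAL;

	ret = model_lock_obj(obj);
	if (ret)
		return ret;

	obj->reserved_slots += num_fences;
	return 0;
}

/* Caller-side choice: lock only when no fence will be added. */
static int model_lock_and_maybe_reserve(struct model_obj *obj,
					bool will_add_fence)
{
	if (!will_add_fence)
		return model_lock_obj(obj);

	return model_prepare_obj(obj, 1);
}
```

The point of keeping the two paths separate is that a computed count of zero surfaces as an error at the call site instead of being silently accepted.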

If you dynamically calculate the number of fence slots needed and end up 
with zero then there is most likely something wrong with your calculation.

That was done intentionally, because we ended up with quite a few bugs
around this.
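
To make the reported failure concrete, here is a minimal userspace sketch of the arithmetic. It assumes the round-up helper has the shape 1UL << fls_long(n - 1), as the "0 - 1" and the "shift exponent 64" in the report suggest. The *_model names are hypothetical and exist only for illustration; this is not the kernel code.

```c
#include <assert.h>
#include <limits.h>

/*
 * Userspace stand-in for fls_long(): 1-based index of the highest
 * set bit, or 0 when no bit is set.
 */
static unsigned int fls_long_model(unsigned long n)
{
	unsigned int bits = 0;

	while (n) {
		bits++;
		n >>= 1;
	}
	return bits;
}

/* Shift exponent a "1UL << fls_long(n - 1)" round-up would use. */
static unsigned int roundup_shift_exponent(unsigned long n)
{
	/* For n == 0 the subtraction wraps around to ULONG_MAX. */
	return fls_long_model(n - 1);
}

/* A shift is only defined for exponents below the type width. */
static int shift_is_defined(unsigned int exponent)
{
	return exponent < sizeof(unsigned long) * CHAR_BIT;
}

/* Round up to a power of two; only valid when the shift is defined. */
static unsigned long roundup_pow_of_two_model(unsigned long n)
{
	unsigned int exponent = roundup_shift_exponent(n);

	/* n == 0 would trip this instead of executing an undefined shift. */
	assert(shift_is_defined(exponent));
	return 1UL << exponent;
}
```

With n == 0 the computed exponent equals the full width of unsigned long (64 on a 64-bit build), which is exactly the out-of-range shift UBSAN complains about; any n >= 1 that fits below the top bit rounds up normally.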

Regards,
Christian.

>
> Something like this:
>
> | diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c
> | index 2da094bdf8a4..68b5f6210b09 100644
> | --- a/drivers/gpu/drm/drm_exec.c
> | +++ b/drivers/gpu/drm/drm_exec.c
> | @@ -296,10 +296,12 @@ int drm_exec_prepare_obj(struct drm_exec *exec, struct drm_gem_object *obj,
> |      if (ret)
> |          return ret;
> |
> | -    ret = dma_resv_reserve_fences(obj->resv, num_fences);
> | -    if (ret) {
> | -        drm_exec_unlock_obj(exec, obj);
> | -        return ret;
> | +    if (num_fences) {
> | +        ret = dma_resv_reserve_fences(obj->resv, num_fences);
> | +        if (ret) {
> | +            drm_exec_unlock_obj(exec, obj);
> | +            return ret;
> | +        }
> |      }
> |
> |      return 0;
>
> thanks
> Lucas De Marchi
>
>>
>> Nirmoy
>>
>>>
>>> Regards,
>>>
>>> Nirmoy
>>>
>>>>          if (!err && validate)
>>>>              err = xe_bo_validate(bo, xe_vma_vm(vma), true);
>>>>      }
>>>> @@ -2777,7 +2777,7 @@ static int vm_bind_ioctl_ops_lock_and_prep(struct drm_exec *exec,
>>>>      struct xe_vma_op *op;
>>>>      int err;
>>>> -    err = drm_exec_prepare_obj(exec, xe_vm_obj(vm), 0);
>>>> +    err = drm_exec_prepare_obj(exec, xe_vm_obj(vm), 1);
>>>>      if (err)
>>>>          return err;


