[PATCH] drm/xe: Fix UBSAN splat in add_preempt_fences()

Lucas De Marchi lucas.demarchi at intel.com
Fri Dec 15 23:04:26 UTC 2023


On Fri, Dec 15, 2023 at 10:55:36PM +0000, Matthew Brost wrote:
>On Fri, Dec 15, 2023 at 04:30:01PM -0500, Rodrigo Vivi wrote:
>> On Fri, Dec 15, 2023 at 01:12:45PM -0800, Matthew Brost wrote:
>> > add_preempt_fences() calls dma_resv_reserve_fences() with num_fences ==
>> > 0 resulting in the below UBSAN splat. Short circuit add_preempt_fences()
>> > if num_fences == 0.
>> >
>> > [   58.652241] ================================================================================
>> > [   58.660736] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13
>> > [   58.667281] shift exponent 64 is too large for 64-bit type 'long unsigned int'
>> > [   58.674539] CPU: 2 PID: 1170 Comm: xe_gpgpu_fill Not tainted 6.6.0-rc3-guc+ #630
>> > [   58.674545] Hardware name: Intel Corporation Tiger Lake Client Platform/TigerLake U DDR4 SODIMM RVP, BIOS TGLSFWI1.R00.3243.A01.2006102133 06/10/2020
>> > [   58.674547] Call Trace:
>> > [   58.674548]  <TASK>
>> > [   58.674550]  dump_stack_lvl+0x92/0xb0
>> > [   58.674555]  __ubsan_handle_shift_out_of_bounds+0x15a/0x300
>> > [   58.674559]  ? rcu_is_watching+0x12/0x60
>> > [   58.674564]  ? software_resume+0x141/0x210
>> > [   58.674575]  ? new_vma+0x44b/0x600 [xe]
>> > [   58.674606]  dma_resv_reserve_fences.cold+0x40/0x66
>> > [   58.674612]  new_vma+0x4b3/0x600 [xe]
>> > [   58.674638]  xe_vm_bind_ioctl+0xffd/0x1e00 [xe]
>> > [   58.674663]  ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
>> > [   58.674680]  drm_ioctl_kernel+0xc1/0x170
>> > [   58.674686]  ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
>> > [   58.674703]  drm_ioctl+0x247/0x4c0
>> > [   58.674709]  ? find_held_lock+0x2b/0x80
>> > [   58.674716]  __x64_sys_ioctl+0x8c/0xb0
>> > [   58.674720]  do_syscall_64+0x3c/0x90
>> > [   58.674723]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
>> > [   58.674727] RIP: 0033:0x7fce4bd1aaff
>> > [   58.674730] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
>> > [   58.674731] RSP: 002b:00007ffc57434050 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>> > [   58.674734] RAX: ffffffffffffffda RBX: 00007ffc574340e0 RCX: 00007fce4bd1aaff
>> > [   58.674736] RDX: 00007ffc574340e0 RSI: 0000000040886445 RDI: 0000000000000003
>> > [   58.674737] RBP: 0000000040886445 R08: 0000000000000002 R09: 00007ffc574341b0
>> > [   58.674739] R10: 000055de43eb3780 R11: 0000000000000246 R12: 00007ffc574340e0
>> > [   58.674740] R13: 0000000000000003 R14: 00007ffc574341b0 R15: 0000000000000001
>> > [   58.674747]  </TASK>
>> > [   58.674748] ================================================================================
>>
>> Can we pin-point which commit introduced this?
>> we need to start using more the 'Fixes:' tags.
>> (Although while we are really not in tree we still have a chance of rebasing and having to adjust
>> the tags once again)
>>
>
>Yes, so add this?
>
>Fixes: 099192666a18 ("drm/xe: Introduce a new DRM driver for Intel GPUs")

Sorry I missed saying why I reproduced this... we are trying to change
the kernel config we use in our CI to align better with what distros
set + a bunch of debugs that are useful for catching bugs.

https://gitlab.freedesktop.org/drm/xe/ci/-/merge_requests/31

I merged that, but then decided to revert since we were at the brink of
switching to drm-next for our first pull request. I didn't want to
create noise thinking new issues were due to the baseline change, so
it's now reverted. But at least we now have some logs with additional
issues we can try to squash.

Being 099192666a18, I don't think we really need it.

thanks
Lucas De Marchi

>
>> >
>> > Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>> > Signed-off-by: Matthew Brost <matthew.brost at intel.com>
>> > ---
>> >  drivers/gpu/drm/xe/xe_vm.c | 3 +++
>> >  1 file changed, 3 insertions(+)
>> >
>> > diff --git a/drivers/gpu/drm/xe/xe_vm.c b/drivers/gpu/drm/xe/xe_vm.c
>> > index 322c1eccecca..6406370b2625 100644
>> > --- a/drivers/gpu/drm/xe/xe_vm.c
>> > +++ b/drivers/gpu/drm/xe/xe_vm.c
>> > @@ -283,6 +283,9 @@ static int add_preempt_fences(struct xe_vm *vm, struct xe_bo *bo)
>> >  	if (err)
>> >  		return err;
>> >
>> > +	if (!vm->preempt.num_exec_queues)
>> > +		goto out_unlock;
>> > +
>> >  	err = dma_resv_reserve_fences(bo->ttm.base.resv, vm->preempt.num_exec_queues);
>> >  	if (err)
>> >  		goto out_unlock;
>> > --
>> > 2.34.1
>> >


More information about the Intel-xe mailing list