[PATCH] drm/xe: Make sure scheduler is ready before submission

Upadhyay, Tejas tejas.upadhyay at intel.com
Sat Feb 22 07:17:51 UTC 2025



> -----Original Message-----
> From: Brost, Matthew <matthew.brost at intel.com>
> Sent: Friday, February 21, 2025 10:54 PM
> To: Upadhyay, Tejas <tejas.upadhyay at intel.com>
> Cc: intel-xe at lists.freedesktop.org; Ghimiray, Himal Prasad
> <himal.prasad.ghimiray at intel.com>; Nilawar, Badal
> <badal.nilawar at intel.com>
> Subject: Re: [PATCH] drm/xe: Make sure scheduler is ready before submission
> 
> On Thu, Feb 20, 2025 at 06:08:59PM +0530, Tejas Upadhyay wrote:
> > Async call to __guc_exec_queue_fini_async frees scheduler while some
> > scheduler submission would have already started.
> > To handle such small window race case, submission should only start
> > when scheduler is ready.
> >
> > It will help to solve below which is not easily reproducible,
> > https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4223
> >
> 
> I don't think this help. We can only enter xe_sched_submission_start if exec
> queue is in &guc->submission_state.exec_queue_lookup and remove the exec
> queue under a mutex in release_guc_id before we call xe_sched_fini which flips
> the ready state to false. We then immediately free the scheduler - we'd be
> getting a UAF in xe_sched_submission_start anyways.
> 
> The gitlab issue looks like a memory corruption. Basically it is saying a variable
> which is a bool is reading 107 which shouldn't be possible.

The crash happens while accessing the drm_sched_rq *rq; see below:

static struct drm_sched_entity *
drm_sched_rq_select_entity_fifo(struct drm_gpu_scheduler *sched,
                                struct drm_sched_rq *rq)
{
        struct rb_node *rb;

        spin_lock(&rq->lock); --> here it crashes

The async free of sched_rq happens in:

void xe_sched_fini(struct xe_gpu_scheduler *sched)
{
        xe_sched_submission_stop(sched);
        drm_sched_fini(&sched->base); --> here it frees sched_rq and marks sched->ready = false
}
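
To spell out the window I suspect (this is my reading of the KASAN report plus the fini path above, not something I have verified step by step):

    drm_sched_run_job_work()              __guc_exec_queue_fini_async()
    ------------------------              -----------------------------
    drm_sched_rq_select_entity_fifo()
                                          xe_sched_fini()
                                            drm_sched_fini()  --> frees sched_rq
    spin_lock(&rq->lock)                  <-- rq already freed, KASAN UAF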

It looks like we are hitting this issue in between drm_sched_fini and xe_exec_queue_fini, which might be avoided with the check proposed in this patch. I saw a similar check being done by other drivers; see below from drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:
                        if (amdgpu_ring_sched_ready(ring))
                                drm_sched_start(&ring->sched, 0);
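
For reference, the drm helper used in the hunk below, drm_sched_wqueue_ready(), does essentially the same thing as that amdgpu check: it just reads the ready flag which drm_sched_fini() clears. Roughly (from the scheduler core, please double-check against the tree):

bool drm_sched_wqueue_ready(struct drm_gpu_scheduler *sched)
{
        return sched->ready;
}

So the check proposed below keys off the same flag that xe_sched_fini()/drm_sched_fini() flips to false.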


Dump stack details:
[ 1481.378550] ==================================================================
[ 1481.385791] BUG: KASAN: slab-use-after-free in __lock_acquire (kernel/locking/lockdep.c:5089)
[ 1481.392598] Read of size 8 at addr ffff88812352a7a0 by task kworker/u128:11/2827

[ 1481.401506] CPU: 17 UID: 0 PID: 2827 Comm: kworker/u128:11 Not tainted 6.13.0xe-ptl+ #1
[ 1481.409475] Hardware name: Intel Corporation Raptor Lake Client Platform/RPL-S ADP-S DDR5 UDIMM CRB, BIOS RPLSFWI1.R00.5045.A00.2401260733 01/26/2024
[ 1481.422824] Workqueue:  drm_sched_run_job_work [gpu_sched]
[ 1481.428348] Call Trace:
[ 1481.430803]  <TASK>
[ 1481.432913] dump_stack_lvl (lib/dump_stack.c:123)
[ 1481.436589] print_report (mm/kasan/report.c:379 mm/kasan/report.c:489)
[ 1481.440206] ? __virt_addr_valid (./arch/x86/include/asm/preempt.h:94 ./include/linux/rcupdate.h:964 ./include/linux/mmzone.h:2058 arch/x86/mm/physaddr.c:65)
[ 1481.444511] ? __lock_acquire (kernel/locking/lockdep.c:5089)
[ 1481.448527] kasan_report (mm/kasan/report.c:604)
[ 1481.452141] ? __lock_acquire (kernel/locking/lockdep.c:5089)
[ 1481.456181] __lock_acquire (kernel/locking/lockdep.c:5089)
[ 1481.460030] ? check_chain_key (kernel/locking/lockdep.c:3967)
[ 1481.464137] ? __lock_acquire (kernel/locking/lockdep.c:5240)
[ 1481.468240] ? __pfx___lock_acquire (kernel/locking/lockdep.c:5077)
[ 1481.472604] lock_acquire (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5851 kernel/locking/lockdep.c:5814)
[ 1481.476331] ? drm_sched_run_job_work (drivers/gpu/drm/scheduler/sched_main.c:344 drivers/gpu/drm/scheduler/sched_main.c:1068 drivers/gpu/drm/scheduler/sched_main.c:1199) gpu_sched
[ 1481.482099] ? __pfx_lock_acquire (kernel/locking/lockdep.c:5817)
[ 1481.486292] ? lock_acquire (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5851 kernel/locking/lockdep.c:5814)
[ 1481.490143] ? process_one_work (kernel/workqueue.c:3212)
[ 1481.494360] _raw_spin_lock (./include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
[ 1481.498061] ? drm_sched_run_job_work (drivers/gpu/drm/scheduler/sched_main.c:344 drivers/gpu/drm/scheduler/sched_main.c:1068 drivers/gpu/drm/scheduler/sched_main.c:1199) gpu_sched
[ 1481.503825] drm_sched_run_job_work (drivers/gpu/drm/scheduler/sched_main.c:344 drivers/gpu/drm/scheduler/sched_main.c:1068 drivers/gpu/drm/scheduler/sched_main.c:1199) gpu_sched
[ 1481.509428] process_one_work (kernel/workqueue.c:3236)
[ 1481.513477] ? __pfx_lock_acquire (kernel/locking/lockdep.c:5817)
[ 1481.517693] ? __pfx_process_one_work (kernel/workqueue.c:3138)
[ 1481.522275] ? __list_add_valid_or_report (lib/list_debug.c:32)
[ 1481.527155] worker_thread (kernel/workqueue.c:3311 kernel/workqueue.c:3398)
[ 1481.530966] ? __pfx_worker_thread (kernel/workqueue.c:3344)
[ 1481.535243] kthread (kernel/kthread.c:389)
[ 1481.538510] ? kthread (kernel/kthread.c:374)
[ 1481.541824] ? __pfx_kthread (kernel/kthread.c:342)
[ 1481.545569] ret_from_fork (arch/x86/kernel/process.c:147)
[ 1481.549203] ? __pfx_kthread (kernel/kthread.c:342)
[ 1481.552986] ret_from_fork_asm (arch/x86/entry/entry_64.S:257)
[ 1481.556918]  </TASK>

[ 1481.560615] Allocated by task 8713:
[ 1481.564099] kasan_save_stack (mm/kasan/common.c:48)
[ 1481.564104] kasan_save_track (./arch/x86/include/asm/current.h:49 mm/kasan/common.c:60 mm/kasan/common.c:69)
[ 1481.564108] __kasan_kmalloc (mm/kasan/common.c:398)
[ 1481.564112] drm_sched_init (drivers/gpu/drm/scheduler/sched_main.c:1314) gpu_sched
[ 1481.564121] guc_exec_queue_init (drivers/gpu/drm/xe/xe_guc_submit.c:1445 (discriminator 6)) xe
[ 1481.564349] xe_exec_queue_create (drivers/gpu/drm/xe/xe_exec_queue.c:152 drivers/gpu/drm/xe/xe_exec_queue.c:183) xe
[ 1481.564563] xe_exec_queue_create_ioctl (drivers/gpu/drm/xe/xe_exec_queue.c:687) xe
[ 1481.564777] drm_ioctl_kernel (drivers/gpu/drm/drm_ioctl.c:796) drm
[ 1481.564859] drm_ioctl (drivers/gpu/drm/drm_ioctl.c:894) drm
[ 1481.564937] xe_drm_ioctl (drivers/gpu/drm/xe/xe_device.c:210) xe
[ 1481.565149] __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:906 fs/ioctl.c:892 fs/ioctl.c:892)
[ 1481.565154] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
[ 1481.565159] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

Tejas
> 
> Matt
> 
> > Signed-off-by: Tejas Upadhyay <tejas.upadhyay at intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_gpu_scheduler.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gpu_scheduler.c b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > index 50361b4638f9..42ca3c4a299a 100644
> > --- a/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > +++ b/drivers/gpu/drm/xe/xe_gpu_scheduler.c
> > @@ -80,6 +80,14 @@ void xe_sched_fini(struct xe_gpu_scheduler *sched)
> >
> >  void xe_sched_submission_start(struct xe_gpu_scheduler *sched)
> >  {
> > +	/* Async call to __guc_exec_queue_fini_async frees scheduler
> > +	 * while some scheduler submission would have already started.
> > +	 * To handle such small window race case, submission should only
> > +	 * start when scheduler is ready.
> > +	 */
> > +	if (!drm_sched_wqueue_ready(&sched->base))
> > +		return;
> > +
> >  	drm_sched_wqueue_start(&sched->base);
> >  	queue_work(sched->base.submit_wq, &sched->work_process_msg);
> >  }
> > --
> > 2.34.1
> >

