[PATCH v3 04/10] drm/msm/A6xx: Implement preemption for A7XX targets

Mon Sep 9 14:40:07 UTC 2024

On Mon, Sep 9, 2024 at 6:43 AM Connor Abbott <cwabbott0 at gmail.com> wrote:
>
> On Mon, Sep 9, 2024 at 2:15 PM Antonino Maniscalco
> <antomani103 at gmail.com> wrote:
> >
> > On 9/6/24 9:54 PM, Akhil P Oommen wrote:
> > > On Thu, Sep 05, 2024 at 04:51:22PM +0200, Antonino Maniscalco wrote:
> > >> This patch implements preemption feature for A6xx targets, this allows
> > >> the GPU to switch to a higher priority ringbuffer if one is ready. A6XX
> > >> hardware as such supports multiple levels of preemption granularities,
> > >> ranging from coarse grained(ringbuffer level) to a more fine grained
> > >> such as draw-call level or a bin boundary level preemption. This patch
> > >> enables the basic preemption level, with more fine grained preemption
> > >> support to follow.
> > >>
> > >> Signed-off-by: Sharat Masetty <smasetty at codeaurora.org>
> > >> Signed-off-by: Antonino Maniscalco <antomani103 at gmail.com>
> > >> Tested-by: Neil Armstrong <neil.armstrong at linaro.org> # on SM8650-QRD
> > >> ---
> > >>   drivers/gpu/drm/msm/Makefile              |   1 +
> > >>   drivers/gpu/drm/msm/adreno/a6xx_gpu.c     | 293 +++++++++++++++++++++-
> > >>   drivers/gpu/drm/msm/adreno/a6xx_gpu.h     | 161 ++++++++++++
> > ...
> > >
> > > we can use the lighter smp variant here.
> > >
> > >> +
> > >> +            if (a6xx_gpu->cur_ring == ring)
> > >> +                    gpu_write(gpu, REG_A6XX_CP_RB_WPTR, wptr);
> > >> +            else
> > >> +                    ring->skip_inline_wptr = true;
> > >> +    } else {
> > >> +            ring->skip_inline_wptr = true;
> > >> +    }
> > >> +
> > >> +    spin_unlock_irqrestore(&ring->preempt_lock, flags);
> > >>   }
> > >>
> > >>   static void get_stats_counter(struct msm_ringbuffer *ring, u32 counter,
> > >> @@ -138,12 +231,14 @@ static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu,
> > >
> > > set_pagetable checks "cur_ctx_seqno" to see if pt switch is needed or
> > > not. This is currently not tracked separately for each ring. Can you
> > > please check that?
> >
> > I totally missed that. Thanks for catching it!
> >
> > >
> > > I wonder why that didn't cause any gpu errors in testing. Not sure if I
> > > am missing something.
> > >
> >
> > I think this is because, so long as a single context doesn't submit to
> > two different rings with differenr priorities, we will only be incorrect
> > in the sense that we emit more page table switches than necessary and
> > never less. However untrusted userspace could create a context that
> > submits to two different rings and that would lead to execution in the
> > wrong context so we must fix this.
>
> FWIW, in Mesa in the future we may want to expose multiple Vulkan
> queues per device. Then this would definitely blow up.

This will actually be required by future android versions, with the
switch to vk hwui backend (because apparently locking is hard, the
solution was to use different queue's for different threads)

https://gitlab.freedesktop.org/mesa/mesa/-/issues/11326

BR,
-R