[Lima] [PATCH v8] drm/lima: driver for ARM Mali4xx GPUs

Tue Mar 12 01:54:57 UTC 2019

On Mon, Mar 11, 2019 at 11:37 PM Rob Herring <robh at kernel.org> wrote:
>
> On Sat, Mar 9, 2019 at 6:21 AM Qiang Yu <yuq825 at gmail.com> wrote:
> >
> > - Mali 4xx GPUs have two kinds of processors, GP and PP. GP is for
> >   OpenGL vertex shader processing and PP is for fragment shader
> >   processing. Each processor has its own MMU, so the processors work
> >   in a virtual address space.
> > - There's only one GP but multiple PPs (max 4 for Mali 400 and 8
> >   for Mali 450) in the same Mali 4xx GPU. All PPs are grouped
> >   together to handle a single fragment shader task, divided by
> >   FB output tiled pixels. The Mali 400 user space driver is
> >   responsible for assigning target tiled pixels to each PP, but Mali
> >   450 has a HW module called DLBU to dynamically balance each
> >   PP's load.
> > - The user space driver allocates a buffer object and maps it into
> >   GPU virtual address space, uploads the command stream and draw
> >   data through a CPU mmap of the buffer object, then submits a task
> >   to GP/PP with a register frame indicating where the command stream
> >   is, plus misc settings.
> > - There's no command stream validation/relocation because each user
> >   process has its own GPU virtual address space. The GP/PP MMU
> >   switches virtual address space between running two tasks from
> >   different user processes. Erroneous or malicious user space code
> >   just gets an MMU fault or GP/PP error IRQ, then the HW/SW is
> >   recovered.
> > - Use GEM+shmem for MM. Currently memory is just allocated and pinned
> >   at gem object creation. GPU vm map of the buffer is also done in
> >   the alloc stage in kernel space. We may delay the memory
> >   allocation and real GPU vm map to the command submission stage in
> >   the future as an improvement.
> > - Use drm_sched for GPU task scheduling. Each OpenGL context should
> >   have a lima context object in the kernel to distinguish tasks
> >   from different users. drm_sched gets tasks from each lima context
> >   in a fair way.
> >
> > Until it is upstreamed, the mesa driver can be found here:
> > https://gitlab.freedesktop.org/lima/mesa
> >
> > v8:
> > - add comments for in_sync
> > - fix missing mutex unlock in ctx free
> >
> > v7:
> > - remove lima_fence_ops with default value
> > - move fence slab create to device probe
> > - check pad ioctl args to be zero
> > - add comments for user/kernel interface
> >
> > v6:
> > - fix comments by checkpatch.pl
> >
> > v5:
> > - export gp/pp version to userspace
> > - rebase on drm-misc-next
> >
> > v4:
> > - use get param interface to get info
> > - separate context create/free ioctl
> > - remove unused max sched task param
> > - update copyright time
> > - use xarray instead of idr
> > - stop using drmP.h
> >
> > v3:
> > - fix comments from kbuild robot
> > - restrict supported arch to tested ones
> >
> > v2:
> > - fix syscall argument check
> > - fix job finish fence leak since kernel 5.0
> > - use drm syncobj to replace native fence
> > - move buffer object GPU va map into kernel
> > - reserve syscall argument space for future info
> > - remove kernel gem modifier
> > - switch TTM back to GEM+shmem MM
> > - use time based io poll
> > - use whole register name
> > - adopt gem reservation obj integration
> > - use drm_timeout_abs_to_jiffies
> >
> > Cc: Eric Anholt <eric at anholt.net>
> > Cc: Rob Herring <robh at kernel.org>
> > Cc: Christian König <ckoenig.leichtzumerken at gmail.com>
> > Cc: Daniel Vetter <daniel at ffwll.ch>
> > Cc: Alex Deucher <alexdeucher at gmail.com>
> > Cc: Sam Ravnborg <sam at ravnborg.org>
> > Cc: Rob Clark <robdclark at gmail.com>
> > Cc: Dave Airlie <airlied at gmail.com>
> > Signed-off-by: Andreas Baierl <ichgeh at imkreisrum.de>
> > Signed-off-by: Erico Nunes <nunes.erico at gmail.com>
> > Signed-off-by: Heiko Stuebner <heiko at sntech.de>
> > Signed-off-by: Marek Vasut <marex at denx.de>
> > Signed-off-by: Neil Armstrong <narmstrong at baylibre.com>
> > Signed-off-by: Simon Shields <simon at lineageos.org>
> > Signed-off-by: Vasily Khoruzhick <anarsoul at gmail.com>
> > Signed-off-by: Qiang Yu <yuq825 at gmail.com>
> > Reviewed-by: Eric Anholt <eric at anholt.net>
> > Reviewed-by: Rob Herring <robh at kernel.org>
> > ---
>
> [...]

I thought I got your Reviewed-by last time. Should I remove it?

>
> > +static int lima_gem_lock_bos(struct lima_bo **bos, u32 nr_bos,
> > +                            struct ww_acquire_ctx *ctx)
> > +{
> > +       int i, ret = 0, contended, slow_locked = -1;
> > +
> > +       ww_acquire_init(ctx, &reservation_ww_class);
> > +
> > +retry:
> > +       for (i = 0; i < nr_bos; i++) {
> > +               if (i == slow_locked) {
> > +                       slow_locked = -1;
> > +                       continue;
> > +               }
> > +
> > +               ret = ww_mutex_lock_interruptible(&bos[i]->gem.resv->lock, ctx);
> > +               if (ret < 0) {
> > +                       contended = i;
> > +                       goto err;
> > +               }
> > +       }
> > +
> > +       ww_acquire_done(ctx);
> > +       return 0;
> > +
> > +err:
> > +       for (i--; i >= 0; i--)
> > +               ww_mutex_unlock(&bos[i]->gem.resv->lock);
> > +
> > +       if (slow_locked >= 0)
> > +               ww_mutex_unlock(&bos[slow_locked]->gem.resv->lock);
> > +
> > +       if (ret == -EDEADLK) {
> > +               /* we lost out in a seqno race, lock and retry.. */
> > +               ret = ww_mutex_lock_slow_interruptible(
> > +                       &bos[contended]->gem.resv->lock, ctx);
> > +               if (!ret) {
> > +                       slow_locked = contended;
> > +                       goto retry;
> > +               }
> > +       }
> > +       ww_acquire_fini(ctx);
> > +
> > +       return ret;
> > +}
> > +
> > +static void lima_gem_unlock_bos(struct lima_bo **bos, u32 nr_bos,
> > +                               struct ww_acquire_ctx *ctx)
> > +{
> > +       int i;
> > +
> > +       for (i = 0; i < nr_bos; i++)
> > +               ww_mutex_unlock(&bos[i]->gem.resv->lock);
> > +       ww_acquire_fini(ctx);
> > +}
>
> Not to make you keep shooting at a moving target, but Eric just posted
> a patch[1] a few days ago that can replace these 2 functions. It would
> be good to use if you respin, but otherwise it can be a follow-on patch.

Thanks for the reminder.
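
For anyone following the thread, the lock-all-with-backoff structure of
the two quoted functions can be sketched in userspace roughly as below.
This is only an analogue under stated assumptions, not the helper from
Eric's patch and not kernel code: it uses pthread mutexes instead of
ww_mutex, so every contended attempt backs off (there is no acquire
context ticket deciding which side yields). The names demo_bo,
demo_lock_bos and demo_unlock_bos are hypothetical, and the array is
assumed to contain no duplicate entries.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a buffer object; not a lima or DRM type. */
struct demo_bo {
	pthread_mutex_t lock;
};

/*
 * Lock every object in bos[0..nr_bos). Returns 0 with all locks held,
 * or a positive errno value with none held. Assumes no duplicates.
 */
static int demo_lock_bos(struct demo_bo **bos, int nr_bos)
{
	int i, ret = 0, contended = 0, slow_locked = -1;

retry:
	for (i = 0; i < nr_bos; i++) {
		if (i == slow_locked) {
			/* Already held from the blocking path below. */
			slow_locked = -1;
			continue;
		}

		ret = pthread_mutex_trylock(&bos[i]->lock);
		if (ret) {
			contended = i;
			goto backoff;
		}
	}
	return 0;

backoff:
	/* Release everything taken in this pass, in reverse order. */
	for (i--; i >= 0; i--)
		pthread_mutex_unlock(&bos[i]->lock);

	/* The slow-path lock is still held if the loop never reached it. */
	if (slow_locked >= 0)
		pthread_mutex_unlock(&bos[slow_locked]->lock);

	if (ret != EBUSY)
		return ret;

	/* Sleep on the contended lock while holding nothing else. */
	ret = pthread_mutex_lock(&bos[contended]->lock);
	if (ret)
		return ret;

	slow_locked = contended;
	goto retry;
}

static void demo_unlock_bos(struct demo_bo **bos, int nr_bos)
{
	int i;

	for (i = 0; i < nr_bos; i++) {
		int ret = pthread_mutex_unlock(&bos[i]->lock);

		/* Unlocking a mutex we hold is not expected to fail. */
		assert(ret == 0);
		(void)ret;
	}
}
```

The point of the pattern is that a thread never sleeps on one lock while
holding others, which is what lets two submitters with overlapping
buffer lists make progress regardless of list order.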

Regards,
Qiang