[Mesa-dev] [PATCH 8/8] panfrost: Add backend targeting the DRM driver

Tue Mar 5 02:08:24 UTC 2019

On Mon, Mar 4, 2019 at 4:43 PM Alyssa Rosenzweig <alyssa at rosenzweig.io> wrote:
>
> Wooo!!!!!!!
>
> Seeing this patch in my inbox has been some of the best news all day!
>
> Without further ado, my (solicited?) comments:
>
> > +/* Copyright 2018, Linaro, Ltd., Rob Herring <robh at kernel.org> */
>
> Please triple check upstream that this license is okay; the other files
> in include/drm-uapi are BSDish.
>
> > +/* timeouts are specified in clock-monotonic absolute times (to simplify
> > + * restarting interrupted ioctls).  The following struct is logically the
> > + * same as 'struct timespec' but 32/64b ABI safe.
> > + */
> > +struct drm_panfrost_timespec {
> > +     __s64 tv_sec;          /* seconds */
> > +     __s64 tv_nsec;         /* nanoseconds */
> > +};
>
> What's this for -- is there not a shared struct for this if it's
> necessary? (I'm assuming this was copied from etna..?)
>
> > +     __u32 flags;          /* in, mask of ETNA_BO_x */
>
> As Rob said, s/ETNA/PAN/g
>
> > +     struct drm_panfrost_timespec timeout;   /* in */
>
> (Presumably we just want a timeout in one of nanoseconds / milliseconds
> / second, not the full timespec... 64-bit time gives a pretty wide range
> even for high-precision timeouts, which I don't know that we need. Also,
> it's signed for some reason...?)
>
> > +     struct drm_panfrost_gem_submit_atom_dep deps[2];
>
> I'm concerned about hardcoding deps to 2 here. I know the Arm driver
> does it, but I'm super uncomfortable by them doing it too. Keep in mind,
> when they have complex dependencies they insert dummy "dependency-only"
> jobs into the graph, though I concede I haven't seen this yet.

Why aren't we using regular dma-buf fences here? The submit ioctl
should be able to take a number of in fences to wait on and return an
out fence if requested. See I915_EXEC_FENCE_OUT and
I915_EXEC_FENCE_ARRAY in
https://cgit.freedesktop.org/~airlied/linux/tree/include/uapi/drm/i915_drm.h?h=drm-next
or MSM_SUBMIT_FENCE_FD_IN/OUT in
https://cgit.freedesktop.org/~airlied/linux/tree/include/uapi/drm/msm_drm.h?h=drm-next

> I'm not at all sure what the right number is, especially since the new
> kernel interface seems to support waiting on BOs? Or am I
> misinterpreting this?
>
> Regardless, this will start to matter when we implement support for more
> complex FBOs (where various FBOs samples from various other FBOs), which
> I think can have arbitrarily complicated dependency graphs. So
> hardcoding to [2] is inappropriate if we want to use deps for waiting on
> FBOs. On the other hand, if we use a kernel-side BO wait or fences or
> something to resolve dependencies, do we even need two deps, or just
> one?
>
> > +     // TODO cache allocations
> > +     // TODO properly handle errors
> > +     // TODO take into account extra_flags
>
> Not a blocker at all for merge, but these should be taken care of before
> we drop support for the non-DRM stuff (since at least the
> lack of GROWABLE support represents a regression compared to the Arm
> driver).
>
> > +     // TODO map and unmap on demand?
>
> I don't know if this matters too much...? Usually if we're allocating a
> slab, that means we want it mapped immediately, unless we set the
> INVISIBLE flag which means we odn't want it mapped at all.
>
> Maybe we'll have fancier scenarios in the future, but at the moment I
> think we can safely cross off at least the first half of the TODO.
>
> As you know I'm not a believer in unmapping/freeing resources ever, so I
> don't get an opinion there ;)
>
> > +     gem_info.handle = gem_new.handle;
> > +     ret = drmIoctl(drm->fd, DRM_IOCTL_PANFROST_GEM_INFO, &gem_info);
> > +     if (ret) {
> > +                fprintf(stderr, "DRM_IOCTL_PANFROST_GEM_INFO failed: %d\n", ret);
> > +             assert(0);
> > +     }
>
> Maybe a question for Rob, but why isn't this info passed along with the
> allocate in the first place? I guess the extra roundtrip to the kernel
> isn't a huge deal, but it seems a little odd...?
>
> > +static uint8_t
> > +allocate_atom()
> > +{
> > +     /* Use to allocate atom numbers for jobs. We probably want to overhaul this in kernel space at some point. */
> > +     static uint8_t atom_counter = 0;
> > +
> > +        atom_counter++;
> > +
> > +        /* Workaround quirk where atoms must be strictly positive */
> > +
> > +        if (atom_counter == 0)
> > +                atom_counter++;
> > +
> > +        return atom_counter;
> > +}
>
> So, this routine (which is copied straight from the non-DRM code) is
> specifically to workaround a quirk in the Arm driver where atom numbers
> must be non-zero u8's. I'm skeptical the DRM interface needs this.
>
> > +             idx++;
>
> Nice one, no idea why I didn't think of doing it this way :)
>
> > +panfrost_fence_create(struct panfrost_context *ctx)
>
> I'd appreciate a link to documentation about mainline fences, since I
> have no idea what's happening here :)
>
> > +static void
> > +panfrost_drm_enable_counters(struct panfrost_screen *screen)
> > +{
> > +     fprintf(stderr, "unimplemented: %s\n", __func__);
> > +}
>
> If nobody else is taking it, would anyone mind if I play with adding
> performance counters to the kernel? I want to at least dip my feet into
> the other side of the stack, and that seems like a nice low-hanging
> fruit (especially since I actually grok the performance counters, more
> or less) :)
>
> > +        return drmSyncobjCreate(drm->fd, DRM_SYNCOBJ_CREATE_SIGNALED,
>
> (See fence question)
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev