[Intel-gfx] [PATCH] dma-buf: Enhance dma-fence tracing

Daniel Vetter daniel at ffwll.ch
Tue Jan 22 09:11:53 UTC 2019


On Tue, Jan 22, 2019 at 10:06 AM Chris Wilson <chris at chris-wilson.co.uk> wrote:
>
> Quoting Koenig, Christian (2019-01-22 08:49:30)
> > Am 22.01.19 um 00:20 schrieb Chris Wilson:
> > > Rather than every backend and GPU driver reinventing the same wheel for
> > > user level debugging of HW execution, the common dma-fence framework
> > > should include the tracing infrastructure required for most client API
> > > level flow visualisation.
> > >
> > > With these common dma-fence level tracepoints, the userspace tools can
> > > establish a detailed view of the client <-> HW flow across different
> > > kernels. There is a strong ask to have this available, so that
> > > userspace developers can effectively assess whether they're doing a
> > > good job of feeding the beast that is the GPU hardware.
> > >
> > > In the case of needing to look into more fine-grained details of how
> > > the kernel internals work towards the goal of feeding the beast, the
> > > tools may optionally augment the dma-fence tracing information with
> > > driver-specific tracepoints. In such cases, however, the tools should
> > > degrade gracefully if the expected extra tracepoints have changed or
> > > their format differs from what is expected, as the kernel
> > > implementation internals are not expected to stay the same.
> > >
> > > It is important to distinguish between tracing for the purpose of client
> > > flow visualisation and tracing for the purpose of low-level kernel
> > > debugging. The latter is highly implementation specific, tied to
> > > a particular HW and driver, whereas the former addresses a common goal
> > > of user level tracing and likely a common set of userspace tools.
> > > Having made the distinction that these tracepoints will be consumed by
> > > client API tooling, we raise the spectre of tracepoint ABI stability.
> > > It is hoped that by defining a common set of dma-fence tracepoints, we
> > > avoid the pitfall of exposing low-level details and instead restrict
> > > ourselves to the high-level flow that is applicable to all drivers and
> > > hardware. Hence the guarantee that this set of tracepoints will be
> > > stable (with the emphasis on depicting the client <-> HW flow as
> > > opposed to driver <-> HW).
> > >
> > > In terms of specific changes to the dma-fence tracing, we stop
> > > emitting the strings for every tracepoint (reserving them for
> > > dma_fence_init for the cases with unique dma_fence_ops, and preferring
> > > descriptors for the whole fence context). Strings do not pack as well
> > > into the ftrace ringbuffer, and we would prefer to reduce the number
> > > of indirect callbacks required for frequent tracepoint emission.
> > >
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> > > Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > > Cc: Alex Deucher <alexdeucher at gmail.com>
> > > Cc: "Christian König" <christian.koenig at amd.com>
> > > Cc: Eric Anholt <eric at anholt.net>
> > > Cc: Pierre-Loup Griffais <pgriffais at valvesoftware.com>
> > > Cc: Michael Sartain <mikesart at fastmail.com>
> > > Cc: Steven Rostedt <rostedt at goodmis.org>
> >
> > In general yes please! If possible please separate out the changes to
> > the common dma_fence infrastructure from the i915 changes.
>
> Sure, I was just stressing the impact: remove some randomly placed
> internal debugging tracepoints, try to define useful ones instead :)
>
> On the list of things to do was converting at least two other drivers
> (I was thinking nouveau/msm for simplicity, vc4 as a gentler
> introduction to drm_sched than amdgpu) to be sure we have the right
> tracepoints.
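
(To illustrate the "strings do not pack" point quoted above: a compact
per-fence event could record just the integer context/seqno pair from
struct dma_fence, leaving the driver/timeline name strings to a one-off
descriptor event such as dma_fence_init. The event name and layout below
are made up for illustration, not taken from the patch.)

/* would live in a trace header, include/trace/events/dma_fence.h style */
#include <linux/dma-fence.h>
#include <linux/tracepoint.h>

TRACE_EVENT(dma_fence_signaled_compact,
        TP_PROTO(struct dma_fence *fence),
        TP_ARGS(fence),

        /* fixed-size integers pack tightly into the ftrace ring buffer */
        TP_STRUCT__entry(
                __field(u64, context)
                __field(u64, seqno)
        ),

        /* no ->get_driver_name()/->get_timeline_name() indirection here */
        TP_fast_assign(
                __entry->context = fence->context;
                __entry->seqno = fence->seqno;
        ),

        TP_printk("context=%llu seqno=%llu",
                  (unsigned long long)__entry->context,
                  (unsigned long long)__entry->seqno)
);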

I think sprinkling these over the scheduler (maybe just as an opt-in,
for the case where the driver doesn't have some additional queueing
somewhere) would be good. I haven't checked whether it fits, but it
would give you a bunch of drivers at once. It might also not cover all
the cases (I guess the wait-related ones would need to be somewhere
else).
-Daniel

> > One thing I'm wondering is why the enable_signaling tracepoint doesn't
> > need to be exported any more. Is that only used internally in the common
> > infrastructure?
>
> Right. Only used inside the core, and I don't see much call for making
> it easy for drivers to fiddle around bypassing the core
> enable_signaling/signal. (I'm not sure it's useful for client flow
> either; it feels more like dma-fence debugging, but the tools can just
> not listen to that tracepoint.)
> -Chris
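
(For reference on the above: the enable-signaling tracepoint only fires
from the dma-fence core itself, right before the driver's
->enable_signaling() hook is called, so there is nothing for drivers to
export or call. A simplified paraphrase of that core flow, not the exact
upstream code:

#include <linux/dma-fence.h>
#include <trace/events/dma_fence.h>

/* Caller holds fence->lock, as in the real core paths. */
static bool core_enable_signaling_sketch(struct dma_fence *fence)
{
        if (test_and_set_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
                             &fence->flags))
                return true;    /* signaling already enabled */

        /* emitted by the core only; drivers never call this directly */
        trace_dma_fence_enable_signal(fence);

        if (fence->ops->enable_signaling &&
            !fence->ops->enable_signaling(fence)) {
                dma_fence_signal_locked(fence);
                return false;   /* the fence signalled immediately */
        }

        return true;
}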



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

