[PATCH v2] drm/msm: add trace_dma_fence_emit to msm_gpu_submit
Chia-I Wu
olvaffe at gmail.com
Wed Apr 27 17:55:48 UTC 2022
On Wed, Apr 27, 2022 at 9:07 AM Rob Clark <robdclark at chromium.org> wrote:
>
> On Tue, Apr 26, 2022 at 11:20 PM Christian König
> <christian.koenig at amd.com> wrote:
> >
> > Am 26.04.22 um 20:50 schrieb Chia-I Wu:
> > > On Tue, Apr 26, 2022 at 11:02 AM Christian König
> > > <christian.koenig at amd.com> wrote:
> > >> Am 26.04.22 um 19:40 schrieb Chia-I Wu:
> > >>> [SNIP]
> > >>>>>> Well I just send a patch to completely remove the trace point.
> > >>>>>>
> > >>>>>> As I said it absolutely doesn't make sense to use this for
> > >>>>>> visualization, that's what the trace_dma_fence_init trace point is good for.
> > >>> I am a bit confused by this. _emit and _signaled are a great way to
> > >>> see how many fences are pending from cpu's point of view. How does
> > >>> _emit make no sense and _init is good instead?
> > >> We had exactly that confusion now multiple times and it's one of the
> > >> main reasons why I want to remove the _emit trace point.
> > >>
> > >> See the when you want to know how many fences are pending you need to
> > >> watch out for init/destroy and *NOT* emit.
> > >>
> > >> The reason is that in the special case where emit makes sense (e.g. the
> > >> GPU scheduler fences) emit comes later than init, but pending on the CPU
> > >> and taking up resources are all fences and not just the one emitted to
> > >> the hardware.
> > > I am more interested in pending on the GPU.
> > >
> > >> On the other hand when you want to measure how much time each operation
> > >> took on the hardware you need to take a look at the differences of the
> > >> signal events on each timeline.
> > > _signaled alone is not enough when the GPU is not always busy. After
> > > the last pending fence signals but before the following _init/_emit,
> > > nothing is pending.
> >
> > Yeah, I'm perfectly aware of that.
> >
> > > For all drivers except virtio-gpu, _init and "ring head update" always
> > > happen close enough that I can see why _emit is redundant. But I like
> > > having _emit as a generic tracepoint for timelines where _init and
> > > _emit can be apart, instead of requiring a special case tracepoint for
> > > each special case timeline.
> >
> > And I'm certainly not going to add _emit to all drivers just because of
> > that. As you said it is a special case for virtio-gpu and the GPU scheduler.
> >
> > And as I explained before the difference between _init and _emit
> > shouldn't matter to your visualization. The background is that as soon
> > as an dma_fence is initialized with _init it is "live" regarding the
> > dependency and memory management and exactly that's what matters for the
> > visualization.
> >
> > The latency between _init and _emit is just interesting for debugging
> > the scheduler and surprisingly virtio-gpu as well, for all other use
> > cases it is irrelevant.
>
> It might actually be *more* interesting for virtio-gpu.. unless there
> is some other way to link guest and host fences to see what the
> potential latency of guest->host is
>
> re: adding the tracepoint to other drivers, I'm fine with folks doing
> that as needed. Unless you have a better proposal about how to
> visualize init vs emit latency, I think it's fine to have an extra
> tracepoint even if it is redundant in some cases. The visualization
> tool is the customer here, we have to give it what it wants/needs.
As far as perfetto is concerned, I just use either _init or _emit on a
per-timeline basis. We can drop this patch for msm, and do not need
to change drivers whose latencies between _init/_emit are ignorable.
init vs emit latency is still interesting. I prefer keeping _init /
_emit as generic events that tools can parse, rather than adding
per-driver special cases that tools need to understand.
>
> BR,
> -R
>
> >
> > Regards,
> > Christian.
> >
> > >> So there isn't really any use case for the emit trace point, except when
> > >> you want to figure out how much latency the scheduler introduce. Then
> > >> you want to take a look at init and emit, but that isn't really that
> > >> interesting for performance analyses.
> > >>
> > >> Regards,
> > >> Christian.
> > >>
> >
More information about the dri-devel
mailing list