[PATCH v2] drm/msm: add trace_dma_fence_emit to msm_gpu_submit

Rob Clark robdclark at chromium.org
Wed Apr 27 16:07:06 UTC 2022


On Tue, Apr 26, 2022 at 11:20 PM Christian König
<christian.koenig at amd.com> wrote:
>
> > On 26.04.22 at 20:50, Chia-I Wu wrote:
> > On Tue, Apr 26, 2022 at 11:02 AM Christian König
> > <christian.koenig at amd.com> wrote:
> >> On 26.04.22 at 19:40, Chia-I Wu wrote:
> >>> [SNIP]
> >>>>>> Well I just sent a patch to completely remove the trace point.
> >>>>>>
> >>>>>> As I said it absolutely doesn't make sense to use this for
> >>>>>> visualization; that's what the trace_dma_fence_init trace point is good for.
> >>> I am a bit confused by this.  _emit and _signaled are a great way to
> >>> see how many fences are pending from the CPU's point of view.  How does
> >>> _emit make no sense while _init is good instead?
> >> We had exactly that confusion now multiple times and it's one of the
> >> main reasons why I want to remove the _emit trace point.
> >>
> >> See, when you want to know how many fences are pending you need to
> >> watch out for init/destroy and *NOT* emit.
> >>
> >> The reason is that in the special case where emit makes sense (e.g. the
> >> GPU scheduler fences) emit comes later than init, but what is pending on
> >> the CPU and taking up resources is all fences, not just the ones emitted
> >> to the hardware.
> > I am more interested in pending on the GPU.
> >
> >> On the other hand, when you want to measure how much time each operation
> >> took on the hardware, you need to look at the deltas between the
> >> signal events on each timeline.
> > _signaled alone is not enough when the GPU is not always busy.  After
> > the last pending fence signals but before the following _init/_emit,
> > nothing is pending.
>
> Yeah, I'm perfectly aware of that.
>
> > For all drivers except virtio-gpu, _init and "ring head update" always
> > happen close enough that I can see why _emit is redundant.  But I like
> > having _emit as a generic tracepoint for timelines where _init and
> > _emit can be apart, instead of requiring a special case tracepoint for
> > each special case timeline.
>
> And I'm certainly not going to add _emit to all drivers just because of
> that. As you said it is a special case for virtio-gpu and the GPU scheduler.
>
> And as I explained before, the difference between _init and _emit
> shouldn't matter to your visualization. The background is that as soon
> as a dma_fence is initialized with _init it is "live" regarding
> dependency and memory management, and exactly that is what matters for
> the visualization.
>
> The latency between _init and _emit is only interesting for debugging
> the scheduler, and surprisingly virtio-gpu as well; for all other use
> cases it is irrelevant.

It might actually be *more* interesting for virtio-gpu.. unless there
is some other way to link guest and host fences to see what the
potential guest->host latency is.
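
Fwiw, within the guest a tool can already pair _init, _emit and
_signaled for the same fence, since all the dma_fence tracepoints share
one event class keyed by driver/timeline/context/seqno.  Roughly, from
include/trace/events/dma_fence.h (abbreviated, from memory rather than
quoted verbatim):

/* shared by dma_fence_init/_emit/_signaled/_destroy/... */
DECLARE_EVENT_CLASS(dma_fence,
    TP_PROTO(struct dma_fence *fence),
    TP_ARGS(fence),
    TP_STRUCT__entry(
        __string(driver, fence->ops->get_driver_name(fence))
        __string(timeline, fence->ops->get_timeline_name(fence))
        __field(unsigned int, context)
        __field(unsigned int, seqno)
    ),
    TP_fast_assign(
        __assign_str(driver, fence->ops->get_driver_name(fence));
        __assign_str(timeline, fence->ops->get_timeline_name(fence));
        __entry->context = fence->context;
        __entry->seqno = fence->seqno;
    ),
    TP_printk("driver=%s timeline=%s context=%u seqno=%u",
              __get_str(driver), __get_str(timeline),
              __entry->context, __entry->seqno)
);

DEFINE_EVENT(dma_fence, dma_fence_emit,
    TP_PROTO(struct dma_fence *fence),
    TP_ARGS(fence)
);

The guest->host linkage is the part that has no obvious equivalent in
these events.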

re: adding the tracepoint to other drivers, I'm fine with folks doing
that as needed.  Unless you have a better proposal for how to
visualize init vs emit latency, I think it's fine to have an extra
tracepoint even if it is redundant in some cases.  The visualization
tool is the customer here, so we have to give it what it wants/needs.
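
For reference, the change itself is tiny.  Roughly what it does (the
hw_fence field name and the exact placement below are from memory, not
quoted from the diff):

/* drivers/gpu/drm/msm/msm_gpu.c */
#include <trace/events/dma_fence.h>

/* in msm_gpu_submit(), right before the job is handed to the ring: */
trace_dma_fence_emit(submit->hw_fence);
gpu->funcs->submit(gpu, submit);

That gives the per-fence init->emit delta on the msm timeline, which is
exactly the latency the tool wants to visualize.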

BR,
-R

>
> Regards,
> Christian.
>
> >> So there isn't really any use case for the emit trace point, except when
> >> you want to figure out how much latency the scheduler introduces. Then
> >> you want to take a look at init and emit, but that isn't really that
> >> interesting for performance analysis.
> >>
> >> Regards,
> >> Christian.
> >>
>

