[Intel-gfx] [PATCH 2/2] drm/i915/tracepoints: Remove DRM_I915_LOW_LEVEL_TRACEPOINTS Kconfig option

Wed Aug 8 12:56:01 UTC 2018

+Joonas

On 08/08/2018 13:42, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-08-08 13:13:08)
>>
>> On 26/06/2018 12:48, Chris Wilson wrote:
>>> It's just that this is about the third time this has been raised in the
>>> last couple of weeks with the other two requests being from a generic
>>> tooling pov (Eric Anholt for gnome-shell tweaking, and someone
>>> else looking for a gpuvis-like tool). So it seems like there is
>>> interest, even if I doubt that it'll help answer any questions beyond
>>> what you can just extract from looking at userspace. (Imo, the only
>>> people these tracepoints are useful for are people writing patches for
>>> the driver. For everyone else, you can just observe system behaviour and
>>> optimise your code for your workload. Otoh, can one trust a black
>>> box, argh.)
>>
>> Some of the things might be obtainable purely from userspace via heavily
>> instrumented builds, which may be in the realm of the possible during
>> development, but I don't think it is feasible in general both because it
>> is too involved, and because it would preclude existence of tools which
>> can trace any random client.
>>
>>> To have a second set of nearly equivalent tracepoints, we need to have
>>> strong justification why we couldn't just use or extend the generic set.
>>
>> I was hoping that the conversation so far established that nearly
>> equivalent is not close enough for intended use cases. And that it is not
>> possible to make the generic ones so.
> 
> (I just don't see the point of those use cases. I trace the kernel to
> fix the kernel...)

Yes and with virtual engine we will have a bigger reason to trace the 
kernel with a random client.

>   
>>> Plus I feel a lot more comfortable exporting a set of generic
>>> tracepoints, than those where we may be leaking more knowledge of the HW
>>> than we can reasonably expect to support for the indefinite future.
>>
>> I think it is accepted we cannot guarantee low level tracepoints will be
>> supportable in the future world of GuC scheduling. (How and what we will
>> do there is yet unresolved.) But at least we get much better usability
>> for platforms up to there, and for very small effort. The idea is not to
>> mark these as ABI but just improve user experience.
>>
>> You are, I suppose, worried that if these tracepoints disappeared due to
>> being un-implementable someone will complain?
> 
> They already do...
>   
>> I just want that anyone can run trace.pl and see how virtual engine
>> behaves, without having to recompile the kernel. And VTune people want
>> the same for their enterprise-level customers. Both tools are ready to
>> adapt should it be required. It's, I repeat, just usability and user
>> experience out of the box.
> 
> The out-of-the-box user experience should not require the use of such
> tools in the first place! If they are trying to work around the kernel
> (and that's the only use of this information I see) we have bugs
> aplenty.
> 
> [snip because I repeated myself]
> 
> I think my issues boil down to:
> 
>   1 - people will complain no matter what (when it changes, when it is no
>       longer available)
> 
>   2 - people will use it to work around, not fix; the information about kernel
>       behaviour should only be used with a view to fixing that behaviour
> 
> As such, I am quite happy to have it limited to driver developers that
> want to fix issues at source (OpenCL, I'm looking at you). There's tons
> of other user observable information out there for tuning userspace,
> why does the latency of runnable->queued matter if you will not do anything
> about it? Other things like dependency graphs, if you can't keep control
> of your own fences, you've already lost.

This is true, no disagreement. My point simply was that we can provide 
this info easily to anyone. There is a little bit of analogy with perf 
scheduler tracing/map etc.

> I don't see any value in giving the information away, just the cost. If
> you can convince Joonas of its merit, and if we can define just exactly
> what ABI it constitutes, then I'd be happy to be the one who says "I
> told you so" in the future for a change.

I think Joonas was okay in principle that we soft-commit to _trying_ to 
keep _some_ tracepoints stable-ish (where it makes sense and after some 
discussion for each) if IGT also materializes which auto-pings us (via 
CI) when we break one of them. But I may be misremembering, so Joonas, 
please comment.
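
To make the "auto-pings us when we break one of them" idea concrete, here is a rough sketch of the kind of check such a test could perform. It is not IGT code and not an agreed design. It only assumes the standard tracefs per-event "format" file layout. The event name, the field list and the helper names (tracepoint_fields, missing_fields) are hypothetical, chosen for illustration.

```python
import re

# Matches tracefs "format" lines such as:
#   "\tfield:u32 seqno;\toffset:12;\tsize:4;\tsigned:0;"
FIELD_RE = re.compile(r"^\s*field:(?P<decl>[^;]+);", re.MULTILINE)


def tracepoint_fields(format_text):
    """Return the set of field names declared in a tracefs 'format' file."""
    names = set()
    for match in FIELD_RE.finditer(format_text):
        decl = match.group("decl").strip()
        # The last token of the declaration is the field name; drop any
        # array suffix ("comm[16]") or pointer prefix ("*ptr").
        name = decl.split()[-1].split("[")[0].lstrip("*")
        names.add(name)
    return names


def missing_fields(format_text, expected):
    """Return the expected field names that the event no longer exposes."""
    return sorted(set(expected) - tracepoint_fields(format_text))


# Illustrative sample only: the event name and fields are made up and are
# not a statement of what any i915 tracepoint actually exports.
SAMPLE_FORMAT = """name: example_request_in
ID: 1234
format:
\tfield:unsigned short common_type;\toffset:0;\tsize:2;\tsigned:0;
\tfield:int common_pid;\toffset:4;\tsize:4;\tsigned:1;

\tfield:u32 ctx;\toffset:8;\tsize:4;\tsigned:0;
\tfield:u32 seqno;\toffset:12;\tsize:4;\tsigned:0;

print fmt: "ctx=%u, seqno=%u", REC->ctx, REC->seqno
"""

if __name__ == "__main__":
    # A tool expecting "port" would be flagged as broken by this change.
    print(missing_fields(SAMPLE_FORMAT, ["ctx", "seqno", "port"]))
```

On a real system the text would be read from the event's format file under the tracefs events directory rather than from a sample string, and a non-empty result would be what makes CI complain.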

Regards,

Tvrtko