[Intel-gfx] [PATCH 2/2] drm/i915/tracepoints: Remove DRM_I915_LOW_LEVEL_TRACEPOINTS Kconfig option

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Tue Jun 26 11:24:51 UTC 2018


On 26/06/2018 11:55, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-06-26 11:46:51)
>>
>> On 25/06/2018 21:02, Chris Wilson wrote:
>>> If we know what is wanted can we define that better in terms of
>>> dma_fence and leave lowlevel for debugging (or think of how we achieve
>>> the same with generic bpf? kprobes)? Hmm, I wonder how far we can push
>>> that.
>>
>> What is wanted is, for instance, to take trace.pl on any kernel anywhere and
>> have it deduce/draw the exact metrics/timeline of command
>> submission for a workload.
>>
>> At the moment, without low level tracepoints and without the
>> intel_engine_notify tweak, it is workload dependent how close it
>> could get.
> 
> Interjecting what dma-fence already has (or we could use), not sure how
> well userspace can actually map it to their timelines.
>>
>> So a set of tracepoints to allow drawing the timeline:
>>
>> 1. request_queue (or _add)
> dma_fence_init
> 
>> 2. request_submit
> 
>> 3. intel_engine_notify
> For obvious reasons, no match in dma_fence.
> 
>> 4. request_in
> dma_fence_emit
> 
>> 5. request_out
> dma_fence_signal (similar, not quite, we would have to force irq
> signaling).

Yes, not quite the same, due to the potential time shift between the user
interrupt and the dma_fence_signal call via different paths.
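
To illustrate what "drawing the timeline" from the five events above could
look like, here is a minimal sketch. The event tuples and the ctx/seqno keys
are assumptions made for this example: it is not the real perf script output
format, not what trace.pl does, and it pretends intel_engine_notify can be
attributed to a single request.

```python
from collections import defaultdict

# Hypothetical, already-parsed events: (timestamp_us, event_name, ctx, seqno).
EVENTS = [
    (100, "request_queue", 1, 1),
    (120, "request_submit", 1, 1),
    (150, "request_in", 1, 1),
    (400, "intel_engine_notify", 1, 1),
    (410, "request_out", 1, 1),
]


def build_timelines(events):
    """Group events by (ctx, seqno), keeping the first timestamp per event."""
    timelines = defaultdict(dict)
    for ts, name, ctx, seqno in sorted(events):
        timelines[(ctx, seqno)].setdefault(name, ts)
    return dict(timelines)


def request_metrics(tl):
    """Derive intervals for one request; None when an event is missing."""
    def delta(start, end):
        return tl[end] - tl[start] if start in tl and end in tl else None

    return {
        "queued": delta("request_queue", "request_submit"),
        "runnable": delta("request_submit", "request_in"),
        "on_hw": delta("request_in", "intel_engine_notify"),
        "notify_to_out": delta("intel_engine_notify", "request_out"),
    }


metrics = {key: request_metrics(tl)
           for key, tl in build_timelines(EVENTS).items()}
```

The "notify_to_out" interval is where the time shift mentioned above would
show up if request_out were replaced by something signalled via another path.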

>   
>> With this set the above is possible and we don't need a lot of work to
>> get there.
> 
> From a brief glance we are missing a dma_fence_queue for request_submit
> replacement.
> 
> So next question is what information do we get from our tracepoints (or
> more precisely do you use) that we lack in dma_fence?

Port=%u and preemption (completed=%u) come immediately to mind. A way to
tie in with engines would be nice, or else it is all abstract timelines.

Going in this direction sounds like a long detour to get to where we almost
are. I suspect you are valuing the benefit of it being generic, and hence
a parsing tool could be cross-driver. But you can also just punt the
"abstractising" into the parsing tool.

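As a rough sketch of punting the abstraction into the tool: a per-source
table translates tracepoint names into generic phases, following the pairing
suggested in the replies above. The phase names and the table layout are
invented for illustration, and the dma_fence names are spelled as in this
thread rather than checked against the actual tracepoint names.

```python
# Generic phases a cross-driver tool might reason about (made-up names).
GENERIC_PHASES = ("queued", "submitted", "started", "completed")

# Per-source translation tables. intel_engine_notify is left out on purpose:
# it has no dma_fence counterpart.
PHASE_MAP = {
    "i915": {
        "request_queue": "queued",
        "request_submit": "submitted",
        "request_in": "started",
        "request_out": "completed",
    },
    "dma_fence": {
        "dma_fence_init": "queued",
        "dma_fence_emit": "started",
        "dma_fence_signal": "completed",  # similar, not quite
    },
}


def normalise(source, events):
    """Translate (timestamp, event_name) pairs; drop events with no mapping."""
    table = PHASE_MAP[source]
    return [(ts, table[name]) for ts, name in events if name in table]


def missing_phases(source):
    """List the generic phases a source has no event for."""
    covered = set(PHASE_MAP[source].values())
    return [phase for phase in GENERIC_PHASES if phase not in covered]
```

With these tables, missing_phases("dma_fence") reports the "submitted" phase,
which is the request_submit gap noted above.
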
>> And with the Virtual Engine it will become more interesting to have
>> this. So if we had a bug report saying load balancing is not working
>> well, we could just say "please run it via trace.pl --trace and attach
>> perf script output". That way we could easily see whether or not it is a
>> problem in userspace behaviour or something else.
> 
> And there I was wanting a script to capture the workload so that we
> could replay it and dissect it. :-p

Depends on what level you want that at. Perf script output from the above
tracepoints would do on one level. If you wanted a higher level to
re-exercise load balancing then it wouldn't be quite enough, or at
least a lot of guesswork would be needed.

Regards,

Tvrtko

