[Intel-gfx] [PATCH i-g-t] scripts/trace.pl: Optimize event parsing and processing
John Harrison
John.C.Harrison at Intel.com
Thu Dec 21 22:44:32 UTC 2017
On 12/21/2017 1:34 AM, Tvrtko Ursulin wrote:
>
> On 20/12/2017 23:50, John Harrison wrote:
>> On 12/20/2017 1:54 AM, Tvrtko Ursulin wrote:
>>> What was the effect of all this on your big traces? I am only
>>> testing with a smaller one which goes from ~3.3s to ~2.2s. On a
>>> larger trace it might be non-linear gains due to double sort
>>> avoidance, unless there will be some other effects to cancel that out.
>>>
>>
>> So with a trace of a shortened gem_exec_nop/basic_sequential, the
>> 'perf script' output is 439MB and the original trace.pl before any of
>> the changes took ~180s. After the 'auto-detect field order' patch, it
>> went up to ~201s. With the optimisation patch it is down to ~129s.
>>
>> However, I am also seeing differences in the HTML output since the
>> optimisation patch. The differences aren't massive, just slight
>> variations in the times. The structure is all the same, its just that
>> the accounting and/or time stamps are out.
>>
>> For example:
>> {id: 1, content: 'Ring0<br><small><br>79.48% idle<br><br>34.32%
>> busy<br>584.97% runnable<br>2103.60% queued<br><br>16.18%
>> wait<br><br>200931 batches<br>331.28us avg batch<br>331.38us avg
>> engine batch<br></small>'},
>> vs
>> {id: 1, content: 'Ring0<br><small><br>79.48% idle<br><br>34.32%
>> busy<br>584.97% runnable<br>2103.60% queued<br><br>16.18%
>> wait<br><br>200931 batches<br>338.56us avg batch<br>338.56us avg
>> engine batch<br></small>'},
>>
>> Or:
>> {id: 58, key: -210383407, type: 'range', group: 4, subgroup: 2,
>> subgroupOrder: 3, content: '428/3 <small>0</small>
>> <small><i>???</i></small> <small><i>++</i></small> <br>142us
>> <small>(0us)</small>', start: '2017-01-05 21:27:45.352968', end:
>> '2017-01-05 21:27:45.353110', style: 'color: white; background-color:
>> red;'},
>> vs
>> {id: 58, key: -210383407, type: 'range', group: 4, subgroup: 2,
>> subgroupOrder: 3, content: '428/3 <small>0</small>
>> <small><i>???</i></small> <small><i>++</i></small> <br>159us
>> <small>(0us)</small>', start: '2017-01-05 21:27:45.352968', end:
>> '2017-01-05 21:27:45.353127', style: 'color: white; background-color:
>> red;'},
>>
>> I can send you the full output if it is useful and the source logs
>> too. The HTML output is about 840KB but as noted, the perf logs are
>> hundreds of MBs.
>
> I was able to reproduce it. I think it's down to floating point
> mishandling. Neither the original nor the optimized version were
> correct and both are accumulating error. I'll send an updated patch
> shortly.
>
> Regards,
>
> Tvrtko
The new version with +0.5 gives me exactly the same results as the
previous edition without the rounding. If I get rid of the int()
conversion too, I still get the same results.
What does make a difference is if I fix up the broken processing of
notify traces:
diff --git a/scripts/trace.pl b/scripts/trace.pl
index 8dbd497b..20ae3c78 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -382,14 +382,16 @@ while (<>) {
next if exists $tp{'ring'} and exists $ignore_ring{$tp{'ring'}};
- if (exists $tp{'ring'} and exists $tp{'ctx'} and exists
$tp{'seqno'}) {
+ if (exists $tp{'ring'} and exists $tp{'seqno'}) {
$ring = $tp{'ring'};
- $ctx = $tp{'ctx'};
$seqno = $tp{'seqno'};
- $orig_ctx = $ctx;
- $ctx = sanitize_ctx($ctx, $ring);
- $key = db_key($ring, $ctx, $seqno);
+ if (exists $tp{'ctx'}) {
+ $ctx = $tp{'ctx'};
+ $orig_ctx = $ctx;
+ $ctx = sanitize_ctx($ctx, $ring);
+ $key = db_key($ring, $ctx, $seqno);
+ }
}
if ($tp_name eq 'i915:i915_gem_request_wait_begin:') {
@@ -461,7 +463,7 @@ while (<>) {
}
$db{$key}->{'duration'} = $db{$key}->{'notify'} -
$db{$key}->{'start'};
$db{$key}->{'context-complete-delay'} =
$db{$key}->{'end'} - $db{$key}->{'notify'};
- } elsif ($tp_name eq 'intel_engine_notify:') {
+ } elsif ($tp_name eq 'i915:intel_engine_notify:') {
$notify{global_key($ring, $seqno)} = $time;
} elsif ($tp_name eq 'i915:intel_gpu_freq_change:') {
push @freqs, [$prev_freq_ts, $time, $prev_freq] if
$prev_freq;
With that, I get exactly the same HTML page as before the optimisation
patch.
John.
More information about the Intel-gfx
mailing list