[Intel-gfx] [PATCH i-g-t] scripts/trace.pl: Optimize event parsing and processing

John Harrison John.C.Harrison at Intel.com
Thu Dec 21 22:44:32 UTC 2017


On 12/21/2017 1:34 AM, Tvrtko Ursulin wrote:
>
> On 20/12/2017 23:50, John Harrison wrote:
>> On 12/20/2017 1:54 AM, Tvrtko Ursulin wrote:
>>> What was the effect of all this on your big traces? I am only 
>>> testing with a smaller one which goes from ~3.3s to ~2.2s. On a 
>>> larger trace it might be non-linear gains due to double sort 
>>> avoidance, unless there will be some other effects to cancel that out.
>>>
>>
>> So with a trace of a shortened gem_exec_nop/basic_sequential, the 
>> 'perf script' output is 439MB and the original trace.pl before any of 
>> the changes took ~180s. After the 'auto-detect field order' patch, it 
>> went up to ~201s. With the optimisation patch it is down to ~129s.
>>
>> However, I am also seeing differences in the HTML output since the 
>> optimisation patch. The differences aren't massive, just slight 
>> variations in the times. The structure is all the same, its just that 
>> the accounting and/or time stamps are out.
>>
>> For example:
>>      {id: 1, content: 'Ring0<br><small><br>79.48% idle<br><br>34.32% 
>> busy<br>584.97% runnable<br>2103.60% queued<br><br>16.18% 
>> wait<br><br>200931 batches<br>331.28us avg batch<br>331.38us avg 
>> engine batch<br></small>'},
>> vs
>>      {id: 1, content: 'Ring0<br><small><br>79.48% idle<br><br>34.32% 
>> busy<br>584.97% runnable<br>2103.60% queued<br><br>16.18% 
>> wait<br><br>200931 batches<br>338.56us avg batch<br>338.56us avg 
>> engine batch<br></small>'},
>>
>> Or:
>>      {id: 58, key: -210383407,  type: 'range', group: 4, subgroup: 2, 
>> subgroupOrder: 3, content: '428/3 <small>0</small> 
>> <small><i>???</i></small> <small><i>++</i></small> <br>142us 
>> <small>(0us)</small>', start: '2017-01-05 21:27:45.352968', end: 
>> '2017-01-05 21:27:45.353110', style: 'color: white; background-color: 
>> red;'},
>> vs
>>      {id: 58, key: -210383407,  type: 'range', group: 4, subgroup: 2, 
>> subgroupOrder: 3, content: '428/3 <small>0</small> 
>> <small><i>???</i></small> <small><i>++</i></small> <br>159us 
>> <small>(0us)</small>', start: '2017-01-05 21:27:45.352968', end: 
>> '2017-01-05 21:27:45.353127', style: 'color: white; background-color: 
>> red;'},
>>
>> I can send you the full output if it is useful and the source logs 
>> too. The HTML output is about 840KB but as noted, the perf logs are 
>> hundreds of MBs.
>
> I was able to reproduce it. I think it's down to floating point 
> mishandling. Neither the original nor the optimized version were 
> correct and both are accumulating error. I'll send an updated patch 
> shortly.
>
> Regards,
>
> Tvrtko

The new version with +0.5 gives me exactly the same results as the 
previous edition without the rounding. If I get rid of the int() 
conversion too, I still get the same results.

What does make a difference is if I fix up the broken processing of 
notify traces:
diff --git a/scripts/trace.pl b/scripts/trace.pl
index 8dbd497b..20ae3c78 100755
--- a/scripts/trace.pl
+++ b/scripts/trace.pl
@@ -382,14 +382,16 @@ while (<>) {

         next if exists $tp{'ring'} and exists $ignore_ring{$tp{'ring'}};

-       if (exists $tp{'ring'} and exists $tp{'ctx'} and exists 
$tp{'seqno'}) {
+       if (exists $tp{'ring'} and exists $tp{'seqno'}) {
                 $ring = $tp{'ring'};
-               $ctx = $tp{'ctx'};
                 $seqno = $tp{'seqno'};

-               $orig_ctx = $ctx;
-               $ctx = sanitize_ctx($ctx, $ring);
-               $key = db_key($ring, $ctx, $seqno);
+               if (exists $tp{'ctx'}) {
+                       $ctx = $tp{'ctx'};
+                       $orig_ctx = $ctx;
+                       $ctx = sanitize_ctx($ctx, $ring);
+                       $key = db_key($ring, $ctx, $seqno);
+               }
         }

         if ($tp_name eq 'i915:i915_gem_request_wait_begin:') {
@@ -461,7 +463,7 @@ while (<>) {
                 }
                 $db{$key}->{'duration'} = $db{$key}->{'notify'} - 
$db{$key}->{'start'};
                 $db{$key}->{'context-complete-delay'} = 
$db{$key}->{'end'} - $db{$key}->{'notify'};
-       } elsif ($tp_name eq 'intel_engine_notify:') {
+       } elsif ($tp_name eq 'i915:intel_engine_notify:') {
                 $notify{global_key($ring, $seqno)} = $time;
         } elsif ($tp_name eq 'i915:intel_gpu_freq_change:') {
                 push @freqs, [$prev_freq_ts, $time, $prev_freq] if 
$prev_freq;

With that, I get exactly the same HTML page as before the optimisation 
patch.

John.



More information about the Intel-gfx mailing list