[Intel-gfx] [PATCH i-g-t] scripts/trace.pl: Optimize event parsing and processing

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Thu Dec 21 09:34:13 UTC 2017


On 20/12/2017 23:50, John Harrison wrote:
> On 12/20/2017 1:54 AM, Tvrtko Ursulin wrote:
>> What was the effect of all this on your big traces? I am only testing 
>> with a smaller one which goes from ~3.3s to ~2.2s. On a larger trace 
>> it might be non-linear gains due to double sort avoidance, unless 
>> there will be some other effects to cancel that out.
>>
> 
> So with a trace of a shortened gem_exec_nop/basic_sequential, the 'perf 
> script' output is 439MB and the original trace.pl before any of the 
> changes took ~180s. After the 'auto-detect field order' patch, it went 
> up to ~201s. With the optimisation patch it is down to ~129s.
> 
> However, I am also seeing differences in the HTML output since the 
> optimisation patch. The differences aren't massive, just slight 
> variations in the times. The structure is all the same, its just that 
> the accounting and/or time stamps are out.
> 
> For example:
>      {id: 1, content: 'Ring0<br><small><br>79.48% idle<br><br>34.32% 
> busy<br>584.97% runnable<br>2103.60% queued<br><br>16.18% 
> wait<br><br>200931 batches<br>331.28us avg batch<br>331.38us avg engine 
> batch<br></small>'},
> vs
>      {id: 1, content: 'Ring0<br><small><br>79.48% idle<br><br>34.32% 
> busy<br>584.97% runnable<br>2103.60% queued<br><br>16.18% 
> wait<br><br>200931 batches<br>338.56us avg batch<br>338.56us avg engine 
> batch<br></small>'},
> 
> Or:
>      {id: 58, key: -210383407,  type: 'range', group: 4, subgroup: 2, 
> subgroupOrder: 3, content: '428/3 <small>0</small> 
> <small><i>???</i></small> <small><i>++</i></small> <br>142us 
> <small>(0us)</small>', start: '2017-01-05 21:27:45.352968', end: 
> '2017-01-05 21:27:45.353110', style: 'color: white; background-color: 
> red;'},
> vs
>      {id: 58, key: -210383407,  type: 'range', group: 4, subgroup: 2, 
> subgroupOrder: 3, content: '428/3 <small>0</small> 
> <small><i>???</i></small> <small><i>++</i></small> <br>159us 
> <small>(0us)</small>', start: '2017-01-05 21:27:45.352968', end: 
> '2017-01-05 21:27:45.353127', style: 'color: white; background-color: 
> red;'},
> 
> I can send you the full output if it is useful and the source logs too. 
> The HTML output is about 840KB but as noted, the perf logs are hundreds 
> of MBs.

I was able to reproduce it. I think it's down to floating point 
mishandling. Neither the original nor the optimized version were correct 
and both are accumulating error. I'll send an updated patch shortly.

Regards,

Tvrtko


More information about the Intel-gfx mailing list