[Intel-gfx] [PATCH 07/10] drm/i915: Gate engine stats collection with a static key

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Oct 4 17:38:09 UTC 2017


On 03/10/2017 11:17, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2017-09-29 13:34:57)
>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>
>> This reduces the cost of the software engine busyness tracking
>> to a single no-op instruction when there are no listeners.
>>
>> We add a new i915 ordered workqueue to be used only for tasks
>> not needing struct mutex.
>>
>> v2: Rebase and some comments.
>> v3: Rebase.
>> v4: Checkpatch fixes.
>> v5: Rebase.
>> v6: Use system_long_wq to avoid being blocked by struct_mutex
>>      users.
>> v7: Fix bad conflict resolution from last rebase. (Dmitry Rogozhkin)
>> v8: Rebase.
>> v9:
>>   * Fix race between unordered enable followed by disable.
>>     (Chris Wilson)
>>   * Prettify order of local variable declarations. (Chris Wilson)
> 
> Ok, I can't see a downside to enabling the optimisation even if it will
> be global and not per-device/per-engine.

For this one I did a quick test with gem_exec_nop and saw around a
0.5% reduction in time spent in intel_lrc_irq_handler in the case where
the PMU is not active.

So it is a bit underwhelming and unless I can get different results 
after re-measuring a few times, I'd say it is not worth the complication 
of putting this in. At least it is there in history so it can be pulled 
in if needed.
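
For context, the gating idea being measured can be modelled roughly in
userspace as below. This is only an illustrative sketch, not the patch
itself: in the kernel a static key (for example via
static_branch_unlikely()) lets the disabled path be patched down to a
no-op, whereas this model uses an ordinary atomic listener count and a
branch hint, so the disabled path still costs a load and a compare. All
names here (stats_listeners, busy_events, stats_enable, stats_disable,
engine_event) are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a static key: number of active listeners. */
static atomic_int stats_listeners;

/* Hypothetical busyness counter, only touched while someone listens. */
static unsigned long busy_events;

/* Called when a listener (e.g. a PMU client) starts sampling. */
static void stats_enable(void)
{
	atomic_fetch_add(&stats_listeners, 1);
}

/* Called when a listener stops; the count keeps enable/disable balanced. */
static void stats_disable(void)
{
	atomic_fetch_sub(&stats_listeners, 1);
}

/*
 * Hot path: skip the accounting entirely when nobody is listening.
 * A real static key would turn this test into a patched jump/no-op.
 */
static void engine_event(void)
{
	int listeners = atomic_load_explicit(&stats_listeners,
					     memory_order_relaxed);

	if (__builtin_expect(listeners > 0, 0))
		busy_events++;
}
```

Calling engine_event() with no listener leaves busy_events untouched;
between stats_enable() and stats_disable() each call increments it. The
counted enable/disable is one simple way to keep several listeners from
switching the gate off underneath each other.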

Regards,

Tvrtko


