[Intel-gfx] [PATCH 07/10] drm/i915: Gate engine stats collection with a static key
Chris Wilson
chris at chris-wilson.co.uk
Wed Oct 4 17:49:16 UTC 2017
Quoting Tvrtko Ursulin (2017-10-04 18:38:09)
>
> On 03/10/2017 11:17, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2017-09-29 13:34:57)
> >> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>
> >> This reduces the cost of the software engine busyness tracking
> >> to a single no-op instruction when there are no listeners.
> >>
> >> We add a new i915 ordered workqueue to be used only for tasks
> >> not needing struct mutex.
> >>
> >> v2: Rebase and some comments.
> >> v3: Rebase.
> >> v4: Checkpatch fixes.
> >> v5: Rebase.
> >> v6: Use system_long_wq to avoid being blocked by struct_mutex
> >> users.
> >> v7: Fix bad conflict resolution from last rebase. (Dmitry Rogozhkin)
> >> v8: Rebase.
> >> v9:
> >> * Fix race between unordered enable followed by disable.
> >> (Chris Wilson)
> >> * Prettify order of local variable declarations. (Chris Wilson)
> >
> > Ok, I can't see a downside to enabling the optimisation even if it will
> > be global and not per-device/per-engine.
>
> For this one I did a quick test with gem_exec_nop and I've seen around
> 0.5% reduction in time spent in intel_lrc_irq_handler in the case where
> PMU is not active.
Hmm, gem_exec_nop isn't going to be favourable as there we are just
extending the busyness coverage of an engine. I think you want something
like gem_sync/sequential (or gem_exec_whisper), as there each engine
will be starting and stopping, and delays between engines will
accumulate.
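
For anyone following along, the gating being measured can be modelled in
userspace roughly as below. This is only a sketch under stated assumptions:
the names (stats_listeners, engine_busy, engine_idle and so on) are
hypothetical and not taken from the patch, and a plain atomic counter stands
in for the static key, so the disabled path here is a load and compare rather
than the patched no-op the commit message describes. It does show why the
cost only matters at busy/idle transitions, which is what gem_sync-style
tests exercise.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a static key: a count of active listeners.
 * The kernel's static key patches the branch itself; this sketch only
 * models the enable/disable reference counting and the early return.
 */
static atomic_int stats_listeners;

struct engine_stats {
	uint64_t start_ns;	/* timestamp of the last idle -> busy transition */
	uint64_t total_ns;	/* accumulated busy time */
	bool active;		/* true between engine_busy() and engine_idle() */
};

static void stats_enable(void)
{
	atomic_fetch_add(&stats_listeners, 1);
}

static void stats_disable(void)
{
	int prev = atomic_fetch_sub(&stats_listeners, 1);

	/* Every disable must pair with an earlier enable. */
	assert(prev > 0);
	(void)prev;
}

static inline bool stats_enabled(void)
{
	return atomic_load_explicit(&stats_listeners,
				    memory_order_relaxed) > 0;
}

/* Called on each idle -> busy transition of an engine. */
static void engine_busy(struct engine_stats *s, uint64_t now_ns)
{
	if (!stats_enabled())
		return;		/* no listeners: skip all bookkeeping */

	s->start_ns = now_ns;
	s->active = true;
}

/*
 * Called on each busy -> idle transition. The sketch ignores the case
 * where listeners come or go while the engine is busy.
 */
static void engine_idle(struct engine_stats *s, uint64_t now_ns)
{
	if (!stats_enabled() || !s->active)
		return;

	s->total_ns += now_ns - s->start_ns;
	s->active = false;
}
```

With no listeners both hooks return immediately and total_ns stays at zero;
between stats_enable() and stats_disable() each busy/idle pair adds its
duration. A workload that keeps the engine continuously busy hits these hooks
rarely, while one that starts and stops each engine hits them on every
request.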
-Chris