[Intel-gfx] [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU.

Joonas Lahtinen joonas.lahtinen at linux.intel.com
Thu Sep 27 14:02:21 UTC 2018


+ Tvrtko for adding the right media contacts

Quoting kedar.j.karanje at intel.com (2018-09-21 12:13:46)
> From: "Kedar J. Karanje" <kedar.j.karanje at intel.com>
> 
> drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel
> 
> Current GPU configuration code for i915 does not allow us to change the
> EU/Slice/Sub-slice configuration dynamically. It is done only once, when the
> context is created.
> 
> While a particular graphics application is running, if we examine the command
> requests from user space, we observe that command density is not consistent.
> This means there is scope to change the graphics configuration dynamically,
> even while a context is actively running. This patch series proposes a
> solution that finds the pending load for all active contexts at a given time
> and, based on that, dynamically performs graphics configuration for each
> context.
> 
> We use an hr (high resolution) timer in the i915 driver to get a callback
> every few milliseconds (the timer value can be configured through debugfs;
> the default is '0', meaning the timer is disabled, i.e. the original system
> without any intervention). In the timer callback, we examine the pending
> commands for a context in the queue. Essentially, we intercept them before
> they are executed by the GPU and update the context with the required number
> of EUs.

Being off by default and enabled only through debugfs would make this
effectively dead code upstream, as debugfs is off limits on many
production systems because of security concerns.

So the algorithm should really be generic enough to be on by default
without regressing existing workloads. Otherwise there would need to
be some tuning tool to control the tables outside of debugfs. cgroups
could be one place, but the effort to bring in cgroups doesn't seem to
be advancing; please see the mailing list archive.

There's also an ongoing series about allowing userspace to control the
said SSEU register freely for Media, so I'd hope we could consolidate on
one control mechanism.

Regards, Joonas

> 
> Two questions: how did we arrive at the right timer value, and what is the
> right number of EUs? For the former, we considered empirical data to achieve
> the best performance at the least power. For the latter, we roughly
> categorized the number of EUs logically based on platform. We then compare
> the number of pending commands with a particular threshold and set the number
> of EUs accordingly with a context update. That threshold is also based on
> experiments and findings. If the GPU is able to keep up with the CPU, there
> are typically no pending commands and the EU config remains unchanged. If
> there are more pending commands, we reprogram the context with a higher
> number of EUs. Please note that here we are changing EUs even while the
> context is running, by examining pending commands every 'x' milliseconds.
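
The threshold scheme described above can be sketched in plain user-space C.
This is only a rough model of the decision step, not the code from the
patches: the struct `eu_tier`, the function `pick_eu_count()`, the table
`example_tiers` and every number in it are hypothetical and chosen purely for
illustration. The real series would run this kind of check from an hrtimer
callback inside i915 and then reprogram the context, which is not modelled
here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical load tier: the cover letter only says that EU counts were
 * "roughly categorized" per platform and compared against a threshold. */
struct eu_tier {
	unsigned int min_pending; /* lowest pending-request count for this tier */
	unsigned int eu_count;    /* EU count to request when this tier applies */
};

/* Illustrative values only (assumption), sorted by ascending min_pending. */
const struct eu_tier example_tiers[] = {
	{ 1, 4 }, /* light load  */
	{ 3, 6 }, /* medium load */
	{ 6, 8 }, /* heavy load  */
};

#define N_EXAMPLE_TIERS (sizeof(example_tiers) / sizeof(example_tiers[0]))

/*
 * Decide the EU count for a context from its number of pending requests.
 * With nothing pending the current configuration is kept, mirroring the
 * "EU config would remain unchanged" case in the description. Otherwise the
 * highest tier whose threshold is met wins.
 */
unsigned int pick_eu_count(unsigned int pending, unsigned int current_eus,
			   const struct eu_tier *tiers, size_t n_tiers)
{
	unsigned int chosen = current_eus;
	size_t i;

	assert(tiers != NULL && n_tiers > 0);

	if (pending == 0)
		return current_eus;

	for (i = 0; i < n_tiers; i++) {
		if (pending >= tiers[i].min_pending)
			chosen = tiers[i].eu_count;
	}

	return chosen;
}
```

A periodic callback would call `pick_eu_count()` once per active context with
that context's pending count and only touch the hardware state when the
returned value differs from the current one. A fixed table like this is also
what would need to be exposed for tuning if the feature were not generic
enough to be on by default.
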
> 
> With this solution in place, on KBL-GT3 + Android we saw the following PnP
> benefits without any performance degradation. The power numbers mentioned
> here are system power.
> 
> App / KPI              | % Power |
>                        | Benefit |
>                        |  (mW)   |
> -----------------------|---------|
> 3D Mark (Ice storm)    | 2.30%   |
> TRex On screen         | 2.49%   |
> TRex Off screen        | 1.32%   |
> Manhattan On screen    | 3.11%   |
> Manhattan Off screen   | 0.89%   |
> AnTuTu 6.1.4           | 3.42%   |
> 
> Note: for KBL (GEN9) we cannot control at the sub-slice level; this has
> always been a constraint. We always controlled the number of EUs rather than
> sub-slices/slices.
> 
> 
> Praveen Diwakar (4):
>   drm/i915: Get active pending request for given context
>   drm/i915: Update render power clock state configuration for given
>     context
>   drm/i915: set optimum eu/slice/sub-slice configuration based on load
>     type
>   drm/i915: Predictive governor to control eu/slice/subslice based on
>     workload
> 
>  drivers/gpu/drm/i915/i915_debugfs.c        | 94 +++++++++++++++++++++++++++++-
>  drivers/gpu/drm/i915/i915_drv.c            |  1 +
>  drivers/gpu/drm/i915/i915_drv.h            |  6 ++
>  drivers/gpu/drm/i915/i915_gem_context.c    | 52 +++++++++++++++++
>  drivers/gpu/drm/i915/i915_gem_context.h    | 52 +++++++++++++++++
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  5 ++
>  drivers/gpu/drm/i915/intel_lrc.c           | 47 +++++++++++++++
>  7 files changed, 256 insertions(+), 1 deletion(-)
> 
> --
> 2.7.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

