[Intel-gfx] [PATCH v2 0/4] Dynamic EU configuration of Slice/Subslice/EU.

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Nov 7 10:38:19 UTC 2018


On 06/11/2018 04:13, Ankit Navik wrote:
> drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel
> 
> The current GPU configuration code in i915 does not allow the
> EU/Slice/Sub-slice configuration to be changed dynamically. It is set only
> once, when the context is created.
> 
> While a particular graphics application is running, if we examine the command
> requests from user space, we observe that the command density is not
> consistent. This means there is scope to change the graphics configuration
> dynamically even while the context is actively running. This patch series
> proposes a solution that finds the pending load for all active contexts at a
> given time and, based on that, dynamically performs graphics configuration
> for each context.
> 
> We use an hr (high resolution) timer in the i915 kernel driver to get a
> callback every few milliseconds (the timer value can be configured through
> debugfs; the default is '0', meaning the timer is disabled, i.e. the original
> system without any intervention). In the timer callback, we examine the
> pending commands for a context in the queue; essentially, we intercept them
> before they are executed by the GPU and update the context with the required
> number of EUs.
> 
> Two questions: how did we arrive at the right timer value, and what is the
> right number of EUs? For the former, we considered empirical data to achieve
> the best performance at the least power. For the latter, we roughly
> categorized the number of EUs logically based on platform. We compare the
> number of pending commands with a particular threshold and then set the
> number of EUs accordingly when updating the context. That threshold is also
> based on experiments and findings. If the GPU is able to keep up with the
> CPU, there are typically no pending commands and the EU config remains
> unchanged. If there are more pending commands, we reprogram the context with
> a higher number of EUs. Please note that we change EUs even while the context
> is running, by examining pending commands every 'x' milliseconds.
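
The decision made on each timer callback, as described above, can be sketched in plain user-space C. This is only an illustration under stated assumptions, not code from the series: the names (`ctx_state`, `classify`, `governor_tick`), the three load classes, the thresholds and the EU counts are all hypothetical, since the cover letter says only that the real values were found empirically per platform.

```c
/* Hypothetical load classes; the cover letter does not name them. */
enum load_type { LOAD_LOW, LOAD_MEDIUM, LOAD_HIGH };

/* Hypothetical EU counts per load class (made-up values, not KBL data). */
static const unsigned int eu_for_load[] = {
	[LOAD_LOW]    = 4,
	[LOAD_MEDIUM] = 6,
	[LOAD_HIGH]   = 8,
};

/* Hypothetical pending-request thresholds (made-up values). */
#define PENDING_MEDIUM 3
#define PENDING_HIGH   6

struct ctx_state {
	unsigned int pending;  /* requests queued but not yet executed */
	unsigned int eu_count; /* EU count currently programmed */
};

/* Map a pending-request count to a load class. */
static enum load_type classify(unsigned int pending)
{
	if (pending >= PENDING_HIGH)
		return LOAD_HIGH;
	if (pending >= PENDING_MEDIUM)
		return LOAD_MEDIUM;
	return LOAD_LOW;
}

/*
 * One governor tick for one context.
 * Returns 1 if the context was "reprogrammed", 0 if left unchanged.
 */
static int governor_tick(struct ctx_state *ctx, unsigned int period_ms)
{
	unsigned int want;

	if (period_ms == 0)    /* timer disabled: original behaviour */
		return 0;
	if (ctx->pending == 0) /* GPU keeps up with CPU: leave config alone */
		return 0;

	want = eu_for_load[classify(ctx->pending)];
	if (want == ctx->eu_count)
		return 0;

	ctx->eu_count = want;
	return 1;
}
```

With these made-up numbers, a context with 7 pending requests and 4 EUs programmed would be moved to 8 EUs on the next tick, while a period of 0 or an empty queue leaves it untouched.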
> 
> With this solution in place, on KBL-GT3 + Android we saw the following pnp
> benefits; the power numbers mentioned here are system power.
> 
> App / KPI              | % Power Benefit (mW) |
> -----------------------|----------------------|
> 3D Mark (Ice storm)    | 2.30%                |
> TRex On screen         | 2.49%                |
> TRex Off screen        | 1.32%                |
> Manhattan On screen    | 3.11%                |
> Manhattan Off screen   | 0.89%                |
> AnTuTu 6.1.4           | 3.42%                |

Were you able to find some benchmarks which regress? Maybe try Synmark2 
and more from gfxbench? Not all benchmarks there are equally important, 
and regressions on some are fine, but I think a fuller set would be 
interesting to see.

Regards,

Tvrtko

> 
> Note - For KBL (GEN9) we cannot control at the sub-slice level; this was
> always a constraint. We always controlled the number of EUs rather than
> sub-slices/slices.
> 
> Praveen Diwakar (4):
>    drm/i915: Get active pending request for given context
>    drm/i915: Update render power clock state configuration for given
>      context
>    drm/i915: set optimum eu/slice/sub-slice configuration based on load
>      type
>    drm/i915: Predictive governor to control eu/slice/subslice
> 
>   drivers/gpu/drm/i915/i915_debugfs.c        | 88 +++++++++++++++++++++++++++++-
>   drivers/gpu/drm/i915/i915_drv.c            |  1 +
>   drivers/gpu/drm/i915/i915_drv.h            | 10 ++++
>   drivers/gpu/drm/i915/i915_gem_context.c    | 26 +++++++++
>   drivers/gpu/drm/i915/i915_gem_context.h    | 45 +++++++++++++++
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |  5 ++
>   drivers/gpu/drm/i915/intel_device_info.c   | 44 ++++++++++++++-
>   drivers/gpu/drm/i915/intel_lrc.c           | 20 ++++++-
>   8 files changed, 235 insertions(+), 4 deletions(-)
> 

