[Intel-gfx] [PATCH v2 0/4] Dynamic EU configuration of Slice/Subslice/EU.
Navik, Ankit P
ankit.p.navik at intel.com
Tue Dec 11 09:58:46 UTC 2018
Hi Tvrtko,
> On Wed, Nov 7, 2018 at 4:08 PM Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com> wrote:
>
>
> On 06/11/2018 04:13, Ankit Navik wrote:
> > drm/i915: Context aware user agnostic EU/Slice/Sub-slice control
> > within kernel
> >
> > Current GPU configuration code for i915 does not allow us to change
> > EU/Slice/Sub-slice configuration dynamically. It is done only once, when
> > the context is created.
> >
> > While a particular graphics application is running, if we examine the
> > command requests from user space, we observe that the command density
> > is not consistent. It means there is scope to change the graphics
> > configuration dynamically even while the context is actively running.
> > This patch series proposes a solution that finds the pending load for
> > all active contexts at a given time and, based on that, dynamically
> > performs graphics configuration for each context.
> >
> > We use a hr (high resolution) timer with the i915 driver in the kernel
> > to get a callback every few milliseconds (this timer value can be
> > configured through debugfs; the default is '0', indicating the timer is
> > disabled, i.e. the original system without any intervention). In the
> > timer callback, we examine the pending commands for a context in the
> > queue; essentially, we intercept them before they are executed by the
> > GPU and we update the context with the required number of EUs.
> >
> > Two questions: how did we arrive at the right timer value, and what is
> > the right number of EUs? For the former, empirical data to achieve the
> > best performance at the least power was considered. For the latter, we
> > roughly categorized the number of EUs logically based on platform. Now
> > we compare the number of pending commands with a particular threshold
> > and then set the number of EUs accordingly with an update of the
> > context. That threshold is also based on experiments and findings. If
> > the GPU is able to catch up with the CPU, typically there are no pending
> > commands, and the EU config remains unchanged. In case there are more
> > pending commands, we reprogram the context with a higher number of EUs.
> > Please note, here we are changing EUs even while the context is running,
> > by examining pending commands every 'x' milliseconds.
> >
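For readers following the thread, the threshold comparison described above can be sketched roughly as below. This is only an illustration, not the code from the series: the names load_type, eu_table, classify_load and pick_eu_count, the two threshold values, and the three-level split are all assumptions made for this example. The cover letter says the real thresholds and EU categories are empirical and platform-specific and does not list them.

```c
#include <stdint.h>

/* Hypothetical load categories; the cover letter does not name them. */
enum load_type { LOAD_LOW, LOAD_MEDIUM, LOAD_HIGH, LOAD_TYPE_COUNT };

/* Hypothetical thresholds on the number of pending requests.
 * The real values are described as empirical and are not given. */
#define PENDING_THRESHOLD_MEDIUM 3u
#define PENDING_THRESHOLD_HIGH   8u

/* Hypothetical per-platform table: one EU count per load category. */
struct eu_table {
	uint8_t eu_count[LOAD_TYPE_COUNT];
};

/* Map a pending-request count to a load category. */
static enum load_type classify_load(unsigned int pending)
{
	if (pending >= PENDING_THRESHOLD_HIGH)
		return LOAD_HIGH;
	if (pending >= PENDING_THRESHOLD_MEDIUM)
		return LOAD_MEDIUM;
	return LOAD_LOW;
}

/*
 * Intended to be called once per active context from the periodic
 * timer callback. With nothing pending, the GPU has caught up and
 * the current EU count is kept; otherwise the count for the
 * matching load category is returned.
 */
static uint8_t pick_eu_count(const struct eu_table *table,
			     unsigned int pending, uint8_t current_eu)
{
	if (pending == 0)
		return current_eu;
	return table->eu_count[classify_load(pending)];
}
```

In a driver, the returned value would then be written into the context state on its next update; that step is hardware-specific and is left out of this sketch.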
> > With this solution in place, on KBL-GT3 + Android we saw the following
> > PnP benefits; the power numbers mentioned here are system power.
> >
> > App / KPI               | % Power Benefit (mW) |
> > ------------------------|----------------------|
> > 3D Mark (Ice storm)     | 2.30%                |
> > TRex On screen          | 2.49%                |
> > TRex Off screen         | 1.32%                |
> > Manhattan On screen     | 3.11%                |
> > Manhattan Off screen    | 0.89%                |
> > AnTuTu 6.1.4            | 3.42%                |
>
> Were you able to find some benchmarks which regress? Maybe try Synmark2
> and more from gfxbench? Not all benchmarks there are equally important, and
> regressions on some are fine, but I think a fuller set would be interesting to see.
We have not seen much improvement in GFX Carchase, but there was no degradation in performance.
Regards, Ankit
>
> Regards,
>
> Tvrtko
>
> >
> > Note - For KBL (GEN9) we cannot control at the sub-slice level; this
> > was always a constraint. We always controlled the number of EUs rather
> > than sub-slices/slices.
> >
> > Praveen Diwakar (4):
> >   drm/i915: Get active pending request for given context
> >   drm/i915: Update render power clock state configuration for given
> >     context
> >   drm/i915: set optimum eu/slice/sub-slice configuration based on load
> >     type
> >   drm/i915: Predictive governor to control eu/slice/subslice
> >
> > drivers/gpu/drm/i915/i915_debugfs.c | 88 +++++++++++++++++++++++++++++-
> > drivers/gpu/drm/i915/i915_drv.c | 1 +
> > drivers/gpu/drm/i915/i915_drv.h | 10 ++++
> > drivers/gpu/drm/i915/i915_gem_context.c | 26 +++++++++
> > drivers/gpu/drm/i915/i915_gem_context.h | 45 +++++++++++++++
> > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 ++
> > drivers/gpu/drm/i915/intel_device_info.c | 44 ++++++++++++++-
> > drivers/gpu/drm/i915/intel_lrc.c | 20 ++++++-
> > 8 files changed, 235 insertions(+), 4 deletions(-)
> >