[Intel-gfx] [RFC 0/5] Class/instance based execbuf plus more
Tvrtko Ursulin
tursulin at ursulin.net
Mon Nov 13 13:09:04 UTC 2017
From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
Now that the engine class concept is in, it is time to re-send the old proposal
of using it for engine selection in execbuf.
The idea is primarily to fix the situation with the current VCS engine selection
ABI by introducing a new, cleaner method of selecting the VCS engine.
Then there are two newly proposed pieces of uAPI, engine capabilities and
concurrent contexts, which for instance enable the VA-API driver to let i915
balance its batch buffers dynamically.
This enables better utilization of resources on GT3/GT4 parts where:
a) a single stream can now use both engines
b) it opens the door to extending the i915 scheduler with more advanced
load balancing approaches to better support the multiple-streams use cases.
For instance decoding a single H.264 stream on a GT4 part is now improved from
57 seconds to 40 seconds, with minimal VA-API code base changes:
root@sc:~/ffmpeg# VA_INTEL_CONCURRENT=0 perf stat -a -e i915/vcs0-busy/,i915/vcs1-busy/ ffmpeg -loglevel panic -hwaccel vaapi -hwaccel_output_format vaapi -i ~/bbb_sunflower_1080p_60fps_normal.mp4 -f null -

 Performance counter stats for 'system wide':

    57,568,097,358 ns   i915/vcs0-busy/
                 0 ns   i915/vcs1-busy/

      57.585753514 seconds time elapsed

root@sc:~/ffmpeg# VA_INTEL_CONCURRENT=1 perf stat -a -e i915/vcs0-busy/,i915/vcs1-busy/ ffmpeg -loglevel panic -hwaccel vaapi -hwaccel_output_format vaapi -i ~/bbb_sunflower_1080p_60fps_normal.mp4 -f null -

 Performance counter stats for 'system wide':

    29,152,427,164 ns   i915/vcs0-busy/
    29,115,272,714 ns   i915/vcs1-busy/

      40.733992298 seconds time elapsed
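
To make the two perf stat runs easier to compare, the following is a small sketch that derives the wall-clock speedup and the per-engine busy fraction from the figures quoted above. The numbers are copied verbatim from the output; the helper name `busy_fraction` and the dictionary layout are illustrative only and are not part of the series.

```python
# Busy times (ns) and elapsed times (s) copied from the two perf stat runs.
runs = {
    "VA_INTEL_CONCURRENT=0": {
        "vcs0_ns": 57_568_097_358,
        "vcs1_ns": 0,
        "elapsed_s": 57.585753514,
    },
    "VA_INTEL_CONCURRENT=1": {
        "vcs0_ns": 29_152_427_164,
        "vcs1_ns": 29_115_272_714,
        "elapsed_s": 40.733992298,
    },
}


def busy_fraction(run):
    """Return the fraction of wall-clock time each VCS engine was busy."""
    return {
        engine: run[f"{engine}_ns"] / 1e9 / run["elapsed_s"]
        for engine in ("vcs0", "vcs1")
    }


baseline = runs["VA_INTEL_CONCURRENT=0"]
balanced = runs["VA_INTEL_CONCURRENT=1"]

# Wall-clock speedup of the balanced run over the single-engine run.
speedup = baseline["elapsed_s"] / balanced["elapsed_s"]

# Total engine time consumed, summed over both engines, in seconds.
total_busy_s = {
    label: (run["vcs0_ns"] + run["vcs1_ns"]) / 1e9 for label, run in runs.items()
}

if __name__ == "__main__":
    print(f"speedup: {speedup:.2f}x")
    for label, run in runs.items():
        fractions = busy_fraction(run)
        print(
            f"{label}: vcs0 {fractions['vcs0']:.1%}, vcs1 {fractions['vcs1']:.1%}, "
            f"total busy {total_busy_s[label]:.2f} s"
        )
```

Read this way, the total engine time is nearly the same in both runs; the gain comes from the work being split almost evenly across vcs0 and vcs1. Neither engine is fully busy in the balanced run, so the elapsed time does not halve.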
I will be sending the proof-of-concept patches for intel-vaapi-driver
separately.
Tvrtko Ursulin (5):
drm/i915: Select engines via class and instance in execbuffer2
drm/i915: Engine capabilities uAPI
drm/i915: Concurrent context uAPI
drm/i915: Re-arrange execbuf so context is known before engine
drm/i915: Per batch buffer VCS balancing
drivers/gpu/drm/i915/i915_drv.h | 7 +-
drivers/gpu/drm/i915/i915_gem.c | 2 +-
drivers/gpu/drm/i915/i915_gem_context.c | 14 +++
drivers/gpu/drm/i915/i915_gem_context.h | 20 +++++
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 134 ++++++++++++++++++++++-------
drivers/gpu/drm/i915/intel_engine_cs.c | 3 +
drivers/gpu/drm/i915/intel_ringbuffer.h | 2 +
include/uapi/drm/i915_drm.h | 34 +++++++-
8 files changed, 180 insertions(+), 36 deletions(-)
--
2.14.1