[Mesa-dev] [RFC] Enable Resource Streamer on Haswell
abdiel.janulgue at linux.intel.com
Mon Jul 8 06:16:51 PDT 2013
The following RFC patchset initially enables the resource streamer on
Haswell.
We can think of the resource streamer as a command streamer accelerator:
it accelerates certain commands that would normally take time to build up
and submit to the GPU, reducing some of the overhead associated with
such commands. On Haswell, generating binding tables and constant buffers
can be offloaded from CPU-generated commands to the resource streamer.
This is a preparatory patchset that initially enables hardware-generated
binding tables, which are primarily required to enable RS-based
optimizations, e.g. constant buffer generation and other ways to reduce
command buffer submissions. This initial patch is closely modeled after
the way the i965 driver currently generates binding tables (see the section
below for a possible future optimization). Though it shaves a few
microseconds of CPU time off every command submission, I don't expect
it, in its current form, to produce large performance gains.
The changes improved GLB 2.5 by 0.19% (n=14).
In the hw-generated binding table case, the RS basically sits in front
of the CS watching for the [VS/PS]BINDING_TABLE_POINTERS commands. Once
the RS encounters one, it flushes the state of the on-die binding table
entries to a buffer object, where the CS picks it up afterwards. Each
surface state and its associated index in the on-die binding table state
can be edited directly instead of generating the entire binding table
array in one go.
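To make the edit-then-flush flow above concrete, here is a rough software
model of it. This is only a sketch: struct hw_bt_model, hw_bt_edit(),
hw_bt_flush() and BT_ENTRIES are names made up for illustration, not code
from this series, and the real hardware commands and table size are not
modeled here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BT_ENTRIES 256 /* assumed table size, for illustration only */

/* Hypothetical stand-in for the on-die binding table state. */
struct hw_bt_model {
    uint32_t on_die[BT_ENTRIES]; /* one surface state offset per index */
};

/* Models an edit: update a single entry in place, leaving the rest alone. */
static inline void hw_bt_edit(struct hw_bt_model *bt, unsigned index,
                              uint32_t surf_offset)
{
    assert(index < BT_ENTRIES);
    bt->on_die[index] = surf_offset;
}

/*
 * Models what happens when a BINDING_TABLE_POINTERS command is seen:
 * the first `count` on-die entries are written out to a pool buffer
 * that the command streamer would read afterwards.
 * Returns the number of bytes written.
 */
static inline size_t hw_bt_flush(const struct hw_bt_model *bt,
                                 uint32_t *pool, unsigned count)
{
    assert(count <= BT_ENTRIES);
    memcpy(pool, bt->on_die, count * sizeof(uint32_t));
    return count * sizeof(uint32_t);
}
```

In this model, changing one surface between two draws costs a single
hw_bt_edit() call followed by a flush, rather than rewriting every entry
of the table on the CPU.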
One optimization idea that we can possibly implement in the future is to
use the RS to publish deltas of changed surface states so that we
wouldn't have to rebuild entire binding tables for every batchbuffer
flush. Currently our VS/PS surface states are appended at the end of our
batchbuffer in the i965 driver. On every batchbuffer flush, the VS/PS
surface states and binding tables are rebuilt every time, for every change.
With the RS in mind, it would be possible to use a separate larger
batchbuffer for (permanent?) surface state objects so the generated
surface state offsets would change less often.
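A minimal sketch of what the delta idea could look like on the CPU side,
assuming surface state offsets stay stable between flushes (which is what
the separate surface state buffer would buy us). None of this is in the
series; struct bt_shadow, struct bt_edit, bt_collect_deltas() and
MAX_SURFACES are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SURFACES 64 /* assumed limit, for illustration only */

/* CPU-side shadow of what was last published to the binding table. */
struct bt_shadow {
    uint32_t offset[MAX_SURFACES];
    uint8_t  valid[MAX_SURFACES];
};

/* One pending edit: binding table index plus new surface state offset. */
struct bt_edit {
    unsigned index;
    uint32_t offset;
};

/*
 * Compare the wanted surface state offsets against the shadow copy and
 * record an edit only for entries that differ or were never published.
 * The shadow is updated as we go. Returns the number of edits written
 * to `edits`, which must have room for `count` entries.
 */
static inline unsigned bt_collect_deltas(struct bt_shadow *shadow,
                                         const uint32_t *wanted,
                                         unsigned count,
                                         struct bt_edit *edits)
{
    unsigned n = 0;

    assert(count <= MAX_SURFACES);
    for (unsigned i = 0; i < count; i++) {
        if (shadow->valid[i] && shadow->offset[i] == wanted[i])
            continue; /* unchanged since the last flush, nothing to emit */
        shadow->offset[i] = wanted[i];
        shadow->valid[i] = 1;
        edits[n].index = i;
        edits[n].offset = wanted[i];
        n++;
    }
    return n;
}
```

With surface states appended to each batchbuffer as they are today, the
offsets differ on every flush and this would emit an edit for every entry,
so it only pays off once the offsets are long-lived.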
With this series, GLB works fine and most piglit tests pass but some
random GPU lockups may occur when piglit is run over a period of time.
intel_error_decode does not pinpoint where in the batch the problem
lies. I'll spend some time nailing down this issue in the
On the intel-gfx list, I'll post the libdrm and kernel portions that enable
the RS bits on MI_BATCH_BUFFER_START.
 Needs changes in libdrm aperture checks to accommodate multiple levels of relocation
Abdiel Janulgue (12):
intel: Add resource streamer control defines
intel: On Haswell hardware, enable the resource streamer on batchbuffer start
i965: Temporarily disable resource streamer when state base address is updated.
i965: Add MI_RS_STORE_DATA_IMM workaround for 3DPRIMITIVE commands
i965: Switch on hardware-generated binding tables.
i965: Implement opcodes for the hw-generated binding table EDIT commands
i965: Use hw-bt for pull constants and VS UBO surface states.
i965: Use hw-bt for renderbuffer, constant, and texture surface states.
i965: Flush on-chip binding table to pool
i965: Use hw-bt for generated WM UBO surface states.
i965/blorp: In blorp, update PS on-chip binding table when new surface state entries are generated
i965/blorp: Add temporary work-around due to b607d57630daa7d92a84c41abfd45cacbe63f3d2
src/mesa/drivers/dri/i965/brw_context.c | 2 ++
src/mesa/drivers/dri/i965/brw_context.h | 1 +
src/mesa/drivers/dri/i965/brw_defines.h | 9 ++++++
src/mesa/drivers/dri/i965/brw_draw.c | 14 +++++++++
src/mesa/drivers/dri/i965/brw_misc_state.c | 7 +++++
src/mesa/drivers/dri/i965/brw_state.h | 13 +++++++++
src/mesa/drivers/dri/i965/brw_state_upload.c | 3 ++
src/mesa/drivers/dri/i965/brw_vs_surface_state.c | 14 +++++++++
src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 9 ++++++
src/mesa/drivers/dri/i965/gen6_blorp.cpp | 27 ++++++++++++++++-
src/mesa/drivers/dri/i965/gen7_blorp.cpp | 3 +-
src/mesa/drivers/dri/i965/gen7_misc_state.c | 109 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
src/mesa/drivers/dri/i965/gen7_vs_state.c | 10 ++++---
src/mesa/drivers/dri/i965/gen7_wm_state.c | 10 ++++---
src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 36 +++++++++++++++++++----
src/mesa/drivers/dri/i965/intel_batchbuffer.c | 3 ++
src/mesa/drivers/dri/i965/intel_reg.h | 4 +++
17 files changed, 259 insertions(+), 15 deletions(-)