[Mesa-dev] [PATCH 00/10] gallium: resequencer layer

Rob Clark robdclark at gmail.com
Tue Jun 14 15:57:52 UTC 2016


From: Rob Clark <robclark at freedesktop.org>

So, I know there were a couple concerns voiced over the idea of
re-ordering rendering in a gallium shim pipe driver layer.  For
me, the main concern was whether the overhead of an extra layer,
queueing and replaying state updates, draws, etc, would be
prohibitive.  So I implemented it enough that I could do some
benchmarking ;-)
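
To make the "queueing and replaying" idea concrete, here is a minimal
sketch of what such a layer does in the abstract.  This is not the code
from the series; struct batch, struct backend, batch_record() and
batch_replay() are hypothetical names invented for illustration, and an
int payload stands in for a copied state struct or draw info.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command kinds; the real layer would have one per pipe_context entry point. */
enum cmd_type { CMD_SET_STATE, CMD_DRAW };

struct cmd {
    enum cmd_type type;
    int payload;            /* stand-in for a copied state struct or draw info */
};

#define BATCH_MAX_CMDS 64

/* A batch is just an ordered log of recorded commands. */
struct batch {
    struct cmd cmds[BATCH_MAX_CMDS];
    size_t num_cmds;
};

/* Stand-in for the real driver context that commands are replayed into. */
struct backend {
    int current_state;
    int draws_seen;
    int last_draw_state;    /* state that was bound when the last draw ran */
};

/* Record a command instead of executing it immediately. */
static void batch_record(struct batch *b, enum cmd_type type, int payload)
{
    assert(b->num_cmds < BATCH_MAX_CMDS);
    b->cmds[b->num_cmds].type = type;
    b->cmds[b->num_cmds].payload = payload;
    b->num_cmds++;
}

/* Replay the recorded commands, in order, into the backend and reset the batch. */
static void batch_replay(struct batch *b, struct backend *be)
{
    for (size_t i = 0; i < b->num_cmds; i++) {
        const struct cmd *c = &b->cmds[i];
        switch (c->type) {
        case CMD_SET_STATE:
            be->current_state = c->payload;
            break;
        case CMD_DRAW:
            be->draws_seen++;
            be->last_draw_state = be->current_state;
            break;
        }
    }
    b->num_cmds = 0;
}
```

The extra cost of the layer is the copy in batch_record() plus the loop
in batch_replay().  Re-ordering would amount to choosing which batch to
replay first, which this sketch does not attempt.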

The first 9 patches are just some general API cleanups, which I
found to be convenient (since the resequencer layer is generating
most of the state handling with python + mako, so the cleanups to
improve consistency help minimize the state which required special
handling).  But regardless of the outcome of the resequencer
layer, I think these patches make sense on their own.

(Note: auto-generating some of the other wrapper layers might be
an interesting future cleanup..  at least it should be trivial
for noop ;-))

As for overhead, I've been benchmarking (mostly glmark2 + stk +
gfxbench), and in the current state (without actually having the
dependency tracking implemented) it doesn't seem to cause more
than a couple percent overhead.  From here on out, the remaining
overhead added to implement the dependency tracking and
re-ordering would be the same as the additional overhead required
to implement it in the driver backend.

And a couple percent overhead is small compared to the expected
gains for games which benefit.. ie. 8MiB for 1080p rgb frame,
avoiding copying that from tile to memory and back once or twice
quickly dwarfs an extra copy of some 10's of kb of state.. and
even more so for (for ex.) f32f32f32f32 intermediate buffers.
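
The arithmetic behind those numbers can be checked with a small helper.
This is only a sketch: it assumes the rgb frame is stored at 4 bytes per
pixel with no padding or alignment, and frame_bytes() and
extra_traffic_bytes() are names made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in one full-frame buffer, assuming tightly packed pixels (no padding). */
static uint64_t frame_bytes(uint32_t width, uint32_t height, uint32_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return (uint64_t)width * height * bytes_per_pixel;
}

/* Memory traffic for `round_trips` avoidable tile->memory->tile passes:
 * each round trip writes the frame out once and reads it back once. */
static uint64_t extra_traffic_bytes(uint64_t frame, uint32_t round_trips)
{
    return 2 * frame * round_trips;
}
```

With these assumptions, frame_bytes(1920, 1080, 4) is 8294400 bytes
(about 7.9 MiB), and the 16 byte per pixel f32f32f32f32 case is 33177600
bytes (about 31.6 MiB).  One avoided round trip of the 4 byte per pixel
frame is 16588800 bytes of traffic, against some 10's of kb of copied
state.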

Queries are still missing, but I expect what would be required
to implement them is the same as the logic that would be needed in
the driver backend otherwise.

Basically, the only concern I have, compared to the approach of
implementing the dependency tracking in each driver backend, is
pipe_constant_buffer::user_buffer.  Currently both freedreno and
vc4 want non-UBO constant buffers to be emitted in cmdstream.
In the adreno case, it looks like a3xx/a4xx should also support
the non-user_buffer case, although in fact this appears to be
broken (at least on a4xx) and I've never seen blob driver use
this.  At the moment I'm doing a hack in freedreno to map the
backing fd_bo and then memcpy it into cmdstream.  Which is a
bit silly (since it is a write-combine buffer I'm copying from).
But in glmark I had trouble even measuring the overhead of this
extra copy.  Although possibly I need to find something to
measure which emits more non-UBO constant state.
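
For illustration, the map-and-memcpy fallback described above could look
roughly like the following.  This is a sketch and not the freedreno code:
struct fake_bo, struct const_buf, struct cmdstream and emit_constants()
are stand-ins invented here.  The const_buf fields mirror the names in
pipe_constant_buffer, and applying buffer_offset to the user_buffer
pointer as well is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a buffer object that has already been mapped for CPU access. */
struct fake_bo {
    uint8_t *cpu_map;
};

/* Stand-in mirroring the field names of pipe_constant_buffer. */
struct const_buf {
    struct fake_bo *buffer;     /* backing buffer, used when user_buffer is NULL */
    unsigned buffer_offset;
    unsigned buffer_size;
    const void *user_buffer;    /* CPU pointer supplied directly by the caller */
};

/* Stand-in for the command stream that constants get copied into. */
struct cmdstream {
    uint8_t data[256];
    unsigned used;
};

/* Copy constants inline into the cmdstream, from user_buffer when present,
 * otherwise from the CPU mapping of the backing buffer. */
static void emit_constants(struct cmdstream *cs, const struct const_buf *cb)
{
    const uint8_t *src;

    if (cb->user_buffer)
        src = (const uint8_t *)cb->user_buffer + cb->buffer_offset;
    else
        src = cb->buffer->cpu_map + cb->buffer_offset;

    assert(cs->used + cb->buffer_size <= sizeof(cs->data));
    memcpy(cs->data + cs->used, src, cb->buffer_size);
    cs->used += cb->buffer_size;
}
```

The second branch is the "silly" one: when the mapping is write-combined,
the memcpy reads from uncached memory, which is why the cost is worth
measuring with a workload that emits more non-UBO constant state.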

btw, if someone has some requests for benchmarks to try (provided
they are available for arm/linux) I'd be happy to try some other
things.

The plus side of doing this in a separate layer is that we only
implement the dependency tracking and resource shadowing once,
instead of both in vc4 and freedreno (and who knows, maybe
someday someone gets around to writing a lima gallium driver).
Plus, I envision this to be something that mesa/st wraps the
pipe_screen with if driconf tells it to, and pscreen->rsq_funcs
is populated (we at least need a callback to know if a resource
is still busy).  This way we can turn it on for games/apps that
are known to benefit, and leave it off with zero additional
overhead for better written things (or rather, things written
with tilers in mind).
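
A rough sketch of that opt-in decision is below.  Only the rsq_funcs name
and the need for a "still busy" callback come from the description above.
struct screen is a simplified stand-in rather than the real pipe_screen,
and the resource_busy member, never_busy() and maybe_wrap_screen() are
hypothetical names used for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct resource;    /* opaque stand-in for a driver resource */

/* Hypothetical callback table a driver fills in to opt in to the layer. */
struct rsq_funcs {
    bool (*resource_busy)(struct resource *rsc);
};

/* Simplified stand-in for a screen object. */
struct screen {
    const struct rsq_funcs *rsq_funcs;  /* NULL if the driver does not opt in */
    struct screen *wrapped;             /* set only on the wrapper screen */
};

/* Example callback for a driver whose resources are never busy. */
static bool never_busy(struct resource *rsc)
{
    (void)rsc;
    return false;
}

/* Return the wrapper only when the user enabled it and the driver supports it;
 * otherwise hand back the driver's screen untouched, so the off case adds
 * no extra layer at all. */
static struct screen *maybe_wrap_screen(struct screen *pscreen,
                                        struct screen *wrapper,
                                        bool driconf_enabled)
{
    assert(pscreen != NULL);

    if (!driconf_enabled || !pscreen->rsq_funcs)
        return pscreen;

    wrapper->rsq_funcs = NULL;
    wrapper->wrapped = pscreen;
    return wrapper;
}
```

Because the check happens once at screen creation, apps that leave the
option off keep calling straight into the driver.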


Rob Clark (10):
  gallium: cleanup set_tess_state
  gallium: make shader_buffers const
  gallium: make constant_buffer const
  gallium: make image_view const
  gallium: change end_query() to return boolean
  gallium/util: add util_copy_index_buffer() helper
  gallium/util: add util_copy_shader_buffer() helper
  gallium/util: add util_copy_vertex_buffer helper
  gallium/util: make util_copy_framebuffer_state(src=NULL) work
  RFC: gallium: add resequencer driver (INCOMPLETE)

 configure.ac                                       |   1 +
 src/gallium/auxiliary/util/u_framebuffer.c         |  37 +-
 src/gallium/auxiliary/util/u_helpers.c             |  15 -
 src/gallium/auxiliary/util/u_helpers.h             |   3 -
 src/gallium/auxiliary/util/u_inlines.h             |  49 ++
 src/gallium/drivers/ddebug/dd_context.c            |  15 +-
 src/gallium/drivers/freedreno/freedreno_query.c    |   2 +-
 src/gallium/drivers/freedreno/freedreno_state.c    |  13 +-
 src/gallium/drivers/i915/i915_query.c              |   2 +-
 src/gallium/drivers/i915/i915_state.c              |   8 +-
 src/gallium/drivers/ilo/ilo_query.c                |   2 +-
 src/gallium/drivers/ilo/ilo_state.c                |  14 +-
 src/gallium/drivers/llvmpipe/lp_query.c            |   2 +-
 src/gallium/drivers/llvmpipe/lp_state_fs.c         |   2 +-
 src/gallium/drivers/llvmpipe/lp_state_vertex.c     |   6 +-
 src/gallium/drivers/noop/noop_pipe.c               |   2 +-
 src/gallium/drivers/noop/noop_state.c              |   2 +-
 src/gallium/drivers/nouveau/nv30/nv30_query.c      |   2 +-
 src/gallium/drivers/nouveau/nv30/nv30_state.c      |  13 +-
 src/gallium/drivers/nouveau/nv50/nv50_query.c      |   2 +-
 src/gallium/drivers/nouveau/nv50/nv50_state.c      |   2 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c      |  25 +-
 src/gallium/drivers/r300/r300_query.c              |   4 +-
 src/gallium/drivers/r300/r300_state.c              |  10 +-
 src/gallium/drivers/r600/evergreen_state.c         |   7 +-
 src/gallium/drivers/r600/r600_state_common.c       |   7 +-
 src/gallium/drivers/radeon/r600_query.c            |   2 +-
 src/gallium/drivers/radeonsi/si_descriptors.c      |  16 +-
 src/gallium/drivers/radeonsi/si_state.c            |  13 +-
 src/gallium/drivers/radeonsi/si_state.h            |   3 +-
 src/gallium/drivers/rbug/rbug_context.c            |   6 +-
 src/gallium/drivers/resequencer/.gitignore         |   2 +
 src/gallium/drivers/resequencer/Makefile.am        |  44 ++
 src/gallium/drivers/resequencer/Makefile.sources   |  23 +
 src/gallium/drivers/resequencer/rsq_batch.c        | 144 +++++
 src/gallium/drivers/resequencer/rsq_batch.h        |  71 +++
 src/gallium/drivers/resequencer/rsq_context.c      | 457 ++++++++++++++++
 src/gallium/drivers/resequencer/rsq_context.h      |  84 +++
 src/gallium/drivers/resequencer/rsq_draw.c         | 230 ++++++++
 src/gallium/drivers/resequencer/rsq_draw.h         |  40 ++
 src/gallium/drivers/resequencer/rsq_fence.c        |  48 ++
 src/gallium/drivers/resequencer/rsq_fence.h        |  43 ++
 src/gallium/drivers/resequencer/rsq_public.h       |  68 +++
 src/gallium/drivers/resequencer/rsq_query.c        | 148 +++++
 src/gallium/drivers/resequencer/rsq_query.h        |  32 ++
 src/gallium/drivers/resequencer/rsq_resource.c     | 222 ++++++++
 src/gallium/drivers/resequencer/rsq_resource.h     |  60 ++
 src/gallium/drivers/resequencer/rsq_screen.c       | 186 +++++++
 src/gallium/drivers/resequencer/rsq_screen.h       |  50 ++
 src/gallium/drivers/resequencer/rsq_state.py       | 607 +++++++++++++++++++++
 .../drivers/resequencer/rsq_state_helpers.h        | 219 ++++++++
 src/gallium/drivers/resequencer/rsq_surface.c      | 107 ++++
 src/gallium/drivers/resequencer/rsq_surface.h      |  72 +++
 src/gallium/drivers/softpipe/sp_query.c            |   2 +-
 src/gallium/drivers/softpipe/sp_state_image.c      |  10 +-
 src/gallium/drivers/softpipe/sp_state_shader.c     |   2 +-
 src/gallium/drivers/softpipe/sp_state_vertex.c     |   6 +-
 src/gallium/drivers/svga/svga_pipe_constants.c     |   2 +-
 src/gallium/drivers/svga/svga_pipe_query.c         |   2 +-
 src/gallium/drivers/svga/svga_pipe_vertex.c        |   2 +-
 src/gallium/drivers/swr/swr_query.cpp              |   2 +-
 src/gallium/drivers/swr/swr_state.cpp              |   9 +-
 src/gallium/drivers/trace/tr_context.c             |  15 +-
 src/gallium/drivers/vc4/vc4_query.c                |   2 +-
 src/gallium/drivers/vc4/vc4_state.c                |  13 +-
 src/gallium/drivers/virgl/virgl_context.c          |  10 +-
 src/gallium/drivers/virgl/virgl_query.c            |   4 +-
 src/gallium/include/pipe/p_context.h               |  12 +-
 src/gallium/include/pipe/p_state.h                 |   8 +
 src/mesa/state_tracker/st_atom_tess.c              |  13 +-
 70 files changed, 3148 insertions(+), 210 deletions(-)
 create mode 100644 src/gallium/drivers/resequencer/.gitignore
 create mode 100644 src/gallium/drivers/resequencer/Makefile.am
 create mode 100644 src/gallium/drivers/resequencer/Makefile.sources
 create mode 100644 src/gallium/drivers/resequencer/rsq_batch.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_batch.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_context.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_context.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_draw.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_draw.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_fence.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_fence.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_public.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_query.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_query.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_resource.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_resource.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_screen.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_screen.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_state.py
 create mode 100644 src/gallium/drivers/resequencer/rsq_state_helpers.h
 create mode 100644 src/gallium/drivers/resequencer/rsq_surface.c
 create mode 100644 src/gallium/drivers/resequencer/rsq_surface.h

-- 
2.5.5


