[Mesa-dev] i965: On-demand render target flushing

Pohjolainen, Topi topi.pohjolainen at gmail.com
Wed Mar 29 05:53:33 UTC 2017


Jason, Curro, do you have any opinion if this is worth pursuing?
I need something for blorp blits at least - using blorp for texture
uploads on top of current excessive flushing regresses perf.

When working on gpu hangs on SKL we also identified compute
flushing caches it shouldn't. I think those could be addressed
nicely here as well. I can respin if this is something we'd like
to have.

On Fri, Feb 17, 2017 at 09:32:03PM +0200, Topi Pohjolainen wrote:
> Currently:
> 
> 1) Blorp color clears and resolves emit unconditional render target
>    flush + command stream after every clear/resolve (including
>    regular non-fast clears).
> 
> 2) Blorp color clears, resolves and blits emit texture and constant
>    cache resolves even in case only destination is dirty. This is
>    because brw_render_cache_set_check_flush() does both render target
>    flush as well as the top-of-pipe read cache flushes.
> 
> 3) Similarly to item 2, 3D and compute paths also flush texture and
>    constant caches even if none of the texture surfaces are dirty.
> 
> 4) In case of multiple surfaces needing resolves, all render paths
>    (blorp, 3D and compute) emit render target, texture and constant
>    cache flushes after each resolve instead of just once after all
>    resolves.
> 
> This series addresses all four cases. Good news are that even though
> the current setup isn't optimal, it doesn't actually get any better in
> most cases performance wise. There is modest gain in OglDrvRes which
> does heavy blorp blitting. I'm expecting this series also to make
> blorp tex uploads and blorp mipmap generation more competitive.
> 
> Bad news are in the final patch - it looks that current unconditional
> flushing/stalling has been hiding bugs elsewhere. There are cases
> which rely on the flushes after non-fast clears. Hunting the real
> cause is, however, difficult. I only saw them in CI system within
> full runs and was not able to reproduce them myself.
> 
> As first steps the series introduces end-of-pipe synchronization.
> This is a flush combined with stall and post-sync operation of
> writing a double word (32 bits). Until now this wasn't really
> needed as there was in many cases double flushing which in turn
> looks to take long enough to hide the need for the sync. I also
> noticed that one needs to be rather careful with it - performance
> gets decreased noticeably when used unneeded.
> 
> I don't really know if we want to go this way myself even. Current
> logic - while not ideal - is rather simple.
> 
> Topi Pohjolainen (16):
>   i965/miptree: Tell if anything got resolved
>   i965/gen6+: Implement end-of-pipe sync
>   i965: Hook end-of-pipe-sync after texture resolves
>   i965: Hook end-of-pipe-sync after image resolves
>   i965: Hook end-of-pipe-sync after framebuffer resolves
>   i965: Consider layered rt resolves along with other
>   i965: Add color resolve end-of-pipe-sync before switch to blit ring
>   i965/dri2: Add end-of-pipe-sync after color resolves
>   i965/miptree: Add color resolve end-of-pipe-sync before sharing
>   i965: Add end-of-pipe sync before non-gpu read of color resolves
>   i965/blorp: Do more fine grained flushing/syncing
>   i965/blorp/blit: Refactor hiz/ccs prep for blits
>   i965/blorp: Use conditional end-of-pipe-sync
>   i965: Consider surface resolves and sync after blorp ops
>   i965: Check if fast color clear state transition needs sync
>   i965/blorp: Drop unnecessary flushes after clear/resolve
> 
>  src/mesa/drivers/dri/i965/brw_blorp.c          | 187 ++++++++++----
>  src/mesa/drivers/dri/i965/brw_compute.c        |   2 +
>  src/mesa/drivers/dri/i965/brw_context.c        | 333 +++++++++++++++++++------
>  src/mesa/drivers/dri/i965/brw_context.h        |   3 +
>  src/mesa/drivers/dri/i965/brw_draw.c           |  36 +--
>  src/mesa/drivers/dri/i965/brw_pipe_control.c   |  91 +++++++
>  src/mesa/drivers/dri/i965/genX_blorp_exec.c    |  11 -
>  src/mesa/drivers/dri/i965/intel_blit.c         |  16 +-
>  src/mesa/drivers/dri/i965/intel_copy_image.c   |  10 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c  |  25 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h  |   2 +-
>  src/mesa/drivers/dri/i965/intel_pixel.c        |   4 +
>  src/mesa/drivers/dri/i965/intel_pixel_bitmap.c |   5 +-
>  src/mesa/drivers/dri/i965/intel_pixel_read.c   |   7 +-
>  src/mesa/drivers/dri/i965/intel_tex_image.c    |  10 +-
>  src/mesa/drivers/dri/i965/intel_tex_subimage.c |  11 +-
>  16 files changed, 557 insertions(+), 196 deletions(-)
> 
> -- 
> 2.5.5
> 


More information about the mesa-dev mailing list