[Mesa-dev] [PATCH 04/18] i965: Introduce a context-local batch manager

Kenneth Graunke kenneth at whitecape.org
Tue Jul 7 22:03:09 PDT 2015

Hi Chris,

I made a genuine effort to review this patch, hoping to better understand
the various changes and what you were trying to accomplish.  I spent many
hours reading and trying to enumerate changes - or potential changes I
needed to look hard at to convince myself whether they were correct.

I came up with a frighteningly long list of changes:

* Relocation handling changes considerably (the original point of
  Kristian's endeavour which led up to this).

* Fencing, busy tracking, and sync objects are completely reworked.

* Render-to-texture cache flushing and dirty buffer tracking is
  completely reworked.

* Gen7 SOL buffer offset resetting now uses MI_LOAD_REGISTER_IMM rather
  than the execbuf2 parameter, requiring the command validator on Haswell.
  This effectively bumps the kernel requirement from v3.6 to v4.2-rc1,
  which will simply not fly with distributions at this time.

* glBufferSubData() now uses intel_upload_data() rather than allocating
  a temporary BO.  This is the first use of the upload buffer by the
  BLT engine, and could imply that the upload buffer's lifetime now
  extends across batches - longer than before.  Separable change that
  requires separate evaluation and justification.

* Per buffer cache-coherency checking rather than brw->has_llc?

* glBufferSubData()'s prefer_stall_to_blit flag appears to depend on
  per-buffer cache-coherency rather than being set globally.  Could
  impact performance of buffer uploads.

* Potential missing flushes (which can cause hangs or misrendering):

  - It looks like calling brw_bo_busy() with BUSY_FLUSH causes a flush
    when necessary.  However, some instances of the old bo_busy,
    bo_references, batch_flush pattern are replaced without that flag.
    One occurrance was in BufferSubData(); I did not spend time to
    check every case.

  - Flushes are often done implicitly by e.g. brw_bo_read calling
    brw_bo_map with the appropriate flags, and many explicit checks
    and flushes are removed.  Not bad, but needs careful review.

  - Gen6+ query object code might have dropped an implicit flush
    guaranteeing that when the GL application requests the result,
    any pending work will be kicked off so they can poll/spin
    repeatedly until the result arrives.

  - New code to avoid redundant flushes.

* perf_debug() warnings are removed all over the code for some reason:

  - Unsynchronized maps/BufferSubData not working on !LLC platforms?
    If they work now, that's a huge change!  If not, why drop the warning?

  - Warnings about stalls on mapping buffers and miptrees are gone now.
    These have been useful in tracking down performance problems.  They
    might not always be accurate, but surely removing them should be done
    separately with justification?

  - Warnings about stalls on query objects are gone.  I've used these when
    analyzing application performance.  Why?

  - Warnings about implicit flushes are gone.

* BO unmap calls appears to be missing in some places.  A few map calls
  have moved around in hard-to-follow ways.  Unclear how lifetimes of
  buffers and lifetimes of maps are affected.

* Possible mmap vs. pwrite preference changes?  Hard to follow.

* Texture upload (tiled_memcpy) changes, which is notoriously fragile
  and can lose all of the performance benefit if the compiler isn't able
  to optimize it just right.  Ideally separate.

* Assertions change to GL errors in brw_get_graphics_reset_status().

* Aperture space checking significantly reworked, especially for the BLT
  paths.  Honestly, a lot nicer, but couldn't this be separated?

* The bo_reuse driconf option is removed.

* Gen4-5 structure changes.

* brw_get_timestamp() - removes initialization of result to 0.
  Probably unnecessary and OK to delete; should be separate.

* New helper functions and coding patterns.  Separable.

* Noise (renaming, moving code between files, some other trivial changes
  like removing 'brw' variables and moving code into "else" blocks).

* ...I probably missed some things.

Based upon this, I cannot in good conscience consider merging this patch.
The potential for breakage is staggering.  As a proof-of-concept, you've
done an excellent job in proving we can do much better, and introduced a
lot of good ideas.  But there's a lot of work left to be done before we
can consider applying it to our production quality driver.

Please advise whether you would like to work towards making a mergeable,
incremental patch series, or if someone else should embark on that

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20150707/da218978/attachment.sig>

More information about the mesa-dev mailing list