[Mesa-dev] [PATCH 04/18] i965: Introduce a context-local batch manager
kenneth at whitecape.org
Tue Jul 7 22:03:09 PDT 2015
I made a genuine effort to review this patch, hoping to better understand
the various changes and what you were trying to accomplish. I spent many
hours reading and trying to enumerate changes - or potential changes I
needed to look hard at to convince myself whether they were correct.
I came up with a frighteningly long list of changes:
* Relocation handling changes considerably (the original point of
Kristian's endeavour which led up to this).
* Fencing, busy tracking, and sync objects are completely reworked.
* Render-to-texture cache flushing and dirty buffer tracking is
* Gen7 SOL buffer offset resetting now uses MI_LOAD_REGISTER_IMM rather
than the execbuf2 parameter, requiring the command validator on Haswell.
This effectively bumps the kernel requirement from v3.6 to v4.2-rc1,
which will simply not fly with distributions at this time.
* glBufferSubData() now uses intel_upload_data() rather than allocating
a temporary BO. This is the first use of the upload buffer by the
BLT engine, and could imply that the upload buffer's lifetime now
extends across batches - longer than before. Separable change that
requires separate evaluation and justification.
* Per buffer cache-coherency checking rather than brw->has_llc?
* glBufferSubData()'s prefer_stall_to_blit flag appears to depend on
per-buffer cache-coherency rather than being set globally. Could
impact performance of buffer uploads.
* Potential missing flushes (which can cause hangs or misrendering):
- It looks like calling brw_bo_busy() with BUSY_FLUSH causes a flush
when necessary. However, some instances of the old bo_busy,
bo_references, batch_flush pattern are replaced without that flag.
One occurrance was in BufferSubData(); I did not spend time to
check every case.
- Flushes are often done implicitly by e.g. brw_bo_read calling
brw_bo_map with the appropriate flags, and many explicit checks
and flushes are removed. Not bad, but needs careful review.
- Gen6+ query object code might have dropped an implicit flush
guaranteeing that when the GL application requests the result,
any pending work will be kicked off so they can poll/spin
repeatedly until the result arrives.
- New code to avoid redundant flushes.
* perf_debug() warnings are removed all over the code for some reason:
- Unsynchronized maps/BufferSubData not working on !LLC platforms?
If they work now, that's a huge change! If not, why drop the warning?
- Warnings about stalls on mapping buffers and miptrees are gone now.
These have been useful in tracking down performance problems. They
might not always be accurate, but surely removing them should be done
separately with justification?
- Warnings about stalls on query objects are gone. I've used these when
analyzing application performance. Why?
- Warnings about implicit flushes are gone.
* BO unmap calls appears to be missing in some places. A few map calls
have moved around in hard-to-follow ways. Unclear how lifetimes of
buffers and lifetimes of maps are affected.
* Possible mmap vs. pwrite preference changes? Hard to follow.
* Texture upload (tiled_memcpy) changes, which is notoriously fragile
and can lose all of the performance benefit if the compiler isn't able
to optimize it just right. Ideally separate.
* Assertions change to GL errors in brw_get_graphics_reset_status().
* Aperture space checking significantly reworked, especially for the BLT
paths. Honestly, a lot nicer, but couldn't this be separated?
* The bo_reuse driconf option is removed.
* Gen4-5 structure changes.
* brw_get_timestamp() - removes initialization of result to 0.
Probably unnecessary and OK to delete; should be separate.
* New helper functions and coding patterns. Separable.
* Noise (renaming, moving code between files, some other trivial changes
like removing 'brw' variables and moving code into "else" blocks).
* ...I probably missed some things.
Based upon this, I cannot in good conscience consider merging this patch.
The potential for breakage is staggering. As a proof-of-concept, you've
done an excellent job in proving we can do much better, and introduced a
lot of good ideas. But there's a lot of work left to be done before we
can consider applying it to our production quality driver.
Please advise whether you would like to work towards making a mergeable,
incremental patch series, or if someone else should embark on that
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 819 bytes
Desc: This is a digitally signed message part.
More information about the mesa-dev