[Intel-gfx] [PATCH 00/59] Remove the outstanding_lazy_request
John Harrison
John.C.Harrison at Intel.com
Fri May 29 04:00:09 PDT 2015
I am currently rebasing (yet again) onto a newer nightly tree and going
through the large number of merge conflicts. I have the anti-OLR and
fence patches rebased but am still in the middle of the scheduler ones.
I am hoping to post an updated patch set, for anti-OLR at least, later today.
On 28/05/2015 21:02, Jesse Barnes wrote:
> John, Tomas, where are we with this series? I believe this is a
> prerequisite for the request->fence conversion, and also the native sync
> work, both of which I need for some stuff I'm doing.
>
> Thanks,
> Jesse
>
> On 03/19/2015 05:30 AM, John.C.Harrison at Intel.com wrote:
>> From: John Harrison <John.C.Harrison at Intel.com>
>>
>> The driver tracks GPU work using request structures. Unfortunately, this
>> tracking is not currently explicit but is done by means of a catch-all request
>> that floats around in the background hoovering up work until it gets submitted.
>> This background request (ring->outstanding_lazy_request or OLR) is created at
>> the point of actually writing to the ring rather than when a particular piece of
>> GPU work is started. This scheme sort of hangs together but causes a number of
>> issues. It can mean that multiple pieces of independent work are lumped together
>> in the same request or that work is not officially submitted until much later
>> than it was created.
>>
>> This patch series completely removes the OLR and explicitly tracks each piece of
>> work with its own personal request structure from start to submission.
>>
>> The patch set seems to fix the "'gem_ringfill --r render' + ctrl-c straight
>> after boot" issue logged as BZ:88865. I haven't done any analysis of that
>> particular issue but the descriptions I've seen appear to blame an inconsistent
>> or mangled OLR.
>>
>> Note also that by the end of this series, a number of differences between the
>> legacy and execlist code paths have been removed. For example add_request() and
>> emit_request() now have the same signature and thus could be merged back to a single
>> function pointer. Merging some of these together would also allow the removal of
>> a bunch of 'if(execlists)' tests where the difference is simply to call the
>> legacy function or the execlist one.
>>
>> v2: Rebased to newer nightly tree, fixed up a few minor issues, added two extra
>> patches - one to move the LRC ring begin around in the vein of other recent
>> reshuffles, the other to clean up some issues with i915_add_request().
>>
>> v3: Large re-work due to feedback from code review. Some patches have been
>> removed, extra ones have been added and others have been changed significantly.
>> It is recommended that all patches are reviewed from scratch rather than
>> assuming only certain ones have changed and need re-inspecting. The exceptions
>> are where the 'reviewed-by' tag has been kept because that patch was not
>> significantly affected.
>>
>> [Patches against drm-intel-nightly tree fetched 18/03/2015]
>>
>> John Harrison (59):
>> drm/i915: Rename 'do_execbuf' to 'execbuf_submit'
>> drm/i915: Make intel_logical_ring_begin() static
>> drm/i915: Move common request allocation code into a common function
>> drm/i915: Fix for ringbuf space wait in LRC mode
>> drm/i915: Reserve ring buffer space for i915_add_request() commands
>> drm/i915: i915_add_request must not fail
>> drm/i915: Early alloc request in execbuff
>> drm/i915: Set context in request from creation even in legacy mode
>> drm/i915: Merged the many do_execbuf() parameters into a structure
>> drm/i915: Simplify i915_gem_execbuffer_retire_commands() parameters
>> drm/i915: Update alloc_request to return the allocated request
>> drm/i915: Add request to execbuf params and add explicit cleanup
>> drm/i915: Update the dispatch tracepoint to use params->request
>> drm/i915: Update move_to_gpu() to take a request structure
>> drm/i915: Update execbuffer_move_to_active() to take a request structure
>> drm/i915: Add flag to i915_add_request() to skip the cache flush
>> drm/i915: Update i915_gpu_idle() to manage its own request
>> drm/i915: Split i915_ppgtt_init_hw() in half - generic and per ring
>> drm/i915: Moved the for_each_ring loop outside of i915_gem_context_enable()
>> drm/i915: Don't tag kernel batches as user batches
>> drm/i915: Add explicit request management to i915_gem_init_hw()
>> drm/i915: Update ppgtt_init_ring() & context_enable() to take requests
>> drm/i915: Update i915_switch_context() to take a request structure
>> drm/i915: Update do_switch() to take a request structure
>> drm/i915: Update deferred context creation to do explicit request management
>> drm/i915: Update init_context() to take a request structure
>> drm/i915: Update render_state_init() to take a request structure
>> drm/i915: Update i915_gem_object_sync() to take a request structure
>> drm/i915: Update overlay code to do explicit request management
>> drm/i915: Update queue_flip() to take a request structure
>> drm/i915: Update add_request() to take a request structure
>> drm/i915: Update [vma|object]_move_to_active() to take request structures
>> drm/i915: Update l3_remap to take a request structure
>> drm/i915: Update mi_set_context() to take a request structure
>> drm/i915: Update a bunch of execbuffer helpers to take request structures
>> drm/i915: Update workarounds_emit() to take request structures
>> drm/i915: Update flush_all_caches() to take request structures
>> drm/i915: Update switch_mm() to take a request structure
>> drm/i915: Update ring->flush() to take a requests structure
>> drm/i915: Update some flush helpers to take request structures
>> drm/i915: Update ring->emit_flush() to take a request structure
>> drm/i915: Update ring->add_request() to take a request structure
>> drm/i915: Update ring->emit_request() to take a request structure
>> drm/i915: Update ring->dispatch_execbuffer() to take a request structure
>> drm/i915: Update ring->emit_bb_start() to take a request structure
>> drm/i915: Update ring->sync_to() to take a request structure
>> drm/i915: Update ring->signal() to take a request structure
>> drm/i915: Update cacheline_align() to take a request structure
>> drm/i915: Update intel_ring_begin() to take a request structure
>> drm/i915: Update intel_logical_ring_begin() to take a request structure
>> drm/i915: Add *_ring_begin() to request allocation
>> drm/i915: Remove the now obsolete intel_ring_get_request()
>> drm/i915: Remove the now obsolete 'outstanding_lazy_request'
>> drm/i915: Move the request/file and request/pid association to creation time
>> drm/i915: Remove fallback poll for ring buffer space
>> drm/i915: Remove 'faked' request from LRC submission
>> drm/i915: Update a bunch of LRC functions to take requests
>> drm/i915: Remove the now obsolete 'i915_gem_check_olr()'
>> drm/i915: Remove the almost obsolete i915_gem_object_flush_active()
>>
>> drivers/gpu/drm/i915/i915_drv.h              |  78 ++--
>> drivers/gpu/drm/i915/i915_gem.c              | 388 ++++++++++-------
>> drivers/gpu/drm/i915/i915_gem_context.c      |  76 ++--
>> drivers/gpu/drm/i915/i915_gem_execbuffer.c   | 126 ++++--
>> drivers/gpu/drm/i915/i915_gem_gtt.c          |  61 +--
>> drivers/gpu/drm/i915/i915_gem_gtt.h          |   3 +-
>> drivers/gpu/drm/i915/i915_gem_render_state.c |  15 +-
>> drivers/gpu/drm/i915/i915_gem_render_state.h |   2 +-
>> drivers/gpu/drm/i915/i915_trace.h            |  28 +-
>> drivers/gpu/drm/i915/intel_display.c         |  62 +--
>> drivers/gpu/drm/i915/intel_drv.h             |   3 +-
>> drivers/gpu/drm/i915/intel_fbdev.c           |   2 +-
>> drivers/gpu/drm/i915/intel_lrc.c             | 585 +++++++++++---------------
>> drivers/gpu/drm/i915/intel_lrc.h             |  17 +-
>> drivers/gpu/drm/i915/intel_overlay.c         |  64 ++-
>> drivers/gpu/drm/i915/intel_ringbuffer.c      | 374 ++++++++--------
>> drivers/gpu/drm/i915/intel_ringbuffer.h      |  54 ++-
>> 17 files changed, 1005 insertions(+), 933 deletions(-)
>>
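
The difference between the catch-all OLR and explicit per-work requests, as
described in the quoted cover letter, can be sketched with a toy model. This is
only an illustration in plain C and not driver code: every name in it
(toy_ring, toy_request, toy_emit_lazy(), toy_submit_lazy(), toy_emit(),
toy_submit()) is hypothetical and none of it is the i915 API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not the i915 structures. */
struct toy_request {
	int id;        /* assigned when the request is created */
	int num_cmds;  /* commands written under this request */
	int submitted; /* non-zero once the request has been submitted */
};

struct toy_ring {
	struct toy_request pool[8]; /* fixed storage, so no malloc is needed */
	int next_id;
	struct toy_request *lazy;   /* OLR-style catch-all, NULL if none pending */
};

/* Create a request up front, before any commands are written. */
static struct toy_request *toy_request_alloc(struct toy_ring *ring)
{
	struct toy_request *req;

	assert(ring->next_id < 8);
	req = &ring->pool[ring->next_id];
	req->id = ring->next_id++;
	req->num_cmds = 0;
	req->submitted = 0;
	return req;
}

/*
 * Implicit scheme: the request is created as a side effect of the first
 * ring write, and every later write lands in it until it is submitted.
 */
static void toy_emit_lazy(struct toy_ring *ring)
{
	if (ring->lazy == NULL)
		ring->lazy = toy_request_alloc(ring);
	ring->lazy->num_cmds++;
}

/* Submit whatever the catch-all request has accumulated so far. */
static struct toy_request *toy_submit_lazy(struct toy_ring *ring)
{
	struct toy_request *req = ring->lazy;

	if (req != NULL) {
		req->submitted = 1;
		ring->lazy = NULL;
	}
	return req;
}

/* Explicit scheme: the caller names the request each command belongs to. */
static void toy_emit(struct toy_request *req)
{
	assert(req != NULL && !req->submitted);
	req->num_cmds++;
}

/* Explicit scheme: each request is submitted on its own. */
static void toy_submit(struct toy_request *req)
{
	assert(req != NULL);
	req->submitted = 1;
}
```

In this model, two independent pieces of work written through toy_emit_lazy()
end up in the same request, and neither counts as submitted until
toy_submit_lazy() is eventually called. With toy_request_alloc() followed by
toy_emit() and toy_submit(), each piece of work has its own request from
creation to submission, which is the property the series is aiming for.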