[Intel-gfx] [PULL] drm-intel-gt-next
Joonas Lahtinen
jlahtine at jlahtine-mobl.ger.corp.intel.com
Fri Sep 4 13:39:40 UTC 2020
Hi Dave & Daniel,
Here goes the GT pull request for v5.10. It's the same patches as
previously at "topic/drm-intel-gem-next", one dropped and a few
re-ordered while creating the "drm-intel-gt-next" branch. So the
patches have been part of drm-tip already for weeks.
More about the PR itself at the end, but now cutting to content:
As the log indicates, this pull req is all about the requested locking
refactoring. It ultimately ends up taking the WW locking into use across
the driver. I don't plan on sending further feature pull requests
for v5.10; let's focus on the -fixes pulls to stabilize this instead.
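For readers unfamiliar with the WW locking pattern mentioned above, here is a
minimal user-space sketch of the backoff-and-retry idea. It is not i915 or
kernel code. It models only the wait-die flavour of the rule (the kernel's
ww_mutex also supports wound-wait), and every name in it (AcquireCtx, WWLock,
ww_lock, ww_unlock_all, run_demo) is hypothetical and chosen for illustration.

```python
import itertools

# Hypothetical ticket source: a lower stamp means an older transaction.
_tickets = itertools.count(1)


class AcquireCtx:
    """One locking transaction. It keeps its stamp across retries."""

    def __init__(self, name):
        self.name = name
        self.stamp = next(_tickets)
        self.held = []


class WWLock:
    """A lock that remembers which transaction currently owns it."""

    def __init__(self, name):
        self.name = name
        self.owner = None


def ww_lock(lock, ctx):
    """Try to take `lock` for `ctx` under the wait-die rule.

    Returns "ok" when the lock is acquired, "wait" when an older context
    should sleep until the younger holder releases, and "backoff" when a
    younger context must drop everything it holds and retry.
    """
    if lock.owner is None:
        lock.owner = ctx
        ctx.held.append(lock)
        return "ok"
    if lock.owner is ctx:
        # Simplification: the kernel reports -EALREADY in this case.
        return "ok"
    return "wait" if ctx.stamp < lock.owner.stamp else "backoff"


def ww_unlock_all(ctx):
    """Release every lock held by `ctx` (the backoff step)."""
    for lock in ctx.held:
        lock.owner = None
    ctx.held.clear()


def run_demo():
    """Scripted interleaving of two transactions locking A and B in
    opposite order, which would deadlock with plain mutexes."""
    a, b = WWLock("A"), WWLock("B")
    old, young = AcquireCtx("old"), AcquireCtx("young")
    log = []

    log.append(("old", "A", ww_lock(a, old)))
    log.append(("young", "B", ww_lock(b, young)))

    # The younger transaction hits a lock held by the older one: it backs off.
    result = ww_lock(a, young)
    log.append(("young", "A", result))
    if result == "backoff":
        ww_unlock_all(young)

    # The older transaction can now make progress and finish.
    log.append(("old", "B", ww_lock(b, old)))
    ww_unlock_all(old)

    # The younger transaction retries with its original stamp.
    log.append(("young", "A", ww_lock(a, young)))
    log.append(("young", "B", ww_lock(b, young)))
    ww_unlock_all(young)
    return log
```

Calling run_demo() returns the sequence of lock attempts and their outcomes.
The point of the sketch is that a transaction that loses a conflict releases
all of its locks and retries with its original stamp, so it never loses its
place in the age order and cannot be starved indefinitely.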
Apart from that, there's a fix for Tigerlake related to syncobjs, a couple
of fixes to keep CI happy, and a code refactoring to allow for the
locking paradigm change.
GVT-g scheduler codebase is still missing the reworks. They will be done
as soon as the i915 ones get merged. But we have validated that the GVT-g
functionality still works, as it's a rather independent codebase.
NOTE: This includes reverts of 5 patches to introduce the WW locking
refactoring faster, so those may come with some perf regressions. The major
locking refactoring has probably also introduced some very subtle implicit
uAPI changes, so we'll have to deal with those as they are noticed.
I will include the remaining 3 commits from drm-intel-gt-next in next
week's PR (just -fixes stuff), when we have the full fixup for one of
them in addition to the minimal backportable fix. But feel free to
take a look at the improved commit messages already, which you requested
in the previous -fixes PR.
CI results can be found at:
https://intel-gfx-ci.01.org/tree/drm-intel-gt-next/index.html
About this PR itself: I produced this with local DIM changes to
be able to tag branches at a given commit and send the PR for a given tag.
It took a couple of tries, so you can disregard the extra tags until
drm-intel-gt-next-2020-09-04-2. I'll post the DIM changes for review
as RFC.
The plan is for the "drm-intel-gt-next" branch to be a persistent branch,
where the GT hardware and GEM uAPI related patches would go. I opted to
drop the -queued concept, so there is a single tree for tagging PRs and merging.
The rebasing onto drm-next while pushing to drm-intel-gt-next also causes
DIM to complain about the committer S-o-b's. I only added S-o-b to patches
that were actually modified and noticed the DIM complaint only after I had
already fixed up all the Fixes: references.
I can re-spin with added S-o-bs everywhere if that's necessary.
Regards, Joonas
***
drm-intel-gt-next-2020-09-04-3:
UAPI Changes:
(- Potential implicit changes from WW locking refactoring)
Cross-subsystem Changes:
(- WW locking changes should align the i915 locking more with others)
Driver Changes:
- MAJOR: Apply WW locking across the driver (Maarten)
- Reverts for 5 commits to make applying WW locking faster (Maarten)
- Disable preparser around invalidations on Tigerlake for non-RCS engines (Chris)
- Add missing dma_fence_put() for error case of syncobj timeline (Chris)
- Parse command buffer earlier in eb_relocate(slow) to facilitate backoff (Maarten)
- Pin engine before pinning all objects (Maarten)
- Rework intel_context pinning to do everything outside of pin_mutex (Maarten)
- Avoid tracking GEM context until registered (Cc: stable, Chris)
- Provide a fastpath for waiting on vma bindings (Chris)
- Fixes to preempt-to-busy mechanism (Chris)
- Distinguish the virtual breadcrumbs from the irq breadcrumbs (Chris)
- Switch to object allocations for page directories (Chris)
- Hold context/request reference while breadcrumbs are active (Chris)
- Make sure execbuffer always passes ww state to i915_vma_pin (Maarten)
- Code refactoring to facilitate use of WW locking (Maarten)
- Locking refactoring to use more granular locking (Maarten, Chris)
- Support for multiple pinned timelines per engine (Chris)
- Move complication of I915_GEM_THROTTLE to the ioctl from general code (Chris)
- Make active tracking/vma page-directory stash work preallocated (Chris)
- Avoid flushing submission tasklet too often (Chris)
- Reduce context termination list iteration guard to RCU (Chris)
- Reductions to locking contention (Chris)
- Fixes for issues found by CI (Chris)
The following changes since commit 3393649977f9a8847c659e282ea290d4b703295c:
Merge tag 'drm-intel-next-2020-08-24-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-next (2020-08-28 14:09:31 +1000)
are available in the Git repository at:
git://anongit.freedesktop.org/drm/drm-intel tags/drm-intel-gt-next-2020-09-04-3
for you to fetch changes up to 509c5c3f0a072962260299aeab106ce27df7bb07:
drm/i915: Add ww locking to pin_to_display_plane, v2. (2020-09-03 15:35:28 +0300)
----------------------------------------------------------------
Chris Wilson (30):
drm/i915: Reduce i915_request.lock contention for i915_request_wait
drm/i915/selftests: Mock the status_page.vma for the kernel_context
drm/i915: Soften the tasklet flush frequency before waits
drm/i915/gem: Remove disordered per-file request list for throttling
drm/i915/gt: Disable preparser around xcs invalidations on tgl
drm/i915/gt: Delay taking the spinlock for grabbing from the buffer pool
drm/i915/selftests: Flush the active barriers before asserting
drm/i915/gt: Fix termination condition for freeing all buffer objects
drm/i915/gem: Delay tracking the GEM context until it is registered
drm/i915/gt: Support multiple pinned timelines
drm/i915/gt: Pull release of node->age under the spinlock
drm/i915/selftests: Drop stale timeline constructor assert
drm/i915: Skip taking acquire mutex for no ref->active callback
drm/i915: Export a preallocate variant of i915_active_acquire()
drm/i915: Keep the most recently used active-fence upon discard
drm/i915: Make the stale cached active node available for any timeline
drm/i915: Reduce locking around i915_active_acquire_preallocate_barrier()
drm/i915: Provide a fastpath for waiting on vma bindings
drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs
drm/i915/gt: Replace intel_engine_transfer_stale_breadcrumbs
drm/i915/gt: Only transfer the virtual context to the new engine if active
drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbs
drm/i915: Preallocate stashes for vma page-directories
drm/i915/gt: Switch to object allocations for page directories
drm/i915/gt: Shrink i915_page_directory's slab bucket
drm/i915/gt: Move intel_breadcrumbs_arm_irq earlier
drm/i915/gt: Hold context/request reference while breadcrumbs are active
drm/i915/selftests: Prevent selecting 0 for our random width/align
drm/i915/gem: Reduce context termination list iteration guard to RCU
drm/i915/gem: Free the fence after a fence-chain lookup failure
Maarten Lankhorst (23):
Revert "drm/i915/gem: Async GPU relocations only"
drm/i915: Revert relocation chaining commits.
Revert "drm/i915/gem: Drop relocation slowpath".
Revert "drm/i915/gem: Split eb_vma into its own allocation"
drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.
drm/i915: Remove locking from i915_gem_object_prepare_read/write
drm/i915: Parse command buffer earlier in eb_relocate(slow)
drm/i915: Use per object locking in execbuf, v12.
drm/i915: Use ww locking in intel_renderstate.
drm/i915: Add ww context handling to context_barrier_task
drm/i915: Nuke arguments to eb_pin_engine
drm/i915: Pin engine before pinning all objects, v5.
drm/i915: Rework intel_context pinning to do everything outside of pin_mutex
drm/i915: Make sure execbuffer always passes ww state to i915_vma_pin.
drm/i915: Convert i915_gem_object/client_blt.c to use ww locking as well, v2.
drm/i915: Kill last user of intel_context_create_request outside of selftests
drm/i915: Convert i915_perf to ww locking as well
drm/i915: Dirty hack to fix selftests locking inversion
drm/i915/selftests: Fix locking inversion in lrc selftest.
drm/i915: Use ww pinning for intel_context_create_request()
drm/i915: Move i915_vma_lock in the selftests to avoid lock inversion, v3.
drm/i915: Add ww locking to vm_fault_gtt
drm/i915: Add ww locking to pin_to_display_plane, v2.
drivers/gpu/drm/i915/display/intel_display.c | 6 +-
drivers/gpu/drm/i915/gem/i915_gem_client_blt.c | 89 +-
drivers/gpu/drm/i915/gem/i915_gem_context.c | 105 +-
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 4 +-
drivers/gpu/drm/i915/gem/i915_gem_domain.c | 80 +-
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 1601 +++++++++++++-------
drivers/gpu/drm/i915/gem/i915_gem_mman.c | 51 +-
drivers/gpu/drm/i915/gem/i915_gem_object.h | 40 +-
drivers/gpu/drm/i915/gem/i915_gem_object_blt.c | 152 +-
drivers/gpu/drm/i915/gem/i915_gem_object_blt.h | 3 +
drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 10 +
drivers/gpu/drm/i915/gem/i915_gem_pm.c | 2 +-
drivers/gpu/drm/i915/gem/i915_gem_throttle.c | 67 +-
drivers/gpu/drm/i915/gem/i915_gem_tiling.c | 2 +-
drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 9 +-
.../drm/i915/gem/selftests/i915_gem_client_blt.c | 2 +-
.../drm/i915/gem/selftests/i915_gem_coherency.c | 50 +-
.../gpu/drm/i915/gem/selftests/i915_gem_context.c | 144 +-
.../drm/i915/gem/selftests/i915_gem_execbuffer.c | 60 +-
drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c | 45 +-
drivers/gpu/drm/i915/gem/selftests/i915_gem_phys.c | 2 +-
drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 106 +-
drivers/gpu/drm/i915/gt/gen6_ppgtt.h | 5 +-
drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 181 +--
drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 305 ++--
drivers/gpu/drm/i915/gt/intel_breadcrumbs.h | 36 +
drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h | 47 +
drivers/gpu/drm/i915/gt/intel_context.c | 309 ++--
drivers/gpu/drm/i915/gt/intel_context.h | 13 +
drivers/gpu/drm/i915/gt/intel_context_types.h | 5 +-
drivers/gpu/drm/i915/gt/intel_engine.h | 20 -
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 34 +-
drivers/gpu/drm/i915/gt/intel_engine_pm.c | 3 +-
drivers/gpu/drm/i915/gt/intel_engine_types.h | 31 +-
drivers/gpu/drm/i915/gt/intel_ggtt.c | 97 +-
drivers/gpu/drm/i915/gt/intel_gt.c | 23 +-
drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 103 +-
.../gpu/drm/i915/gt/intel_gt_buffer_pool_types.h | 6 +-
drivers/gpu/drm/i915/gt/intel_gt_irq.c | 1 +
drivers/gpu/drm/i915/gt/intel_gtt.c | 300 +---
drivers/gpu/drm/i915/gt/intel_gtt.h | 142 +-
drivers/gpu/drm/i915/gt/intel_lrc.c | 167 +-
drivers/gpu/drm/i915/gt/intel_ppgtt.c | 150 +-
drivers/gpu/drm/i915/gt/intel_renderstate.c | 73 +-
drivers/gpu/drm/i915/gt/intel_renderstate.h | 9 +-
drivers/gpu/drm/i915/gt/intel_reset.c | 1 +
drivers/gpu/drm/i915/gt/intel_ring.c | 10 +-
drivers/gpu/drm/i915/gt/intel_ring.h | 3 +-
drivers/gpu/drm/i915/gt/intel_ring_submission.c | 42 +-
drivers/gpu/drm/i915/gt/intel_rps.c | 1 +
drivers/gpu/drm/i915/gt/intel_timeline.c | 28 +-
drivers/gpu/drm/i915/gt/intel_timeline.h | 24 +-
drivers/gpu/drm/i915/gt/intel_workarounds.c | 43 +-
drivers/gpu/drm/i915/gt/mock_engine.c | 30 +-
drivers/gpu/drm/i915/gt/selftest_context.c | 2 +
drivers/gpu/drm/i915/gt/selftest_lrc.c | 22 +-
drivers/gpu/drm/i915/gt/selftest_rps.c | 30 +-
drivers/gpu/drm/i915/gt/selftest_timeline.c | 10 +-
drivers/gpu/drm/i915/gt/selftest_workarounds.c | 2 +-
drivers/gpu/drm/i915/gt/uc/intel_guc.c | 2 +-
drivers/gpu/drm/i915/gvt/cmd_parser.c | 3 +-
drivers/gpu/drm/i915/gvt/scheduler.c | 17 +-
drivers/gpu/drm/i915/i915_active.c | 237 ++-
drivers/gpu/drm/i915/i915_active.h | 31 +-
drivers/gpu/drm/i915/i915_drv.c | 2 +-
drivers/gpu/drm/i915/i915_drv.h | 24 +-
drivers/gpu/drm/i915/i915_gem.c | 107 +-
drivers/gpu/drm/i915/i915_gem.h | 12 +
drivers/gpu/drm/i915/i915_irq.c | 1 +
drivers/gpu/drm/i915/i915_perf.c | 57 +-
drivers/gpu/drm/i915/i915_request.c | 132 +-
drivers/gpu/drm/i915/i915_request.h | 8 -
drivers/gpu/drm/i915/i915_vma.c | 65 +-
drivers/gpu/drm/i915/i915_vma.h | 13 +-
drivers/gpu/drm/i915/selftests/i915_gem.c | 41 +
drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 75 +-
drivers/gpu/drm/i915/selftests/i915_perf.c | 4 +-
drivers/gpu/drm/i915/selftests/i915_request.c | 18 +-
drivers/gpu/drm/i915/selftests/i915_vma.c | 2 +-
.../gpu/drm/i915/selftests/intel_memory_region.c | 8 +-
drivers/gpu/drm/i915/selftests/mock_gtt.c | 26 +-
81 files changed, 3654 insertions(+), 2169 deletions(-)
create mode 100644 drivers/gpu/drm/i915/gt/intel_breadcrumbs.h
create mode 100644 drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h