[Intel-gfx] [PATCH 00/66] [v1] Full PPGTT minus soft pin
Ben Widawsky
ben at bwidawsk.net
Tue Jul 2 00:36:13 CEST 2013
On Mon, Jul 01, 2013 at 11:39:30PM +0200, Daniel Vetter wrote:
> Hi Ben
>
> So first things first: I rather like what the code looks like overall at
> the end. I've done a light read-through (by far not a full review) and
> besides a few bikesheds (all mentioned by mail already) the big thing is
> the 1:1 context:ppgtt address space relationship.
>
> We've discussed this at length in private irc and agreed that we need to
> changes this two a n:1 relationship, so I'll just reiterate the reasons
> for that on the list:
>
> - Current userspace expects that different contexts created on the same fd
> all use the same address space (since there's really only one). So if we
> want to no add a new ABI (and for testing I really think we want to
> enable ppgtt on current unchanged userspace) we must keep that promise.
> Hence we need to be able to point the different contexts created on an
> fd all at the same (per-fd) address space.
>
> If we want we could later on extend this and allow a context to have a
> private address space all of its own.
>
> - HW contexts are pretty much the execution/scheduling primitive the hw
> provides us with. On current platforms that's a bit a stretch, but it's
> much clearer on future platforms.
>
> The equivalent concept on the cpu would be threads, and history
> unanimously established that having multiple threads in the same process
> is a useful concept. So I think allowing the same N:1 context:ppgtt
> relation on gpus is sound. Of course that does not preclude other ways
> to share individual buffers, which we already support with
> flink/dma_buf.
>
> With that big issue resolved there's only the bikesheds left. I'm not
> really worried about those, and in any case we already have some good
> discussions going on about them.
I've discussed this with the Mesa team, and I believe this is what they
want. I'll highlight the important bit for TL;DR people:
> Hence we need to be able to point the different contexts created on an
> fd all at the same (per-fd) address space.
If one wants a new address space, they will have to open a new fd.
>
> So merge plan:
>
> Since the context<->ppgtt relation needs to be revamped I'd like to punt
> on patches in that area at first and concentrate on merging the
> address_space/vma conversion and all the prep work leading to that first.
The main idea [ack, or nak] is the ppgtt becomes *ppgtt, and the context
will refcount it on context creation/destrution - that way all the
existing tricky context refcounting should still work, and the last
context to down ref a ppgtt will destroy it in it's handler.
>
> We're already discussing some of the details, so I won't repeat that here
> again. I think for the overall approach I wouldn't bother with rebasing -
> shuffling around such a massive&invasive patch series is a major pain and
> rather error-prone.
>
> Instead I'd just freeze the current branch as a known-working reference
> point and cherry-pick individual subseries out of it. Also some patches
> will need to be redone, but thanks to the benefit of hindsight I hope that
> v2 will have much less churn. Again I've tossed around a few ideas in
> replies to individual patches.
>
> For prep work I see a few pieces:
>
> - drm_mm related prep work, especially drm_mm.c core changes. I think
> simply reordering all the relevant patches and resubmitting them (with
> cc: dri-devel) is all we need to get those in (minus the oddball last
> minute bikeshed).
>
> - Small static inline functions to ease the pain of the conversion. I
> think those need to be redone with as much foreshadowing as possible (so
> that later patches don't suffer from overblown diff sizes). Maybe we
> should also do the lookup-helpers upfront.
>
> - Other prep work like killing obj->gtt_offset and stuff I've missed (but
> which doesn't touch the context<->ppgtt relation).
I think this might be mergeable now. Did you try and have conflicts?
>
> - I think we can also merge as much of the hw interfacing code as possible
> up-front (or in paralle), e.g. converting the pde loading to LRI.
I was thinking this as well. I wasn't sure how you'd feel about the
idea. I'd really like that. This should also have no conflicts at
present (that I'm aware of).
>
> Then we can slurp all the address_space/vma conversion patches. Imo the
> important part is to fledge out the commit message with the hindsight
> insights and explain a bit why certain stuff gets moved and why other
> stuff should stay where it currently is. We also need to review whether
> bisecting isn't broken anywhere, but since we don't yet add a real ppgtt I
> don't expect any issues.
>
> Once that's all in we can revisit the context vs. ppgtt question and
> figure out how to make it work. I expect that we need to refcount ppgtt
> address spaces. But if we keep the per-fd contexts around (imo a sane idea
> anyway) we should get by when just the contexts hold references onto the
> ppgtt object. Since the context is guaranteed to stay around until the
> last object is inactive again that should ensure that the ppgtt address
> space stays around for long enough, too. And it would avoid the ugliness
> of adding more tricky active refcounting on top of the context/obj
> refcounting we already have.
>
> Comments?
The high level plan sounds totally fine to me, I still have some issues
with how to break up the huge changes in the VMA/VM conversion since
many interfaces need to change, and that gets messy. In the end I like
what I did where I split out the 6 or 7 major changes for review. Would
you be okay with that again? Maybe if all goes well, it won't be a
problem.
>
> Cheers, Daniel
>
> On Thu, Jun 27, 2013 at 04:30:01PM -0700, Ben Widawsky wrote:
> > First, I don't think this whole series is ready for merge yet. It is
> > however ready for a review, and I think a lot of the prep patches in the
> > series could be merged to make my rebasing life a bit easier. I cannot
> > continue ignoring pretty much all emails/bugs as I have for the last
> > month to wrap up this series. The current state is on my IVB, things are
> > pretty stable. I've seen one unexplained hang, but I'm hopeful review
> > might help me uncover/explain.
> >
> > This patch series introduces the next step in enabling full PPGTT, which
> > is per fd address space/context, and it also contains the previously
> > unmerged patches (some of which have been reworked, modified, or
> > rebased). In regards to the continued VMA changes, I think in these, the
> > delta with regard to the last posting is the bound list was per VM. It
> > is now global. I've also moved the active list to per VM.
> >
> > Brand new in the series we take the previous series' context per fd
> > (with address space) one step further and actually switch the address
> > spaces when we do context switches. In order to make this happen, the
> > series continues to chip away at removing the notion of an object only
> > ever being bound into one address space via the struct
> > i915_address_space and struct i915_vma data structures which are really
> > abstractions for a page directory, and current mapped ptes respectively.
> > The error state is improved since the last series (though still some
> > work there is probably needed). It also serves to remove the notion of
> > the aliasing PPGTT since in theory everything bound into the GGTT
> > shouldn't benefit from an aliasing PPGTT (fact check).
> >
> > With every context having it's own address space, and every open DRM fd
> > having it's own context, it's trivial on execbuf to lookup a context and
> > do the pinning in the proper address space. More importantly, it's
> > implicit that a context exists, which made this impossible to do
> > earlier.
> >
> > *A note on patch ordering:* In order to work this series incrementally, the
> > final patch ordering is admittedly a little bit strange. I'm more than willing
> > to rework these as requested, but I'd really prefer not to do really heavy
> > reordering unless there is a major benefit, or of course to fix bugs.
> >
> > # What is not in this patch series in the order I think we should handle it
> > (and I acknowledge some of this stuff is non-trivial):
> >
> > ## Review + QA coverage
> >
> > ## Porting to HSW
> >
> > It shouldn't be too much extra work, if any, to add support. I haven't looked
> > into it yet.
> >
> > ## Better vm/ppgtt info in error state collection
> >
> > In particular, I want to dump all the PTEs at hang, and at the very least the
> > guilt PTEs. This isn't difficult, and can be done with copypasta from the
> > existing dumper I have.
> >
> > ## User space and the implications
> >
> > Now that contexts are valid on all rings, userspace should begin emitting the
> > context for all rings if it expects both rings to be able to access both
> > objects in the same offset. The mesa change looks to be just one line which
> > simplies emits the context for batch->is_blit, I doubt libva is using contexts,
> > and SNA seems not to. The plan to support mesa will be to do the detection in
> > libdrm, and go ahead with the simple mesa one liner. I've been using the
> > oneliner if mesa for a while now, but we will need to support old user space in
> > the kernel. So there might be a bit of work even on the kernel side here. We
> > also need some IGT tools test updates. I have messy versions of these locally
> > already.
> >
> > ## Performance data
> >
> > I think it doesn't preclude preliminary review of the patches since the main
> > goal of PPGTT is really abourt security, correctness, and enabling other
> > things. I will update with some numbers after I work on it a bit more.
> >
> >
> > ## Testing on SNB
> >
> > If our current code is correct, then I think these patches might work on SNB
> > as is, but it's untested. There is currently no way to disconnect contexts +
> > PPGTT from the whole thing; so if this doesn't work - we'll need rework some of
> > the code. I think it should just entail bringing back aliasing ppgtt, and not
> > doing the address space switch when switching contexts (aliasing ppgtt will
> > have a null switch_mm()).
> >
> > ## Soft pin interface
> >
> > I'd like to defer the discussion until these patches are merged.
> >
> > ## On demand page table allocation
> >
> > This is a potentially very useful optimization for at least the following
> > reasons:
> > * any app using contexts will have an extra set of page tables it isn't using;
> > wasted memory
> > * Reduce DCLV to reduce pd fetch latency
> > * Allows Better use of a switch to default context for low memory situations
> > (evicting unused page tables for example)
> >
> > ## Per VMA cache levels/control
> >
> > There are situations in the code where we have to flush the GPU pipeline in
> > order to change cache levels. This should no longer be the case for unaffected
> > VMs (I think). The same may be true with domain tracking.
> >
> > ## dmabuf/prime integration
> >
> > I haven't looked into what's missing to support it. If I'm lucky, it just works.
> >
> >
> >
> > With that, if you haven't already moved on, chanting tl;dr - all comments
> > welcome.
> >
> > ---
> >
> > Ben Widawsky (65):
> > drm/i915: Remove extra error state NULL
> > drm/i915: Extract error buffer capture
> > drm/i915: make PDE|PTE platform specific
> > drm/i915: Don't clear gtt with 0 entries
> > drm/i915: Conditionally use guard page based on PPGTT
> > drm/i915: Use drm_mm for PPGTT PDEs
> > drm/i915: cleanup context fini
> > drm/i915: Do a fuller init after reset
> > drm/i915: Split context enabling from init
> > drm/i915: destroy i915_gem_init_global_gtt
> > drm/i915: Embed PPGTT into the context
> > drm/i915: Unify PPGTT codepaths on gen6+
> > drm/i915: Move ppgtt initialization down
> > drm/i915: Tie context to PPGTT
> > drm/i915: Really share scratch page
> > drm/i915: Combine scratch members into a struct
> > drm/i915: Drop dev from pte_encode
> > drm/i915: Use gtt shortform where possible
> > drm/i915: Move fbc members out of line
> > drm/i915: Move gtt and ppgtt under address space umbrella
> > drm/i915: Move gtt_mtrr to i915_gtt
> > drm/i915: Move stolen stuff to i915_gtt
> > drm/i915: Move aliasing_ppgtt
> > drm/i915: Put the mm in the parent address space
> > drm/i915: Move active/inactive lists to new mm
> > drm/i915: Create a global list of vms
> > drm/i915: Remove object's gtt_offset
> > drm: pre allocate node for create_block
> > drm/i915: Getter/setter for object attributes
> > drm/i915: Create VMAs (part 1)
> > drm/i915: Create VMAs (part 2) - kill gtt space
> > drm/i915: Create VMAs (part 3) - plumbing
> > drm/i915: Create VMAs (part 3.5) - map and fenceable tracking
> > drm/i915: Create VMAs (part 4) - Error capture
> > drm/i915: Create VMAs (part 5) - move mm_list
> > drm/i915: Create VMAs (part 6) - finish error plumbing
> > drm/i915: create an object_is_active()
> > drm/i915: Move active to vma
> > drm/i915: Track all VMAs per VM
> > drm/i915: Defer request freeing
> > drm/i915: Clean up VMAs before freeing
> > drm/i915: Replace has_bsd/blt with a mask
> > drm/i915: Catch missed context unref earlier
> > drm/i915: Add a context open function
> > drm/i915: Permit contexts on all rings
> > drm/i915: Fix context fini refcounts
> > drm/i915: Better reset handling for contexts
> > drm/i915: Create a per file_priv default context
> > drm/i915: Remove ring specificity from contexts
> > drm/i915: Track which ring a context ran on
> > drm/i915: dump error state based on capture
> > drm/i915: PPGTT should take a ppgtt argument
> > drm/i915: USE LRI for switching PP_DIR_BASE
> > drm/i915: Extract mm switching to function
> > drm/i915: Write PDEs at init instead of enable
> > drm/i915: Disallow pin with full ppgtt
> > drm/i915: Get context early in execbuf
> > drm/i915: Pass ctx directly to switch/hangstat
> > drm/i915: Actually add the new address spaces
> > drm/i915: Use multiple VMs
> > drm/i915: Kill now unused ppgtt_{un,}bind
> > drm/i915: Add PPGTT dumper
> > drm/i915: Dump all ppgtt
> > drm/i915: Add debugfs for vma info per vm
> > drm/i915: Getparam full ppgtt
> >
> > Chris Wilson (1):
> > drm: Optionally create mm blocks from top-to-bottom
> >
> > drivers/gpu/drm/drm_mm.c | 134 +++---
> > drivers/gpu/drm/i915/i915_debugfs.c | 215 ++++++++--
> > drivers/gpu/drm/i915/i915_dma.c | 25 +-
> > drivers/gpu/drm/i915/i915_drv.c | 57 ++-
> > drivers/gpu/drm/i915/i915_drv.h | 353 ++++++++++------
> > drivers/gpu/drm/i915/i915_gem.c | 639 +++++++++++++++++++++--------
> > drivers/gpu/drm/i915/i915_gem_context.c | 279 +++++++++----
> > drivers/gpu/drm/i915/i915_gem_debug.c | 11 +-
> > drivers/gpu/drm/i915/i915_gem_evict.c | 64 +--
> > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 138 ++++---
> > drivers/gpu/drm/i915/i915_gem_gtt.c | 541 ++++++++++++++----------
> > drivers/gpu/drm/i915/i915_gem_stolen.c | 87 ++--
> > drivers/gpu/drm/i915/i915_gem_tiling.c | 21 +-
> > drivers/gpu/drm/i915/i915_irq.c | 197 ++++++---
> > drivers/gpu/drm/i915/i915_trace.h | 38 +-
> > drivers/gpu/drm/i915/intel_display.c | 40 +-
> > drivers/gpu/drm/i915/intel_drv.h | 7 -
> > drivers/gpu/drm/i915/intel_fb.c | 8 +-
> > drivers/gpu/drm/i915/intel_overlay.c | 26 +-
> > drivers/gpu/drm/i915/intel_pm.c | 65 +--
> > drivers/gpu/drm/i915/intel_ringbuffer.c | 32 +-
> > drivers/gpu/drm/i915/intel_sprite.c | 8 +-
> > include/drm/drm_mm.h | 147 ++++---
> > include/uapi/drm/i915_drm.h | 1 +
> > 24 files changed, 2044 insertions(+), 1089 deletions(-)
> >
> > --
> > 1.8.3.1
> >
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
Ben Widawsky, Intel Open Source Technology Center
More information about the Intel-gfx
mailing list