[Intel-gfx] [PATCH] drm/i915: Fix i915_gem_evict_for_vma (soft-pinning)
Chris Wilson
chris at chris-wilson.co.uk
Wed Nov 16 14:55:59 UTC 2016
On Wed, Nov 16, 2016 at 03:41:53PM +0100, Daniel Vetter wrote:
> On Wed, Nov 16, 2016 at 08:58:30AM +0000, Chris Wilson wrote:
> > Soft-pinning depends upon being able to check for availabilty of an
> > interval and evict overlapping object from a drm_mm range manager very
> > quickly. Currently it uses a linear list, and so performance is dire and
> > not suitable as a general replacement. Worse, the current code will oops
> > if it tries to evict an active buffer.
> >
> > It also helps if the routine reports the correct error codes as expected
> > by its callers and emits a tracepoint upon use.
> >
> > For posterity since the wrong patch was pushed (i.e. that missed these
> > key points and had known bugs), this is the changelog that should have
> > been on commit 506a8e87d8d2 ("drm/i915: Add soft-pinning API for
> > execbuffer"):
> >
> > Userspace can pass in an offset that it presumes the object is located
> > at. The kernel will then do its utmost to fit the object into that
> > location. The assumption is that userspace is handling its own object
> > locations (for example along with full-ppgtt) and that the kernel will
> > rarely have to make space for the user's requests.
> >
> > This extends the DRM_IOCTL_I915_GEM_EXECBUFFER2 to do the following:
> > * if the user supplies a virtual address via the execobject->offset
> > *and* sets the EXEC_OBJECT_PINNED flag in execobject->flags, then
> > that object is placed at that offset in the address space selected
> > by the context specifier in execbuffer.
> > * the location must be aligned to the GTT page size, 4096 bytes
> > * as the object is placed exactly as specified, it may be used by this
> > execbuffer call without relocations pointing to it
> >
> > It may fail to do so if:
> > * EINVAL is returned if the object does not have a 4096 byte aligned
> > address
> > * the object conflicts with another pinned object (either pinned by
> > hardware in that address space, e.g. scanouts in the aliasing ppgtt)
> > or within the same batch.
> > EBUSY is returned if the location is pinned by hardware
> > EINVAL is returned if the location is already in use by the batch
> > * EINVAL is returned if the object conflicts with its own alignment (as meets
> > the hardware requirements) or if the placement of the object does not fit
> > within the address space
> >
> > All other execbuffer errors apply.
> >
> > Presence of this execbuf extension may be queried by passing
> > I915_PARAM_HAS_EXEC_SOFTPIN to DRM_IOCTL_I915_GETPARAM and checking for
> > a reported value of 1 (or greater).
> >
> > Fixes: 506a8e87d8d2 ("drm/i915: Add soft-pinning API for execbuffer")
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > ---
> > drivers/gpu/drm/drm_mm.c | 9 ++++
> > drivers/gpu/drm/i915/gvt/aperture_gm.c | 4 +-
> > drivers/gpu/drm/i915/i915_drv.h | 3 +-
> > drivers/gpu/drm/i915/i915_gem_evict.c | 87 +++++++++++++++++++++---------
> > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 3 +-
> > drivers/gpu/drm/i915/i915_gem_gtt.c | 10 ++--
> > drivers/gpu/drm/i915/i915_trace.h | 23 ++++++++
> > drivers/gpu/drm/i915/i915_vma.c | 8 +--
> > 8 files changed, 105 insertions(+), 42 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
> > index 632473beb40c..06f319e54dc1 100644
> > --- a/drivers/gpu/drm/drm_mm.c
> > +++ b/drivers/gpu/drm/drm_mm.c
> > @@ -339,6 +339,15 @@ int drm_mm_reserve_node(struct drm_mm *mm, struct drm_mm_node *node)
> > if (hole_start > node->start || hole_end < end)
> > return -ENOSPC;
> >
> > + if (mm->color_adjust) {
> > + u64 adj_start = hole_start, adj_end = hole_end;
> > +
> > + mm->color_adjust(hole, node->color, &adj_start, &adj_end);
> > + if (adj_start > node->start ||
> > + adj_end < node->start + node->size)
> > + return -ENOSPC;
> > + }
>
> Can't we move that up and combine with the earlier check like in other
> places, i.e.
Unlike the other places we don't use adj_start, adj_end.
> > --- a/drivers/gpu/drm/i915/gvt/aperture_gm.c
> > +++ b/drivers/gpu/drm/i915/gvt/aperture_gm.c
> > @@ -73,12 +73,12 @@ static int alloc_gm(struct intel_vgpu *vgpu, bool high_gm)
> > mutex_lock(&dev_priv->drm.struct_mutex);
> > search_again:
> > ret = drm_mm_insert_node_in_range_generic(&dev_priv->ggtt.base.mm,
> > - node, size, 4096, 0,
> > + node, size, 4096, -1,
> > start, end, search_flag,
> > alloc_flag);
> > if (ret) {
> > ret = i915_gem_evict_something(&dev_priv->ggtt.base,
> > - size, 4096, 0, start, end, 0);
> > + size, 4096, -1, start, end, 0);
>
> Bugfix for gvt coloring? Should we have a GVT_COLOR? Feels like separate
> patch once more.
No, copy paste of broken code. The pattern was broken by evict_for_vma.
> > -int
> > -i915_gem_evict_for_vma(struct i915_vma *target)
> > +int i915_gem_evict_for_vma(struct i915_vma *target, unsigned int flags)
>
> Kerneldoc for this one here is missing.
There should be exactly one caller. It is not an interface that should
be used.
> > {
> > - struct drm_mm_node *node, *next;
> > + struct list_head eviction_list;
> > + struct drm_mm_node *node;
> > + u64 start = target->node.start;
> > + u64 end = start + target->node.size - 1;
> > + struct i915_vma *vma, *next;
> > + bool check_snoop;
> > + int ret = 0;
> >
> > lockdep_assert_held(&target->vm->dev->struct_mutex);
> > + trace_i915_gem_evict_vma(target, flags);
> > +
> > + check_snoop = target->vm->mm.color_adjust;
> > + if (check_snoop) {
> > + if (start > target->vm->start)
> > + start -= 4096;
> > + if (end < target->vm->start + target->vm->total)
> > + end += 4096;
> > + }
> >
> > - list_for_each_entry_safe(node, next,
> > - &target->vm->mm.head_node.node_list,
> > - node_list) {
> > - struct i915_vma *vma;
> > - int ret;
> > + node = drm_mm_interval_first(&target->vm->mm, start, end);
>
> drm_mm_interval_first is lacking some pretty kernel-doc, should be fixed!
>
> > + if (!node)
> > + return 0;
> >
> > - if (node->start + node->size <= target->node.start)
> > - continue;
> > - if (node->start >= target->node.start + target->node.size)
> > + INIT_LIST_HEAD(&eviction_list);
> > + vma = container_of(node, typeof(*vma), node);
> > + list_for_each_entry_from(vma,
> > + &target->vm->mm.head_node.node_list,
> > + node.node_list) {
> > + if (vma->node.start > end)
> > break;
>
> We have this nice drm_mm_interval_next helper, imo should use it here
> instead of open-coding.
It's not the same. This is a linear walk which is fewer pointer dances.
> >
> > - vma = container_of(node, typeof(*vma), node);
> > + if (check_snoop) {
> > + if (vma->node.start + vma->node.size == target->node.start) {
> > + if (vma->node.color == target->node.color)
> > + continue;
> > + }
> > + if (vma->node.start == target->node.start + target->node.size) {
> > + if (vma->node.color == target->node.color)
> > + continue;
> > + }
> > + }
>
> Not feeling too happy with open-coding our color_adjust here.
Good, because it doesn't.
> > - if (i915_vma_is_pinned(vma)) {
> > - if (!vma->exec_entry || i915_vma_pin_count(vma) > 1)
> > - /* Object is pinned for some other use */
> > - return -EBUSY;
> > + if (vma->node.color == -1) {
>
> The magic -1!
And now you see why it is absolutely required. And not any more magic
than any other color value.
> > diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> > index 738ff3a5cd6e..827eaeb75524 100644
> > --- a/drivers/gpu/drm/i915/i915_vma.c
> > +++ b/drivers/gpu/drm/i915/i915_vma.c
> > @@ -325,12 +325,6 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma,
> > if (vma->vm->mm.color_adjust == NULL)
> > return true;
> >
> > - if (!drm_mm_node_allocated(gtt_space))
> > - return true;
> > -
> > - if (list_empty(>t_space->node_list))
> > - return true;
>
> Not clear to me why we need this: set_cache_level should never see
> pre-filled vmas, and vma_insert only has this check in the success case
> where the same should hold. Why remove these 2 cases? This seems to change
> set_cache_level behaviour ...
Because they're irrelevant. We only get called when bound and the second
is not our bug.
There's no change in caller behaviour.
--
Chris Wilson, Intel Open Source Technology Centre
More information about the Intel-gfx
mailing list