[Intel-gfx] [PATCH 2/3] drm/i915: Introduce guard pages to i915_vma

Fri Feb 12 13:55:28 UTC 2021

Quoting Matthew Auld (2021-02-12 13:43:46)
> On Fri, 12 Feb 2021 at 10:22, Chris Wilson <chris at chris-wilson.co.uk> wrote:
> >
> > Introduce the concept of padding the i915_vma with guard pages before
> > and aft. The major consequence is that all ordinary uses of i915_vma
> > must use i915_vma_offset/i915_vma_size and not i915_vma.node.start/size
> > directly, as the drm_mm_node will include the guard pages that surround
> > our object.
> >
> > So in this patch, we look for all uses of i915_vma->node.start that
> > instead need to include the guard offset and switch them to
> > i915_vma_offset(), and in a few cases to i915_ggtt_offset(). Notable
> > exceptions are the selftests, which expect exact behaviour.
> >
> > The biggest connundrum is how exactly to mix request a fixed address
> > with guard pages, particular through the existing uABI. The user does
> > not know about guard pages, so such must be transparent to the user, and
> > so the execobj.offset must be that of the object itself excluding the
> > guard. So a PIN_OFFSET_FIXED must then be exclusive of the guard pages.
> >
> > In the next patch, we start using guard pages for scanout objects. While
> > these are limited to GGTT vma, on a few platforms these vma (or at least
> > an alias of the vma) is shared with userspace, so we may leak the
> > existence of such guards if we are not careful to ensure that the
> > execobj.offset is transparent and excludes the guards. (On such platforms,
> > without full-ppgtt, userspace has to use relocations so the presence of
> > more untouchable regions within its GTT such be of no further issue.)
> >
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Matthew Auld <matthew.auld at intel.com>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_ggtt.c  | 12 ++++++++++--
> >  drivers/gpu/drm/i915/i915_vma.c       | 10 +++++++---
> >  drivers/gpu/drm/i915/i915_vma.h       |  8 ++++----
> >  drivers/gpu/drm/i915/i915_vma_types.h |  3 ++-
> >  4 files changed, 23 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > index c5803c434d33..6b326138e765 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> > @@ -238,8 +238,12 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
> >
> >         gte = (gen8_pte_t __iomem *)ggtt->gsm;
> >         gte += vma->node.start / I915_GTT_PAGE_SIZE;
> > -       end = gte + vma->node.size / I915_GTT_PAGE_SIZE;
> >
> > +       end = gte + vma->guard / I915_GTT_PAGE_SIZE;
> > +       while (gte < end)
> > +               gen8_set_pte(gte++, vm->scratch[0]->encode);
> > +
> > +       end += (vma->node.size - vma->guard) / I915_GTT_PAGE_SIZE;
> >         for_each_sgt_daddr(addr, iter, vma->pages)
> >                 gen8_set_pte(gte++, pte_encode | addr);
> >         GEM_BUG_ON(gte > end);
> > @@ -289,8 +293,12 @@ static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
> >
> >         gte = (gen6_pte_t __iomem *)ggtt->gsm;
> >         gte += vma->node.start / I915_GTT_PAGE_SIZE;
> > -       end = gte + vma->node.size / I915_GTT_PAGE_SIZE;
> >
> > +       end = gte + vma->guard / I915_GTT_PAGE_SIZE;
> > +       while (gte < end)
> > +               gen8_set_pte(gte++, vm->scratch[0]->encode);
> > +
> > +       end += (vma->node.size - vma->guard) / I915_GTT_PAGE_SIZE;
> >         for_each_sgt_daddr(addr, iter, vma->pages)
> >                 iowrite32(vm->pte_encode(addr, level, flags), gte++);
> >         GEM_BUG_ON(gte > end);
> > diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> > index 17fe455bd770..155f510b4cc6 100644
> > --- a/drivers/gpu/drm/i915/i915_vma.c
> > +++ b/drivers/gpu/drm/i915/i915_vma.c
> > @@ -623,7 +623,7 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long color)
> >  static int
> >  i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
> >  {
> > -       unsigned long color;
> > +       unsigned long color, guard;
> >         u64 start, end;
> >         int ret;
> >
> > @@ -631,13 +631,16 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
> >         GEM_BUG_ON(drm_mm_node_allocated(&vma->node));
> >
> >         size = max(size, vma->size);
> > -       alignment = max(alignment, vma->display_alignment);
> > +       alignment = max_t(typeof(alignment), alignment, vma->display_alignment);
> >         if (flags & PIN_MAPPABLE) {
> >                 size = max_t(typeof(size), size, vma->fence_size);
> >                 alignment = max_t(typeof(alignment),
> >                                   alignment, vma->fence_alignment);
> >         }
> >
> > +       guard = 0;
> > +       size += 2 * guard;
> > +
> >         GEM_BUG_ON(!IS_ALIGNED(size, I915_GTT_PAGE_SIZE));
> >         GEM_BUG_ON(!IS_ALIGNED(alignment, I915_GTT_MIN_ALIGNMENT));
> >         GEM_BUG_ON(!is_power_of_2(alignment));
> > @@ -674,7 +677,7 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags)
> >                         return -EINVAL;
> 
> Should we bother adjusting the overflows checking to account for the
> padding? Also maybe check that offset >= guard?

offset >= guard only for pinned.

And yes, it is sensible that we check we don't overflow with adding a guard.

> 
>         if (flags & PIN_OFFSET_FIXED) {
>                 u64 offset = flags & PIN_OFFSET_MASK;
> +
> +               if (offset < guard)
> +                       return -ENOSPC;
> +
>                 if (!IS_ALIGNED(offset, alignment) ||
> -                   range_overflows(offset, size, end))
> +                   range_overflows(offset, size - 2 * guard, end))
>                         return -EINVAL;
> 
> If the padding spills over the end, maybe it's ok, so long as the
> "real" range doesn't spill over, like if it needs to be placed within
> the mappable range. Otherwise maybe something like
> range_overflows(offset - guard, size - guard, end)?

My thought was that the user has no knowledge of the guard pages, nor
direct access to them [in theory], so they should not be penalised for
their placement, as far as is possible.

That we have to prevent wrap around and so restrict total placement to
[guard, end - guard] unfortunately belies our attempt to be
undetectable.

So the -EINVAL check should be against the full range, but we should
also do

if (offset < guard  || offset > gtt_end - guard)
	return -ENOSPC;

> Also bumping the alignment constraint for the OFFSET_FIXED case does
> seem a little scary? Is it not possible to just add the required
> padding around both ends of the provided offset?

Hmm. We don't need to bump the alignment, we just need to make sure the
guard is a multiple of the alignment (so that start + guard still meets
the alignment constraints).
-Chris