[Intel-gfx] [PATCH 3/3] drm/i915/gtt: ignore min_page_size for paging structures

Wed Jun 23 12:44:30 UTC 2021

On Wed, 2021-06-23 at 13:25 +0100, Matthew Auld wrote:
> On 23/06/2021 12:51, Thomas Hellström wrote:
> > 
> > On 6/23/21 1:26 PM, Matthew Auld wrote:
> > > The min_page_size is only needed for pages inserted into the GTT, and
> > > for our paging structures we only need at most 4K bytes, so simply
> > > ignore the min_page_size restrictions here, otherwise we might see some
> > > severe overallocation on some devices.
> > > 
> > > Signed-off-by: Matthew Auld <matthew.auld at intel.com>
> > > Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> > > ---
> > >   drivers/gpu/drm/i915/gt/intel_gtt.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > > index 084ea65d59c0..61e8a8c25374 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_gtt.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c
> > > @@ -16,7 +16,7 @@ struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz)
> > >   {
> > >       struct drm_i915_gem_object *obj;
> > > -    obj = i915_gem_object_create_lmem(vm->i915, sz, 0);
> > > +    obj = __i915_gem_object_create_lmem_with_ps(vm->i915, sz, sz, 0);
> > >       /*
> > >        * Ensure all paging structures for this vm share the same dma-resv
> > >        * object underneath, with the idea that one object_lock() will lock
> > 
> > I think for this one the new gt migration code might break, because
> > there we insert even PT pages into the GTT, so it might need a special
> > interface? Ram is looking at supporting larger GPU PTE sizes with that
> > code..
> 
> For DG1 at least we don't need this. But yeah we can always just pass
> along the page size when allocating the stash I guess, if we need 
> something special for migration?
> 
> But when we need to support huge PTEs for stuff other than DG1, then
> it's still a pile of work I assume, since we still need all the special
> PTE insertion routines specifically for insert_pte() which will differ
> wildly between generations, also each has quite different restrictions
> wrt min physical alignment of lmem, whether you can mix 64K/4K PTEs in
> the same 2M va range, whether 4K PTEs are even supported for lmem etc.
> 
> Not sure if it's simpler to go with mapping all of lmem upfront with the
> flat-ppGTT? Maybe that sidesteps some of these issues? At least for the
> physical alignment of paging structures that would no longer be a concern.

Yes, that might be the simplest way forward.

/Thomas

> 
> > 
> > /Thomas
> > 
> > 
> >
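
For illustration only (not part of the patch): a minimal, standalone sketch
of the overallocation the commit message describes. It assumes a device
whose min_page_size is 64K, which the thread does not state for any specific
platform, and models the allocator as simply rounding each request up to the
page size it was given. The helper names pt_backing_size() and
pt_overallocation() are hypothetical and do not exist in i915.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K  ((uint64_t)4096)
#define SZ_64K ((uint64_t)65536)	/* assumed min_page_size, for illustration */

/* Round sz up to a multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t sz, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (sz + align - 1) & ~(align - 1);
}

/*
 * Hypothetical model: bytes of lmem consumed by one paging structure of
 * size sz when the allocator enforces page_size granularity.
 */
static uint64_t pt_backing_size(uint64_t sz, uint64_t page_size)
{
	return round_up_pow2(sz, page_size);
}

/* Bytes wasted per paging structure if min_page_size is enforced. */
static uint64_t pt_overallocation(uint64_t sz, uint64_t min_page_size)
{
	return pt_backing_size(sz, min_page_size) - sz;
}
```

Under these assumptions, pt_backing_size(SZ_4K, SZ_64K) consumes a full 64K
for a 4K page table, whereas passing sz as the page size, as the patch does
with __i915_gem_object_create_lmem_with_ps(vm->i915, sz, sz, 0), keeps the
backing store at 4K.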