[Intel-gfx] [PATCH v3 3/8] drm/i915: Add a new "remapped" gtt_view

Fri Oct 5 18:42:30 UTC 2018

On Mon, Oct 01, 2018 at 04:48:21PM +0100, Tvrtko Ursulin wrote:
> 
> On 01/10/2018 16:37, Chris Wilson wrote:
> > Quoting Ville Syrjälä (2018-10-01 16:27:43)
> >> On Mon, Oct 01, 2018 at 04:12:09PM +0100, Chris Wilson wrote:
> >>> Quoting Ville Syrjälä (2018-10-01 16:03:30)
> >>>> On Wed, Sep 26, 2018 at 08:50:25AM +0100, Tvrtko Ursulin wrote:
> >>>>>
> >>>>> On 25/09/2018 20:37, Ville Syrjala wrote:
> >>>>> One more thing, do you really need random access for this
> >>>>> transformation? Or you could walk the sg list as it is? Just if you hit
> >>>>> a too long chunk you need to copy a trimmed version over and know where
> >>>>> to continue for the next row. If doable it would be better than having
> >>>>> to kvmalloc_array.
> >>>>
> >>>> I think Chris suggested just using i915_gem_object_get_dma_address()
> >>>> here. But I'm not sure why we're not using it for rotate_pages()
> >>>> as well.
> >>>
> >>> Tvrtko is opposed to populating the obj->mm.pages cache with no defined
> >>> release point. I say the mempressure and shrinker should to the right
> >>> thing, but it's a big if.
> >>
> >> OK.
> >>
> >> Well, looks to me like i915_gem_object_get_dma_address() is the
> >> only convenient looking thing for iterating the pages without
> >> arowning the code in irrelevant details about sgs and whatnot.
> >> I suppose it should be possible to write some helpers that avoid
> >> all that and don't need the temp array, but I'm not really
> >> motivated enough to do that myself.
> > 
> > Keep it simple and use get_dma_address(). We can find ways to throw away
> > the cache later if need be.
> 
> I'd do it straight away. I think cache for a large framebuffer, the kind 
> which needs remapping could be quite big! Even the more fragmented 
> memory the bigger the cache, and so if it sticks around pointlessly for 
> the lifetime of the framebuffer it is a double whammy.

The tree is indexed with the ggtt offset so memory fragmentation
shouldn't matter I think. Or did I totally miss something?

The relative overhead should be highest for a single page object
(576/4096 = ~15%). For big objects it should be something around
.2% AFAICS.

The shrinker can throw the cache out for non-pinned fbs. Looks
like it does consider all fbs active so it tries not to kick
them out as the first thing though. For pinned fbs the cache
would remain allocated forever however.

Keeping the cache around should be great for panning within
large remapped fbs. Although I suppose that panning isn't a
very common use case these days

-- 
Ville Syrjälä
Intel