[Intel-gfx] [PATCH v4 2/3] drm/i915: refactor duplicate object vmap functions (reworked)
chris at chris-wilson.co.uk
Tue Feb 23 11:38:02 UTC 2016
On Tue, Feb 23, 2016 at 10:31:08AM +0000, Tvrtko Ursulin wrote:
> On 23/02/16 10:06, Chris Wilson wrote:
> >On Mon, Feb 22, 2016 at 04:06:39PM +0000, Tvrtko Ursulin wrote:
> >>[Cc Chris as the author of the idea.]
> >>On 22/02/16 15:18, Dave Gordon wrote:
> >>>This is essentially Chris Wilson's patch of a similar name, reworked on
> >>>top of Alex Dai's recent patch:
> >>> drm/i915: Add i915_gem_object_vmap to map GEM object to virtual space
> >>>Chris' original commentary said:
> >>> We now have two implementations for vmapping a whole object, one for
> >>> dma-buf and one for the ringbuffer. If we couple the vmapping into
> >>> the obj->pages lifetime, then we can reuse an obj->vmapping for both
> >>> and at the same time couple it into the shrinker.
> >>As a general concept my worry is that by implementing this approach
> >>we tie two unrelated concepts together.
> >>Shrinker is about backing storage (used/free pages in a machine),
> >>while vmap is about kernel address space. And then on 32-bit with
> >>its limited vmap space (128MiB, right?), it can become exhausted
> >>much sooner than the shrinker would be triggered. And we would rely
> >>on the shrinker running to free up address space as well now, right?
> >Yes, we use the shrinker to free address space.
> >>So unless I am missing something that doesn't fit well.
> >The opposite. Even more reason for the shrinker to be able to recover
> >vmap space on 32bit systems (for external users, not just ourselves).
> How? I don't see that a failed vmapping will trigger shrinking. What
> will prevent i915 from accumulating objects whose vmaps potentially
> stick around for too long?
I read what you wrote as implying the shrinker would be run (which is
what I hoped). I made the mistake of just thinking of the allocations
around that path, rather than alloc_vmap_area().
Indeed, we would need a new notifier, pretty much for the sole use of
32b. Grr, that will be a nuisance.
> >>But as a related question, I wonder why doesn't the LRC require the
> >>same !HAS_LLC approach when mapping as ring buffer does here?
> >We don't try to use stolen for LRC. The main difficulty lies in
> >deciding how to map the state object, stolen forces us to use an
> >ioremapping through the GTT and so only suitable for write-only
> >mappings. However, we could be using the per-context HWS, for which we
> >want a CPU accessible, direct pointer.
> I wasn't asking about stolen but the !HAS_LLC path. Even non-stolen
> ring buffers will be mapped via the aperture on !HAS_LLC platforms.
> That implies it is about cache coherency and we don't have the same
> treatment for the LRC page.
Oh, completely missed the point. Yes, I also wonder how the kmap() on
the context object works for Braswell without an explicit clflush of at
least the TAIL update between requests.
> Until your vma->iomap prototype which added the same uncached access
> to the LRC as well.
> So my question was, do we need this for cache considerations today,
> irrespective of caching the pointer in the VMA.
Yes, I think so, but it's hard to say as Braswell breaks in so many
different ways when the littlest of stress is applied.
Chris Wilson, Intel Open Source Technology Centre