[RFC PATCH 0/3] Propose new struct drm_mem_region
Daniel Vetter
daniel at ffwll.ch
Wed Jul 31 08:33:15 UTC 2019
On Wed, Jul 31, 2019 at 10:25:15AM +0200, Christian König wrote:
> Am 31.07.19 um 10:05 schrieb Daniel Vetter:
> > [SNIP]
> > > > Okay, I see now I was far off the mark with what I thought TTM_PL_SYSTEM
> > > > was. The discussion helped clear up several bits of confusion on my part.
> > > > From proposed names, I find MAPPED and PINNED slightly confusing.
> > > > In terms of backing store description, maybe these are a little better:
> > > > DRM_MEM_SYS_UNTRANSLATED (TTM_PL_SYSTEM)
> > > > DRM_MEM_SYS_TRANSLATED (TTM_PL_TT or i915's SYSTEM)
> > > That's still not correct. Let me describe what each of the tree stands for:
> > >
> > > 1. The backing store is a shmem file so the individual pages are
> > > swapable by the core OS.
> > > 2. The backing store is allocate GPU accessible but not currently in use
> > > by the GPU.
> > > 3. The backing store is currently in use by the GPU.
> > >
> > > For i915 all three of those are basically the same and you only need to
> > > worry about it much.
> > We do pretty much have these three states for i915 gem bo too. Of
> > course none have a reasonable upper limit since it's all shared
> > memory. The hard limit would be system memory + swap for 1, and only
> > system memory for 2 and 3.
>
> Good to know.
>
> > > But for other drivers that's certainly not true and we need this
> > > distinction of the backing store of an object.
> > >
> > > I'm just not sure how we would handle that for cgroups. From experience
> > > we certainly want a limit over all 3, but you usually also want to limit
> > > 3 alone.
> > To avoid lolz against the shrinker I think you also want to limit 2+3.
> > Afaiui ttm does that with the global limit, to avoid driving the
> > system against the wall.
>
> Yes, exactly. But I think you only need that when 2+3 are not backed by
> pinning shmem. E.g. for i915 I'm not sure you want this limitation.
Maybe I need to share how bad exactly the i915 driver is fighting its own
shrinker at the next conference, over some good drinks ... Just becaue we
use shmem directly doesn't make this easier really at all, we're still
pinning memory that the core mm can't evict anymore.
> > [SNIP]
> > > #1 and #2 in my example above should probably not be configured by the
> > > driver itself.
> > >
> > > And yes seeing those as special for state handling sounds like the
> > > correct approach to me.
> > Do we have any hw that wants custom versions of 3?
>
> I can't think of any. If a driver needs something special for 3 then that
> should be domain VRAM or domain PRIV.
>
> As far as I can see with the proposed separation we can even handle AGP.
>
> > The only hw designs
> > I know of either have one shared translation table (but only one per
> > device, so having just 1 domain is good enough). Or TT mappings are in
> > the per-process pagetables, and then you're defacto unlimited (and
> > again one domain is good enough). So roughly:
> >
> > - 1&2 global accross all drivers. 1 and 2 are disjoint (i.e. a bo is
> > only account to one of them, never both).
> > - 3 would be a subgroup of 2, and per device. A bo in group 3 is also
> > always in group 2.
>
> Yes, that sounds like a good description certainly like the right why to see
> it.
>
> > For VRAM and VRAM-similar things (like stolen system memory, or if you
> > have VRAM that's somehow split up like with a dual gpu perhaps) I
> > agree the driver needs to register that. And we just have some
> > standard flags indicating that "this is kinda like VRAM".
>
> Yeah, agree totally as well.
Cheers, Daniel
>
> Christian.
>
> > -Daniel
> >
> > > Regards,
> > > Christian.
> > >
> > > > > > > > TTM was clearly missing that resulting in a whole bunch of extra
> > > > > > > > handling and rather complicated handling.
> > > > > > > >
> > > > > > > > > +#define DRM_MEM_SYSTEM 0
> > > > > > > > > +#define DRM_MEM_STOLEN 1
> > > > > > > > I think we need a better naming for that.
> > > > > > > >
> > > > > > > > STOLEN sounds way to much like stolen VRAM for integrated GPUs, but at
> > > > > > > > least for TTM this is the system memory currently GPU accessible.
> > > > > > > Yup this is wrong, for i915 we use this as stolen, for ttm it's the gpu
> > > > > > > translation table window into system memory. Not the same thing at all.
> > > > > > Thought so. The closest I have in mind is GTT, but everything else works
> > > > > > as well.
> > > > > Would your GPU_MAPPED above work for TT? I think we'll also need
> > > > > STOLEN, I'm even hearing noises that there's going to be stolen for
> > > > > discrete vram for us ... Also if we expand I guess we need to teach
> > > > > ttm to cope with more, or maybe treat the DRM one as some kind of
> > > > > sub-flavour.
> > > > Daniel, maybe what i915 calls stolen could just be DRM_MEM_RESERVED or
> > > > DRM_MEM_PRIV. Or maybe can argue it falls into UNTRANSLATED type that
> > > > I suggested above, I'm not sure.
> > > >
> > > > -Brian
> > > >
> > > >
> > > > > -Daniel
> > > > >
> > > > > > Christian.
> > > > > >
> > > > > > > -Daniel
> > > > > > >
> > > > > > > > Thanks for looking into that,
> > > > > > > > Christian.
> > > > > > > >
> > > > > > > > Am 30.07.19 um 02:32 schrieb Brian Welty:
> > > > > > > > > [ By request, resending to include amd-gfx + intel-gfx. Since resending,
> > > > > > > > > I fixed the nit with ordering of header includes that Sam noted. ]
> > > > > > > > >
> > > > > > > > > This RFC series is first implementation of some ideas expressed
> > > > > > > > > earlier on dri-devel [1].
> > > > > > > > >
> > > > > > > > > Some of the goals (open for much debate) are:
> > > > > > > > > - Create common base structure (subclass) for memory regions (patch #1)
> > > > > > > > > - Create common memory region types (patch #2)
> > > > > > > > > - Create common set of memory_region function callbacks (based on
> > > > > > > > > ttm_mem_type_manager_funcs and intel_memory_regions_ops)
> > > > > > > > > - Create common helpers that operate on drm_mem_region to be leveraged
> > > > > > > > > by both TTM drivers and i915, reducing code duplication
> > > > > > > > > - Above might start with refactoring ttm_bo_manager.c as these are
> > > > > > > > > helpers for using drm_mm's range allocator and could be made to
> > > > > > > > > operate on DRM structures instead of TTM ones.
> > > > > > > > > - Larger goal might be to make LRU management of GEM objects common, and
> > > > > > > > > migrate those fields into drm_mem_region and drm_gem_object strucures.
> > > > > > > > >
> > > > > > > > > Patches 1-2 implement the proposed struct drm_mem_region and adds
> > > > > > > > > associated common set of definitions for memory region type.
> > > > > > > > >
> > > > > > > > > Patch #3 is update to i915 and is based upon another series which is
> > > > > > > > > in progress to add vram support to i915 [2].
> > > > > > > > >
> > > > > > > > > [1] https://lists.freedesktop.org/archives/dri-devel/2019-June/224501.html
> > > > > > > > > [2] https://lists.freedesktop.org/archives/intel-gfx/2019-June/203649.html
> > > > > > > > >
> > > > > > > > > Brian Welty (3):
> > > > > > > > > drm: introduce new struct drm_mem_region
> > > > > > > > > drm: Introduce DRM_MEM defines for specifying type of drm_mem_region
> > > > > > > > > drm/i915: Update intel_memory_region to use nested drm_mem_region
> > > > > > > > >
> > > > > > > > > drivers/gpu/drm/i915/gem/i915_gem_object.c | 2 +-
> > > > > > > > > drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 2 +-
> > > > > > > > > drivers/gpu/drm/i915/i915_gem_gtt.c | 10 ++---
> > > > > > > > > drivers/gpu/drm/i915/i915_gpu_error.c | 2 +-
> > > > > > > > > drivers/gpu/drm/i915/i915_query.c | 2 +-
> > > > > > > > > drivers/gpu/drm/i915/intel_memory_region.c | 10 +++--
> > > > > > > > > drivers/gpu/drm/i915/intel_memory_region.h | 19 +++------
> > > > > > > > > drivers/gpu/drm/i915/intel_region_lmem.c | 26 ++++++-------
> > > > > > > > > .../drm/i915/selftests/intel_memory_region.c | 8 ++--
> > > > > > > > > drivers/gpu/drm/ttm/ttm_bo.c | 34 +++++++++-------
> > > > > > > > > drivers/gpu/drm/ttm/ttm_bo_manager.c | 14 +++----
> > > > > > > > > drivers/gpu/drm/ttm/ttm_bo_util.c | 11 +++---
> > > > > > > > > drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c | 8 ++--
> > > > > > > > > drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 4 +-
> > > > > > > > > include/drm/drm_mm.h | 39 ++++++++++++++++++-
> > > > > > > > > include/drm/ttm/ttm_bo_api.h | 2 +-
> > > > > > > > > include/drm/ttm/ttm_bo_driver.h | 16 ++++----
> > > > > > > > > include/drm/ttm/ttm_placement.h | 8 ++--
> > > > > > > > > 18 files changed, 124 insertions(+), 93 deletions(-)
> > > > > > > > >
> >
>
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
More information about the amd-gfx
mailing list