[RFC PATCH 0/3] Propose new struct drm_mem_region
Daniel Vetter
daniel at ffwll.ch
Wed Jul 31 08:05:32 UTC 2019
On Wed, Jul 31, 2019 at 8:54 AM Koenig, Christian
<Christian.Koenig at amd.com> wrote:
>
> Am 31.07.19 um 02:51 schrieb Brian Welty:
> [SNIP]
> >>>>>> +/*
> >>>>>> + * Memory types for drm_mem_region
> >>>>>> + */
> >>>>> #define DRM_MEM_SWAP ?
> >>>> btw what did you have in mind for this? Since we use shmem we kinda don't
> >>>> know whether the BO is actually swapped out or not, at least on the i915
> >>>> side. So this would be more NOT_CURRENTLY_PINNED_AND_POSSIBLY_SWAPPED_OUT.
> >>> Yeah, the problem is not everybody can use shmem. For some use cases you
> >>> have to use memory allocated through dma_alloc_coherent().
> >>>
> >>> So to be able to swap this out you need a separate domain to copy it
> >>> from whatever is backing it currently to shmem.
> >>>
> >>> So we essentially have:
> >>> DRM_MEM_SYS_SWAPABLE
> >>> DRM_MEM_SYS_NOT_GPU_MAPPED
> >>> DRM_MEM_SYS_GPU_MAPPED
> >>>
> >>> Or something like that.
> >> Yeah i915-gem is similar. We oportunistically keep the pages pinned
> >> sometimes even if not currently mapped into the (what ttm calls) TT.
> >> So I think these three for system memory make sense for us too. I
> >> think that's similar (at least in spirit) to the dma_alloc cache you
> >> have going on. Mabye instead of the somewhat cumbersome NOT_GPU_MAPPED
> >> we could have something like PINNED or so. Although it's not
> >> permanently pinned, so maybe that's confusing too.
> >>
> > Okay, I see now I was far off the mark with what I thought TTM_PL_SYSTEM
> > was. The discussion helped clear up several bits of confusion on my part.
> > From proposed names, I find MAPPED and PINNED slightly confusing.
> > In terms of backing store description, maybe these are a little better:
> > DRM_MEM_SYS_UNTRANSLATED (TTM_PL_SYSTEM)
> > DRM_MEM_SYS_TRANSLATED (TTM_PL_TT or i915's SYSTEM)
>
> That's still not correct. Let me describe what each of the tree stands for:
>
> 1. The backing store is a shmem file so the individual pages are
> swapable by the core OS.
> 2. The backing store is allocate GPU accessible but not currently in use
> by the GPU.
> 3. The backing store is currently in use by the GPU.
>
> For i915 all three of those are basically the same and you only need to
> worry about it much.
We do pretty much have these three states for i915 gem bo too. Of
course none have a reasonable upper limit since it's all shared
memory. The hard limit would be system memory + swap for 1, and only
system memory for 2 and 3.
> But for other drivers that's certainly not true and we need this
> distinction of the backing store of an object.
>
> I'm just not sure how we would handle that for cgroups. From experience
> we certainly want a limit over all 3, but you usually also want to limit
> 3 alone.
To avoid lolz against the shrinker I think you also want to limit 2+3.
Afaiui ttm does that with the global limit, to avoid driving the
system against the wall.
> And you also want to limit the amount of bytes moved between those
> states because each state transition might have a bandwidth cost
> associated with it.
>
> > Are these allowed to be both overlapping? Or non-overlapping (partitioned)?
> > Per Christian's point about removing .start, seems it doesn't need to
> > matter.
>
> You should probably completely drop the idea of this being regions.
>
> And we should also rename them to something like drm_mem_domains to make
> that clear.
+1 on domains. Some of these domains might be physically contiguous
regions, but some clearly arent.
> > Whatever we define for these sub-types, does it make sense for SYSTEM and
> > VRAM to each have them defined?
>
> No, absolutely not. VRAM as well as other private memory types are
> completely driver specific.
>
> > I'm unclear how DRM_MEM_SWAP (or DRM_MEM_SYS_SWAPABLE) would get
> > configured by driver... this is a fixed size partition of host memory?
> > Or it is a kind of dummy memory region just for swap implementation?
>
> #1 and #2 in my example above should probably not be configured by the
> driver itself.
>
> And yes seeing those as special for state handling sounds like the
> correct approach to me.
Do we have any hw that wants custom versions of 3? The only hw designs
I know of either have one shared translation table (but only one per
device, so having just 1 domain is good enough). Or TT mappings are in
the per-process pagetables, and then you're defacto unlimited (and
again one domain is good enough). So roughly:
- 1&2 global accross all drivers. 1 and 2 are disjoint (i.e. a bo is
only account to one of them, never both).
- 3 would be a subgroup of 2, and per device. A bo in group 3 is also
always in group 2.
For VRAM and VRAM-similar things (like stolen system memory, or if you
have VRAM that's somehow split up like with a dual gpu perhaps) I
agree the driver needs to register that. And we just have some
standard flags indicating that "this is kinda like VRAM".
-Daniel
>
> Regards,
> Christian.
>
> >>>>> TTM was clearly missing that resulting in a whole bunch of extra
> >>>>> handling and rather complicated handling.
> >>>>>
> >>>>>> +#define DRM_MEM_SYSTEM 0
> >>>>>> +#define DRM_MEM_STOLEN 1
> >>>>> I think we need a better naming for that.
> >>>>>
> >>>>> STOLEN sounds way to much like stolen VRAM for integrated GPUs, but at
> >>>>> least for TTM this is the system memory currently GPU accessible.
> >>>> Yup this is wrong, for i915 we use this as stolen, for ttm it's the gpu
> >>>> translation table window into system memory. Not the same thing at all.
> >>> Thought so. The closest I have in mind is GTT, but everything else works
> >>> as well.
> >> Would your GPU_MAPPED above work for TT? I think we'll also need
> >> STOLEN, I'm even hearing noises that there's going to be stolen for
> >> discrete vram for us ... Also if we expand I guess we need to teach
> >> ttm to cope with more, or maybe treat the DRM one as some kind of
> >> sub-flavour.
> > Daniel, maybe what i915 calls stolen could just be DRM_MEM_RESERVED or
> > DRM_MEM_PRIV. Or maybe can argue it falls into UNTRANSLATED type that
> > I suggested above, I'm not sure.
> >
> > -Brian
> >
> >
> >> -Daniel
> >>
> >>> Christian.
> >>>
> >>>> -Daniel
> >>>>
> >>>>> Thanks for looking into that,
> >>>>> Christian.
> >>>>>
> >>>>> Am 30.07.19 um 02:32 schrieb Brian Welty:
> >>>>>> [ By request, resending to include amd-gfx + intel-gfx. Since resending,
> >>>>>> I fixed the nit with ordering of header includes that Sam noted. ]
> >>>>>>
> >>>>>> This RFC series is first implementation of some ideas expressed
> >>>>>> earlier on dri-devel [1].
> >>>>>>
> >>>>>> Some of the goals (open for much debate) are:
> >>>>>> - Create common base structure (subclass) for memory regions (patch #1)
> >>>>>> - Create common memory region types (patch #2)
> >>>>>> - Create common set of memory_region function callbacks (based on
> >>>>>> ttm_mem_type_manager_funcs and intel_memory_regions_ops)
> >>>>>> - Create common helpers that operate on drm_mem_region to be leveraged
> >>>>>> by both TTM drivers and i915, reducing code duplication
> >>>>>> - Above might start with refactoring ttm_bo_manager.c as these are
> >>>>>> helpers for using drm_mm's range allocator and could be made to
> >>>>>> operate on DRM structures instead of TTM ones.
> >>>>>> - Larger goal might be to make LRU management of GEM objects common, and
> >>>>>> migrate those fields into drm_mem_region and drm_gem_object strucures.
> >>>>>>
> >>>>>> Patches 1-2 implement the proposed struct drm_mem_region and adds
> >>>>>> associated common set of definitions for memory region type.
> >>>>>>
> >>>>>> Patch #3 is update to i915 and is based upon another series which is
> >>>>>> in progress to add vram support to i915 [2].
> >>>>>>
> >>>>>> [1] https://lists.freedesktop.org/archives/dri-devel/2019-June/224501.html
> >>>>>> [2] https://lists.freedesktop.org/archives/intel-gfx/2019-June/203649.html
> >>>>>>
> >>>>>> Brian Welty (3):
> >>>>>> drm: introduce new struct drm_mem_region
> >>>>>> drm: Introduce DRM_MEM defines for specifying type of drm_mem_region
> >>>>>> drm/i915: Update intel_memory_region to use nested drm_mem_region
> >>>>>>
> >>>>>> drivers/gpu/drm/i915/gem/i915_gem_object.c | 2 +-
> >>>>>> drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 2 +-
> >>>>>> drivers/gpu/drm/i915/i915_gem_gtt.c | 10 ++---
> >>>>>> drivers/gpu/drm/i915/i915_gpu_error.c | 2 +-
> >>>>>> drivers/gpu/drm/i915/i915_query.c | 2 +-
> >>>>>> drivers/gpu/drm/i915/intel_memory_region.c | 10 +++--
> >>>>>> drivers/gpu/drm/i915/intel_memory_region.h | 19 +++------
> >>>>>> drivers/gpu/drm/i915/intel_region_lmem.c | 26 ++++++-------
> >>>>>> .../drm/i915/selftests/intel_memory_region.c | 8 ++--
> >>>>>> drivers/gpu/drm/ttm/ttm_bo.c | 34 +++++++++-------
> >>>>>> drivers/gpu/drm/ttm/ttm_bo_manager.c | 14 +++----
> >>>>>> drivers/gpu/drm/ttm/ttm_bo_util.c | 11 +++---
> >>>>>> drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c | 8 ++--
> >>>>>> drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 4 +-
> >>>>>> include/drm/drm_mm.h | 39 ++++++++++++++++++-
> >>>>>> include/drm/ttm/ttm_bo_api.h | 2 +-
> >>>>>> include/drm/ttm/ttm_bo_driver.h | 16 ++++----
> >>>>>> include/drm/ttm/ttm_placement.h | 8 ++--
> >>>>>> 18 files changed, 124 insertions(+), 93 deletions(-)
> >>>>>>
> >>
>
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
More information about the amd-gfx
mailing list