[RFC PATCH 0/3] Propose new struct drm_mem_region

Brian Welty brian.welty at intel.com
Wed Jul 31 01:19:50 UTC 2019


On 7/30/2019 2:34 AM, Daniel Vetter wrote:
> On Tue, Jul 30, 2019 at 08:45:57AM +0000, Koenig, Christian wrote:
>> Yeah, that looks like a good start. Just a couple of random design 
>> comments/requirements.
>>
>> First of all please restructure the changes so that you more or less 
>> have the following:
>> 1. Adding of the new structures and functionality without any change to 
>> existing code.
>> 2. Replacing the existing functionality in TTM and all of its drivers.
>> 3. Replacing the existing functionality in i915.
>>
>> This should make it much easier to review the new functionality when it 
>> is not mixed with existing TTM stuff.

Sure, understood.  But I hope it's fair that I wouldn't be updating all
drivers in an RFC series until there is a bit of clarity/agreement on any
path forward.  But I can include amdgpu patch next time.

>>
>>
>> Second please completely drop the concept of gpu_offset or start of the 
>> memory region like here:
>>> drm_printf(p, "    gpu_offset: 0x%08llX\n", man->region.start);
>> At least on AMD hardware we have the following address spaces which are 
>> sometimes even partially overlapping: VM, MC, SYSTEM, FB, AGP, XGMI, bus 
>> addresses and physical addresses.
>>
>> Pushing a concept of a general GPU address space into the memory 
>> management was a rather bad design mistake in TTM and we should not 
>> repeat that here.
>>
>> A region should only consists of a size in bytes and (internal to the 
>> region manager) allocations in that region.

Got it. I was trying to include fields that seemed relevant to a base
structure and could then optionally be leveraged at the choice of device
driver.  But I see your point.

>>
>>
>> Third please don't use any CPU or architecture specific types in any 
>> data structures:
>>> +struct drm_mem_region {
>>> +	resource_size_t start; /* within GPU physical address space */
>>> +	resource_size_t io_start; /* BAR address (CPU accessible) */
>>> +	resource_size_t size;
>>
>> I knew that resource_size is mostly 64bit on modern architectures, but 
>> dGPUs are completely separate to the architecture and we always need 
>> 64bits here at least for AMD hardware.
>>
>> So this should either be always uint64_t, or something like 
>> gpu_resource_size which depends on what the compiled in drivers require 
>> if we really need that.
>>
>> And by the way: Please always use bytes for things like sizes and not 
>> number of pages, cause page size is again CPU/architecture specific and 
>> GPU drivers don't necessary care about that.

Makes sense,  will fix.

Hmm,  I did hope that at least the DRM cgroup controller could leverage
struct page_counter.  It encapsulates nicely much of the fields for 
managing a memory limit.  But well, this is off topic....

>>
>>
>> And here also a few direct comments on the code:
>>> +	union {
>>> +		struct drm_mm *mm;
>>> +		/* FIXME (for i915): struct drm_buddy_mm *buddy_mm; */
>>> +		void *priv;
>>> +	};
>> Maybe just always use void *mm here.
> 
> I'd say lets drop this outright, and handle private data by embedding this
> structure in the right place. That's how we handle everything in drm now
> as much as possible, and it's so much cleaner. I think ttm still loves
> priv pointers a bit too much in some places.

Okay, I'll drop it until I might be able to prove this might be useful later.

>
>>> +	spinlock_t move_lock;
>>> +	struct dma_fence *move;
>>
>> That is TTM specific and I'm not sure if we want it in the common memory 
>> management handling.
>>
>> If we want that here we should probably replace the lock with some rcu 
>> and atomic fence pointer exchange first.
> 
> Yeah  not sure we want any of these details in this shared structure
> either.
> 

Thanks for the feedback. I can remove it too.
I was unsure if might be a case for having it in future.

Well, struct drm_mem_region will be quite small then if it only has a
size and type field.
Hardly seems worth introducing a new structure if these are the only fields.
I know we thought it might benefit cgroups controller,  but I still hope to
find earlier purpose it could serve.

-Brian


[snip]

>>
>> Am 30.07.19 um 02:32 schrieb Brian Welty:
>>> [ By request, resending to include amd-gfx + intel-gfx.  Since resending,
>>>    I fixed the nit with ordering of header includes that Sam noted. ]
>>>
>>> This RFC series is first implementation of some ideas expressed
>>> earlier on dri-devel [1].
>>>
>>> Some of the goals (open for much debate) are:
>>>    - Create common base structure (subclass) for memory regions (patch #1)
>>>    - Create common memory region types (patch #2)
>>>    - Create common set of memory_region function callbacks (based on
>>>      ttm_mem_type_manager_funcs and intel_memory_regions_ops)
>>>    - Create common helpers that operate on drm_mem_region to be leveraged
>>>      by both TTM drivers and i915, reducing code duplication
>>>    - Above might start with refactoring ttm_bo_manager.c as these are
>>>      helpers for using drm_mm's range allocator and could be made to
>>>      operate on DRM structures instead of TTM ones.
>>>    - Larger goal might be to make LRU management of GEM objects common, and
>>>      migrate those fields into drm_mem_region and drm_gem_object strucures.
> 
> I'm not sure how much of all that we really want in a drm_mem_region ...
> Otherwise we just reimplement the same midlayer we have already, but with
> a drm_ instead of ttm_ prefix. And that seems a bit silly.
> -Daniel
> 
>>>
>>> Patches 1-2 implement the proposed struct drm_mem_region and adds
>>> associated common set of definitions for memory region type.
>>>
>>> Patch #3 is update to i915 and is based upon another series which is
>>> in progress to add vram support to i915 [2].
>>>
>>> [1] https://lists.freedesktop.org/archives/dri-devel/2019-June/224501.html
>>> [2] https://lists.freedesktop.org/archives/intel-gfx/2019-June/203649.html
>>>
>>> Brian Welty (3):
>>>    drm: introduce new struct drm_mem_region
>>>    drm: Introduce DRM_MEM defines for specifying type of drm_mem_region
>>>    drm/i915: Update intel_memory_region to use nested drm_mem_region
>>>
>>>   drivers/gpu/drm/i915/gem/i915_gem_object.c    |  2 +-
>>>   drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  2 +-
>>>   drivers/gpu/drm/i915/i915_gem_gtt.c           | 10 ++---
>>>   drivers/gpu/drm/i915/i915_gpu_error.c         |  2 +-
>>>   drivers/gpu/drm/i915/i915_query.c             |  2 +-
>>>   drivers/gpu/drm/i915/intel_memory_region.c    | 10 +++--
>>>   drivers/gpu/drm/i915/intel_memory_region.h    | 19 +++------
>>>   drivers/gpu/drm/i915/intel_region_lmem.c      | 26 ++++++-------
>>>   .../drm/i915/selftests/intel_memory_region.c  |  8 ++--
>>>   drivers/gpu/drm/ttm/ttm_bo.c                  | 34 +++++++++-------
>>>   drivers/gpu/drm/ttm/ttm_bo_manager.c          | 14 +++----
>>>   drivers/gpu/drm/ttm/ttm_bo_util.c             | 11 +++---
>>>   drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c |  8 ++--
>>>   drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c    |  4 +-
>>>   include/drm/drm_mm.h                          | 39 ++++++++++++++++++-
>>>   include/drm/ttm/ttm_bo_api.h                  |  2 +-
>>>   include/drm/ttm/ttm_bo_driver.h               | 16 ++++----
>>>   include/drm/ttm/ttm_placement.h               |  8 ++--
>>>   18 files changed, 124 insertions(+), 93 deletions(-)
>>>
>>
> 


More information about the amd-gfx mailing list