[PATCH 1/5] drm/xe/svm: Remap and provide memmap backing for GPU vram

Zeng, Oak oak.zeng at intel.com
Fri Mar 15 23:11:31 UTC 2024



> -----Original Message-----
> From: Welty, Brian <brian.welty at intel.com>
> Sent: Friday, March 15, 2024 2:05 PM
> To: Zeng, Oak <oak.zeng at intel.com>; intel-xe at lists.freedesktop.org
> Cc: Hellstrom, Thomas <thomas.hellstrom at intel.com>; Brost, Matthew
> <matthew.brost at intel.com>; airlied at gmail.com; Ghimiray, Himal Prasad
> <himal.prasad.ghimiray at intel.com>
> Subject: Re: [PATCH 1/5] drm/xe/svm: Remap and provide memmap backing for
> GPU vram
> 
> 
> On 3/14/2024 8:16 PM, Zeng, Oak wrote:
> [...]
> >>>> diff --git a/drivers/gpu/drm/xe/xe_mmio.c b/drivers/gpu/drm/xe/xe_mmio.c
> >>>> index e3db3a178760..0d795394bc4c 100644
> >>>> --- a/drivers/gpu/drm/xe/xe_mmio.c
> >>>> +++ b/drivers/gpu/drm/xe/xe_mmio.c
> >>>> @@ -22,6 +22,7 @@
> >>>>    #include "xe_module.h"
> >>>>    #include "xe_sriov.h"
> >>>>    #include "xe_tile.h"
> >>>> +#include "xe_svm.h"
> >>>>
> >>>>    #define XEHP_MTCFG_ADDR		XE_REG(0x101800)
> >>>>    #define TILE_COUNT		REG_GENMASK(15, 8)
> >>>> @@ -286,6 +287,7 @@ int xe_mmio_probe_vram(struct xe_device *xe)
> >>>>    		}
> >>>>
> >>>>    		io_size -= min_t(u64, tile_size, io_size);
> >>>> +		xe_svm_devm_add(tile, &tile->mem.vram);
> >>>
> >>> I think slightly more appropriate call site for this might be
> >>> xe_tile_init_noalloc(), as that function states it is preparing tile
> >>> for VRAM allocations.
> >>> Also, I mention it because we might like the flexibility in future to
> >>> call this once for xe_device.mem.vram, instead of calling it for each
> >>> tile, and it is easier to handle that in xe_tile.c than in xe_mmio.c.
> >>
> >> Good point. Will follow.
> >
> > Sorry, with my comment below, do you still want to call it in xe_tile_init_noalloc?
> >
> > For UMA, we only need to call it once. If you do it in init-noalloc, you
> > would call it multiple times. Right?
> >
> > Oak
> >
> 
> Oh, I hoped you were still going to move it.
> I prefer it in xe_tile.c instead of here.  But feel free to ask
> maintainers about it.
> I think the UMA case would anyway add a conditional to disable these calls
> wherever they are placed, in favor of calling against xe_device.mem.vram.
> But this is pretty minor and can revisit later too.

Let me move it to xe_tile.c then... we can make whatever changes are needed once the UMA design is settled.
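For reference, the relocated call might look roughly like this in xe_tile.c. This is only a hedged sketch: the exact position inside xe_tile_init_noalloc and the surrounding function body are assumptions, not the actual code.

```c
/*
 * Hypothetical sketch of the relocated call site in xe_tile.c.
 * The surrounding function body is illustrative only.
 */
int xe_tile_init_noalloc(struct xe_tile *tile)
{
	int err;

	err = tile_ttm_mgr_init(tile);
	if (err)
		return err;

	/*
	 * Remap this tile's VRAM and create the memmap (struct page)
	 * backing here, now that the tile is prepared for VRAM
	 * allocations. For a future UMA layout this call could instead
	 * be made once against xe_device.mem.vram, behind a conditional.
	 */
	err = xe_svm_devm_add(tile, &tile->mem.vram);

	return err;
}
```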

> 
> I commented down below, to keep to one email.
> 
> >>>
> >>> Related comment below.
> >>>
> >>>
> >>>>    	}
> >>>>
> >>>>    	xe->mem.vram.actual_physical_size = total_size;
> >>>> @@ -354,10 +356,16 @@ void xe_mmio_probe_tiles(struct xe_device *xe)
> >>>>    static void mmio_fini(struct drm_device *drm, void *arg)
> >>>>    {
> >>>>    	struct xe_device *xe = arg;
> >>>> +    struct xe_tile *tile;
> >>>> +    u8 id;
> >>>>
> >>>>    	pci_iounmap(to_pci_dev(xe->drm.dev), xe->mmio.regs);
> >>>>    	if (xe->mem.vram.mapping)
> >>>>    		iounmap(xe->mem.vram.mapping);
> >>>> +
> >>>> +	for_each_tile(tile, xe, id)
> >>>> +		xe_svm_devm_remove(xe, &tile->mem.vram);
> >>>> +
> >>>>    }
> >>>>
> [...]
> >>> b/drivers/gpu/drm/xe/xe_svm_devmem.c
> >>>> new file mode 100644
> >>>> index 000000000000..63b7a1961cc6
> >>>> --- /dev/null
> >>>> +++ b/drivers/gpu/drm/xe/xe_svm_devmem.c
> >>>> @@ -0,0 +1,91 @@
> >>>> +// SPDX-License-Identifier: MIT
> >>>> +/*
> >>>> + * Copyright © 2023 Intel Corporation
> >>>> + */
> >>>> +
> >>>> +#include <linux/mm_types.h>
> >>>> +#include <linux/sched/mm.h>
> >>>> +
> >>>> +#include "xe_device_types.h"
> >>>> +#include "xe_trace.h"
> >>>> +#include "xe_svm.h"
> >>>> +
> >>>> +
> >>>> +static vm_fault_t xe_devm_migrate_to_ram(struct vm_fault *vmf)
> >>>> +{
> >>>> +	return 0;
> >>>> +}
> >>>> +
> >>>> +static void xe_devm_page_free(struct page *page)
> >>>> +{
> >>>> +}
> >>>> +
> >>>> +static const struct dev_pagemap_ops xe_devm_pagemap_ops = {
> >>>> +	.page_free = xe_devm_page_free,
> >>>> +	.migrate_to_ram = xe_devm_migrate_to_ram,
> >>>> +};
> >>>> +
> >>>> +/**
> >>>> + * xe_svm_devm_add: Remap and provide memmap backing for device memory
> >>>
> >>> Do we really need 'svm' in function name?
> >>
> >> Good point. I will remove svm.
> >>>
> >>>> + * @tile: tile that the memory region belongs to
> >>>
> >>> We might like flexibility in future to call this once for
> >>> xe_device.mem.vram, instead of calling for each tile.
> >>> So can we remove the tile argument, and just pass the xe_device pointer
> >>> and tile->id ?
> >>
> >> This is interesting.
> >>
> >> First of all, I programmed it wrong below:
> >> mr->pagemap.owner = tile->xe->drm.dev;
> >>
> >> This should be: mr->pagemap.owner = tile for a NUMA vram system
> Oh, okay.  Glad we caught that.
> 
> Still, wouldn't using the 'mr' pointer as the owner be sufficient, and
> then it is common for both cases?
> Avoiding the strong association with 'tile' is nice if possible.
> 
> But I haven't studied rest of code and how pagemap.owner is used, so
> maybe you actually need the tile pointer for something.

Using mr as the owner would be more unified. But when I looked further, for now we still need to set the owner to the device, so I take back my comment about setting the owner to tile.

The reason is that we also need to set the owner when calling hmm_range_fault; see patch 5. In patch 5 we don't have the mr or tile, only the vma and vm. And from the vm we can only get the device, not the tile or mr...

It looks like the xe_vm is designed per device, not per tile. I wouldn't change that in this series. Making the vm per tile is doable, but the main purpose of this series is to prove the system allocator concept on PVC and position it for new platforms, which might have a UMA architecture between tiles. So optimizations such as migration between tiles are low priority.

So for now, let's go back to the device for the owner.
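To illustrate the constraint being discussed (a hedged sketch; field names other than pagemap.owner and dev_private_owner are assumptions): HMM only hands device-private pages back to the driver when the owner recorded at devm_memremap_pages() time matches the dev_private_owner passed to hmm_range_fault(), so both sides must agree on a value reachable from the vm, i.e. the device.

```c
/* At xe_svm_devm_add() time, the pagemap records the device as owner: */
mr->pagemap.owner = tile->xe->drm.dev;

/*
 * Later (patch 5), the fault path only has the vm, so it can only
 * derive the same device pointer. If dev_private_owner did not match
 * pagemap.owner, hmm_range_fault() would migrate the device-private
 * pages back to system RAM instead of reporting them to the driver.
 */
struct hmm_range range = {
	.notifier = &notifier,                 /* assumed mmu_interval_notifier */
	.start = start,
	.end = end,
	.hmm_pfns = pfns,
	.default_flags = HMM_PFN_REQ_FAULT,
	.dev_private_owner = vm->xe->drm.dev,  /* must match pagemap.owner */
};
err = hmm_range_fault(&range);
```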

Oak

> 
> Well, this also seems like something minor and could be revisited later.
> 
> -Brian
> 
> 
> >>
> >> For UMA vram, this should be: mr->pagemap.owner = tile_to_xe(tile);
> >>
> >> This owner is important. It is used later by hmm to decide migration.
> >> We need to set the owner for hmm to identify the different vram regions.
> >>
> >> Based on the above, I think the tile parameter is better. For UMA, the
> >> caller needs to make sure to call it once; any tile pointer should work.
> >> This does sound a little weird, but I don't have a better idea.
> >>
> >> Oak
> >>>
> >>>
> >>>> + * @mr: memory region to remap
> >>>> + *
> >>>> + * This remaps device memory to the host physical address space and
> >>>> + * creates struct pages to back the device memory
> >>>> + *
> >>>> + * Return: 0 on success, standard error code otherwise
> >>>> + */
> >>>> +int xe_svm_devm_add(struct xe_tile *tile, struct xe_mem_region *mr)
> >>>> +{
> >>>> +	struct device *dev = &to_pci_dev(tile->xe->drm.dev)->dev;
> >>>> +	struct resource *res;
> >>>> +	void *addr;
> >>>> +	int ret;
> >>>> +
> >>>> +	res = devm_request_free_mem_region(dev, &iomem_resource,
> >>>> +					   mr->usable_size);
> >>>> +	if (IS_ERR(res)) {
> >>>> +		ret = PTR_ERR(res);
> >>>> +		return ret;
> >>>> +	}
> >>>> +
> >>>> +	mr->pagemap.type = MEMORY_DEVICE_PRIVATE;
> >>>> +	mr->pagemap.range.start = res->start;
> >>>> +	mr->pagemap.range.end = res->end;
> >>>> +	mr->pagemap.nr_range = 1;
> >>>> +	mr->pagemap.ops = &xe_devm_pagemap_ops;
> >>>> +	mr->pagemap.owner = tile->xe->drm.dev;
> >>>> +	addr = devm_memremap_pages(dev, &mr->pagemap);
> >>>> +	if (IS_ERR(addr)) {
> >>>> +		devm_release_mem_region(dev, res->start, resource_size(res));
> >>>> +		ret = PTR_ERR(addr);
> >>>> +		drm_err(&tile->xe->drm, "Failed to remap tile %d memory, errno %d\n",
> >>>> +				tile->id, ret);
> >>>> +		return ret;
> >>>> +	}
> >>>> +	mr->hpa_base = res->start;
> >>>> +
> >>>> +	drm_info(&tile->xe->drm, "Added tile %d memory [%llx-%llx] to devm, remapped to %pr\n",
> >>>> +			tile->id, mr->io_start, mr->io_start + mr->usable_size, res);
> >>>> +	return 0;
> >>>> +}
> >>>> +
> >>>> +/**
> >>>> + * xe_svm_devm_remove: Unmap device memory and free resources
> >>>> + * @xe: xe device
> >>>> + * @mr: memory region to remove
> >>>> + */
> >>>> +void xe_svm_devm_remove(struct xe_device *xe, struct xe_mem_region *mr)
> >>>> +{
> >>>> +	/* FIXME: the code below causes a kernel hang during module remove */
> >>>> +#if 0
> >>>> +	struct device *dev = &to_pci_dev(xe->drm.dev)->dev;
> >>>> +
> >>>> +	if (mr->hpa_base) {
> >>>> +		devm_memunmap_pages(dev, &mr->pagemap);
> >>>> +		devm_release_mem_region(dev, mr->pagemap.range.start,
> >>>> +			mr->pagemap.range.end - mr->pagemap.range.start + 1);
> >>>> +	}
> >>>> +#endif
> >>>> +}
> >>>> +

