[PATCH 2/3] drm/xe: Restore system memory GGTT mappings

Matthew Brost matthew.brost at intel.com
Thu Oct 31 00:49:57 UTC 2024


On Wed, Oct 30, 2024 at 11:53:09AM +0000, Matthew Auld wrote:
> On 29/10/2024 14:57, Matthew Brost wrote:
> > On Tue, Oct 29, 2024 at 09:44:32AM +0000, Matthew Auld wrote:
> > > On 29/10/2024 00:32, Matthew Brost wrote:
> > > > GGTT mappings reside on the device and this state is lost during suspend
> > > > / d3cold thus this state must be restored resume regardless if the BO is
> > > > in system memory or VRAM.
> > > > 
> > > > Signed-off-by: Matthew Brost <matthew.brost at intel.com>
> > > 
> > > I think this needs fixes?
> > > 
> > 
> > I don't think we have a user of SYSTEM pinned BOs upstream on DGFX
> > platforms yet. But let me double check... If we do, then yea fixes tags
> > would be good.
> > 
> > > > ---
> > > >    drivers/gpu/drm/xe/xe_bo.c       | 14 +++++++++++---
> > > >    drivers/gpu/drm/xe/xe_bo_evict.c |  1 -
> > > >    2 files changed, 11 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
> > > > index bd747e53eb9b..6c8fd5ced2a2 100644
> > > > --- a/drivers/gpu/drm/xe/xe_bo.c
> > > > +++ b/drivers/gpu/drm/xe/xe_bo.c
> > > > @@ -890,8 +890,8 @@ int xe_bo_evict_pinned(struct xe_bo *bo)
> > > >    	if (WARN_ON(!xe_bo_is_pinned(bo)))
> > > >    		return -EINVAL;
> > > > -	if (WARN_ON(!xe_bo_is_vram(bo)))
> > > > -		return -EINVAL;
> > > > +	if (!xe_bo_is_vram(bo))
> > > > +		return 0;
> > > 
> > > In the suspend flow in xe_bo_evict_all() there is:
> > > 
> > > 	if (!IS_DGFX(xe))
> > > 		return 0;
> > > 
> > > But seems like we now need this flow for igpu, so need to also drop that
> > > check?
> > > 
> > 
> > I don't think so. AFIAK GGTT mappings are not blown away on iGPU
> > platforms so this isn't needed.
> 
> It sounded like GGTT lives inside stolen memory on igpu, which is not
> treated like normal RAM pages. IIRC the way this used to work was that core
> kernel reads some special registers during early boot to get the stolen
> memory range, which is then marked as reserved in e820 memory map. It then
> also sets up intel_graphics_stolen_res, which KMD can later access to get
> the stolen range. There won't be any struct page's for this range AFAIK. And
> I think this range will also then be marked as "nosave" which will then be
> ignored by core-kernel for stuff like hibernation. Also historically trying
> to access stolen directly from the CPU like normal RAM was not really always
> reliable so you had to resort to stuff like GGTT mappable aperture, which
> core-kernel wouldn't know so I think makes sense for it to leave well alone.
> See xe_ttm_stolen_cpu_access_needs_ggtt() for example. On MTL+ igpu the
> stolen handling is quite different where it looks more like VRAM and you
> access it through LMEMBAR and there is no intel_graphics_stolen_res, so not
> completely sure what else is different there...
> 
> Anyway, I guess can park for now and circle back around later once we land
> this, once we can confirm we need something for igpu case. I was just hoping
> this would also help with:

If we need to restore the GGTT, this is a larger rework. Doable but out
of scope for this patch. Will keep an eye on this.

> https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3275
>

Hmm, yea a still unsure what is going on with that gitlab but initial
refaction is this is a different issue. I guess forcing a restore of all
GGTTs might be worth a try...

Any chance I can get a RB here? Getting this in 6.13 us probably a good
idea.

Matt

> > 
> > Matt
> > 
> > > >    	if (bo->flags & XE_BO_FLAG_PINNED_WONTNEED) {
> > > >    		ttm_bo_move_null(&bo->ttm, NULL);
> > > > @@ -946,6 +946,7 @@ int xe_bo_restore_pinned(struct xe_bo *bo)
> > > >    		.interruptible = false,
> > > >    	};
> > > >    	struct ttm_resource *new_mem;
> > > > +	struct ttm_place *place = &(bo->placements[0]);
> > > >    	int ret;
> > > >    	xe_bo_assert_held(bo);
> > > > @@ -961,6 +962,9 @@ int xe_bo_restore_pinned(struct xe_bo *bo)
> > > >    			return -EINVAL;
> > > >    	}
> > > > +	if (!mem_type_is_vram(place->mem_type))
> > > > +		return 0;
> > > > +
> > > >    	ret = ttm_bo_mem_space(&bo->ttm, &bo->placement, &new_mem, &ctx);
> > > >    	if (ret)
> > > >    		return ret;
> > > > @@ -1814,7 +1818,10 @@ int xe_bo_pin(struct xe_bo *bo)
> > > >    			place->fpfn = (xe_bo_addr(bo, 0, PAGE_SIZE) -
> > > >    				       vram_region_gpu_offset(bo->ttm.resource)) >> PAGE_SHIFT;
> > > >    			place->lpfn = place->fpfn + (bo->size >> PAGE_SHIFT);
> > > > +		}
> > > > +		if (mem_type_is_vram(place->mem_type) ||
> > > > +		    bo->flags & XE_BO_FLAG_GGTT) {
> > > >    			spin_lock(&xe->pinned.lock);
> > > >    			list_add_tail(&bo->pinned_link, &xe->pinned.kernel_bo_present);
> > > >    			spin_unlock(&xe->pinned.lock);
> > > > @@ -1875,7 +1882,8 @@ void xe_bo_unpin(struct xe_bo *bo)
> > > >    	    bo->flags & XE_BO_FLAG_INTERNAL_TEST)) {
> > > >    		struct ttm_place *place = &(bo->placements[0]);
> > > > -		if (mem_type_is_vram(place->mem_type)) {
> > > > +		if (mem_type_is_vram(place->mem_type) ||
> > > > +		    bo->flags & XE_BO_FLAG_GGTT) {
> > > >    			spin_lock(&xe->pinned.lock);
> > > >    			xe_assert(xe, !list_empty(&bo->pinned_link));
> > > >    			list_del_init(&bo->pinned_link);
> > > > diff --git a/drivers/gpu/drm/xe/xe_bo_evict.c b/drivers/gpu/drm/xe/xe_bo_evict.c
> > > > index 541b49007d73..32043e1e5a86 100644
> > > > --- a/drivers/gpu/drm/xe/xe_bo_evict.c
> > > > +++ b/drivers/gpu/drm/xe/xe_bo_evict.c
> > > > @@ -159,7 +159,6 @@ int xe_bo_restore_kernel(struct xe_device *xe)
> > > >    		 * should setup the iosys map.
> > > >    		 */
> > > >    		xe_assert(xe, !iosys_map_is_null(&bo->vmap));
> > > > -		xe_assert(xe, xe_bo_is_vram(bo));
> > > >    		xe_bo_put(bo);


More information about the Intel-xe mailing list