[PATCH v3 2/2] drm/xe: Clear GGTT in xe_bo_restore_kernel

Matthew Auld matthew.auld at intel.com
Wed Nov 13 16:44:12 UTC 2024


On 13/11/2024 14:43, Matthew Brost wrote:
> On Wed, Nov 13, 2024 at 12:55:35PM +0000, Matthew Auld wrote:
>> On 06/11/2024 18:35, Matthew Brost wrote:
>>> Part of what xe_bo_restore_kernel does, is restore BO's GGTT mappings
>>> which may have been lost during a power state change. Missing is
>>> restoring the GGTT entries without BO mappings to a known state (e.g.,
>>> scratch pages). Update xe_bo_restore_kernel to clear the entire GGTT
>>> before restoring BO's GGTT mappings.
>>>
>>> v2:
>>>    - Include missing local change of tile and id variable (CI)
>>> v3:
>>>    - Fixed kernel doc (CI)
>>>
>>> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
>>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>>> Cc: Matthew Auld <matthew.auld at intel.com>
>>> Signed-off-by: Matthew Brost <matthew.brost at intel.com>
>>> Cc: <stable at vger.kernel.org> # v6.8+
>> Reviewed-by: Matthew Auld <matthew.auld at intel.com>
>>
> 
> Thanks for the review but I'm pretty sure this is breaking display... See
> our CI runs, hard to exactly tell because of the instability there. I'm
> thinking only the parts GGTT without nodes need to be cleared, i.e., do
> what xe_ggtt_initial_clear does.

Yeah, could try xe_ggtt_initial_clear() instead.

> 
> I am a little unsure how the display code restores it GGTT mappings
> though as it appears that code bypasses BOs? Do you have any idea?

AFAICT the fb should be unpinned before suspend, that way the evict_all 
will see it and evict to system memory for the dgpu case where VRAM must 
be used. I think xe_display_pm_resume() will then validate and re-pin it 
(eventually __xe_pin_fb_vma), which should in theory re-program the 
GGTT, and that should be after we drop the GGTT nuke here.

There are so many layers but I see:

  __xe_display_pm_resume()
    intel_display_driver_resume()
     __intel_display_driver_resume()
       drm_atomic_helper_commit_duplicated_state()
         drm_atomic_commit()

Which eventually gets to intel_prepare_plane_fb()/intel_plane_pin_fb() 
and then finally __xe_pin_fb_vma().

Maybe something else? In GGTT init we only seem to setup a subset of the 
total GGTT size with some of it being reserved for other stuff it seems. 
So even xe_ggtt_initial_clear() doesn't seem to nuke everything. Not 
sure why that would matter though.

> 
> Matt



More information about the Intel-xe mailing list