[Intel-gfx] [RFC PATCH v2 0/1] Replace shmem memory region and object backend
Matthew Auld
matthew.william.auld at gmail.com
Wed May 18 15:00:32 UTC 2022
On Tue, 17 May 2022 at 21:45, Adrian Larumbe
<adrian.larumbe at collabora.com> wrote:
>
> This patch is a second attempt at eliminating the old shmem memory region
> and GEM object backend, in favour of a TTM-based one that is able to manage
> objects placed on both system and local memory.
>
> Questions addressed since previous revision:
>
> * Creating an anonymous vfs mount for shmem files in TTM
> * Fixing LLC caching properties and bit 17 swizzling before setting a TTM
> bo's pages when calling get_pages
> * Added handling of phys backend from TTM functions
> * Added pread callback to TTM gem object backend
> * In shmem_create_from_object, ensuring a shmem object we just got a filp
> for has its pages marked dirty and accessed. Otherwise, the engine won't be
> able to read the initial state and a GPU hang will ensue
>
> However, one of the issues persists:
>
> Many GPU hangs on machines with GEN <= 5. My assumption is that this has
> something to do with a caching pitfall, but everywhere across the TTM
> backend code I've tried to handle object creation and page acquisition with
> the same set of caching and coherency properties as the old shmem backend.
Some thoughts in case it's helpful:
- We still look to be trampling the cache_level etc after object
creation. AFAICT i915_ttm_adjust_gem_after_move can be called in
various places after creation.
- The i915_ttm_pwrite hook won't play nice on non-llc platforms, since
it doesn't force a clflush or keep track of the writes with
cache_dirty. The existing ->shmem_pwrite hook only works because we
are guaranteed to have not yet populated the mm.pages, and on non-llc
platforms we always force a clflush in __set_pages(). In
i915_ttm_pwrite we are now just calling pin_pages() and then writing
through the page-cache without forcing a clflush, or ensuring that we
leave cache_dirty set. Also AFAIK the whole point of shmem_pwrite was
to avoid needing to populate the entire object like when calling
pin_pages(). Would it make sense to just fall back to using
i915_gem_shmem_pwrite, which should already take care of the required
flushing?
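A minimal sketch of that fallback, in case it clarifies the suggestion
(hypothetical, untested pseudocode; the actual hook signature and the
visibility of i915_gem_shmem_pwrite from the TTM backend are assumptions):

```
/*
 * Hypothetical sketch: instead of writing through the page-cache in
 * i915_ttm_pwrite without flushing, delegate to the existing shmem
 * pwrite path, which already forces a clflush where needed and keeps
 * cache_dirty tracking consistent on non-llc platforms.
 */
static int i915_ttm_pwrite(struct drm_i915_gem_object *obj,
			   const struct drm_i915_gem_pwrite *arg)
{
	/* the shmem path also avoids populating the entire object */
	return i915_gem_shmem_pwrite(obj, arg);
}
```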
For reference a common usage pattern is something like:
bb = gem_create() <-- assume non-llc so must be CACHE_NONE
gem_write(bb, BATCH_BUFFER_END) <-- might use cached pwrite internally
execbuf(bb) <-- doesn't see BATCH_BUFFER_END if we don't clflush
>
> Adrian Larumbe (1):
> drm/i915: Replace shmem memory region and object backend with TTM
>
> drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 12 +-
> drivers/gpu/drm/i915/gem/i915_gem_mman.c | 32 +-
> drivers/gpu/drm/i915/gem/i915_gem_object.h | 4 +-
> drivers/gpu/drm/i915/gem/i915_gem_phys.c | 32 +-
> drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 390 +------------------
> drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 267 ++++++++++++-
> drivers/gpu/drm/i915/gem/i915_gem_ttm.h | 3 +
> drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c | 9 +-
> drivers/gpu/drm/i915/gt/shmem_utils.c | 64 ++-
> drivers/gpu/drm/i915/intel_memory_region.c | 7 +-
> 10 files changed, 398 insertions(+), 422 deletions(-)
>
> --
> 2.35.1
>