[PATCH v3 0/8] drm: Introduce sparse GEM shmem

Mon Apr 14 13:31:01 UTC 2025

On Mon, 14 Apr 2025 09:03:55 -0400
Alyssa Rosenzweig <alyssa at rosenzweig.io> wrote:

> > Actually, CSF stands in the way of re-allocating memory to other
> > contexts, because once we've allocated memory to a tiler heap, the FW
> > manages this pool of chunks, and recycles them. Mesa can intercept
> > the "returned chunks" and collect those chunks instead of re-assigning
> > them to the tiler heap through a CS instruction (which goes through
> > the FW internally), but that involves extra collaboration between the
> > UMD, KMD and FW which we don't have at the moment. Not saying never,
> > but I'd rather fix things gradually (first the blocking alloc in the
> > fence-signalling path, then the optimization to share the extra mem
> > reservation cost among contexts by returning the chunks to the global
> > kernel pool rather than directly to the heap).
> > 
> > This approach should work fine with JM GPUs where the tiler heap is
> > entirely managed by the KMD though.  
> 
> I really think CSF should be relying on the simple heuristics with
> incremental-rendering, unless you can prove that's actually a
> performance issue in practice. (On Imagination/Apple parts, it almost
> never is and we rely entirely on this approach. It's ok - it really is.
> For simple 2D workloads, the initial heap allocation is fine. For 3D
> scenes, we need very few frames to get the right size. This doesn't
> cause stutters in practice.)

Yep I agree, hence the "let's try the simple thing first and let's see
if we actually need the more complex stuff later". My hope is that we'll
never need it, but I hate to make definitive statements, because it
usually bites me back when I do :P.

> 
> For JM .. yes, this discussion remains relevant of course.

I'm still trying to see if we can emulate/have incremental-rendering on
JM hardware, so it really becomes a Lima-only issue. According to Erik,
the required heap size is much more predictable on Utgard
(no indirect draws, simpler binning hierarchy, and other details he
mentioned which I forgot).