[PATCH 1/9] drm/ttm: new TT backend allocation pool v2

Dave Airlie airlied at gmail.com
Tue Oct 27 23:32:25 UTC 2020


On Tue, 27 Oct 2020 at 03:41, Christian König
<ckoenig.leichtzumerken at gmail.com> wrote:
>
> This replaces the spaghetti code in the two existing page pools.
>
> First of all, depending on the allocation size, it is between 3x (1GiB)
> and 5x (1MiB) faster than the old implementation.
>
> It makes better use of buddy pages to allow for larger physically
> contiguous allocations, which should result in better TLB utilization,
> at least for amdgpu.
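>
> As a minimal sketch of the idea (a hypothetical helper, not the actual
> ttm_pool code): try the largest order first and fall back to smaller
> orders until order 0 succeeds:
>
>     static struct page *pool_alloc_huge(gfp_t gfp, unsigned int max_order,
>                                         unsigned int *order)
>     {
>         unsigned int o;
>
>         for (o = max_order; o; --o) {
>             /* Opportunistic: don't retry hard or warn on failure. */
>             struct page *p = alloc_pages(gfp | __GFP_NORETRY |
>                                          __GFP_NOWARN, o);
>             if (p) {
>                 *order = o;
>                 return p;
>             }
>         }
>
>         *order = 0;
>         return alloc_pages(gfp, 0);
>     }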
>
> Instead of the completely braindead approach of filling the pool with
> one CPU while another one is trying to shrink it, we now only put
> freed pages back into the pool.
>
> This also results in much less lock contention and a trylock-free MM
> shrinker callback, so we can guarantee that pages are given back to
> the system when needed.
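>
> A rough sketch of such a shrinker (pool_lock and pool_pages are
> hypothetical module-level state, not the actual patch): since pages are
> only ever *added* to the pool on free, the scan can always make
> progress with a plain spin_lock(), no trylock required:
>
>     static DEFINE_SPINLOCK(pool_lock);   /* protects pool_pages */
>     static LIST_HEAD(pool_pages);
>
>     static unsigned long pool_shrink_scan(struct shrinker *shrink,
>                                           struct shrink_control *sc)
>     {
>         unsigned long freed = 0;
>
>         spin_lock(&pool_lock);
>         while (freed < sc->nr_to_scan && !list_empty(&pool_pages)) {
>             struct page *p = list_first_entry(&pool_pages,
>                                               struct page, lru);
>
>             /* Drop the lock while handing the page back. */
>             list_del(&p->lru);
>             spin_unlock(&pool_lock);
>             __free_page(p);
>             freed++;
>             spin_lock(&pool_lock);
>         }
>         spin_unlock(&pool_lock);
>
>         return freed;
>     }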
>
> The downside is that with many small allocations it now takes longer
> for the pool to fill up. We could address this, but I couldn't find a
> use case where it actually matters. We also don't bother freeing large
> chunks of pages any more, since the CPU overhead in that path isn't
> really that important.
>
> The sysfs files are replaced with a single module parameter, allowing
> users to override how many pages should be globally pooled in TTM. This
> unfortunately breaks the UAPI slightly, but as far as we know nobody ever
> depended on this.
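>
> Roughly along these lines (the parameter name and description here are
> illustrative):
>
>     static unsigned long page_pool_size;   /* 0 picks a default */
>     module_param(page_pool_size, ulong, 0644);
>     MODULE_PARM_DESC(page_pool_size,
>                      "Number of pages globally pooled (0 = auto)");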
>
> Zeroing memory coming from the pool was handled inconsistently: the
> alloc_pages() based pool zeroed it, while the dma_alloc_attrs() based
> one didn't. For now the new implementation doesn't zero pages from the
> pool either and only sets the __GFP_ZERO flag when necessary.
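>
> In other words, something like this (struct pool and pool_take() are
> hypothetical stand-ins):
>
>     static struct page *pool_get_page(struct pool *pool,
>                                       unsigned int order, bool want_zeroed)
>     {
>         /* Pages recycled from the pool are returned as-is, unzeroed. */
>         struct page *p = pool_take(pool, order);   /* hypothetical */
>
>         if (p)
>             return p;
>
>         /* Only fresh pages from the system get __GFP_ZERO. */
>         return alloc_pages(want_zeroed ? GFP_KERNEL | __GFP_ZERO
>                                        : GFP_KERNEL, order);
>     }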
>
> The implementation is only 768 lines of code, compared to the over
> 2600 of the old one, and it also allows saving quite a bit of code in
> the drivers, since we no longer need specialized handling there based
> on the kernel config.
>
> On top of all that, there was a neat bug with IOMMU, coherent DMA
> mappings and huge pages which is now fixed in the new code as well.
>
> v2: make ttm_pool_apply_caching static as reported by the kernel bot, add
>     some more checks

-:86: CHECK:SPACING: spaces preferred around that '/' (ctx:VxV)
#86: FILE: drivers/gpu/drm/ttm/ttm_memory.c:457:
+    ttm_pool_mgr_init(glob->zone_kernel->max_mem/(2*PAGE_SIZE));
                                                 ^

-:86: CHECK:SPACING: spaces preferred around that '*' (ctx:VxV)
#86: FILE: drivers/gpu/drm/ttm/ttm_memory.c:457:
+    ttm_pool_mgr_init(glob->zone_kernel->max_mem/(2*PAGE_SIZE));
                                                    ^

-:619: CHECK:BRACES: Blank lines aren't necessary before a close brace '}'
#619: FILE: drivers/gpu/drm/ttm/ttm_pool.c:516:
+
+}

-:845: CHECK:UNCOMMENTED_DEFINITION: spinlock_t definition without comment
#845: FILE: include/drm/ttm/ttm_pool.h:55:
+    spinlock_t lock;
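
For reference, the fixes checkpatch wants would look something like
this (the lock comment wording is just a suggestion):

    ttm_pool_mgr_init(glob->zone_kernel->max_mem / (2 * PAGE_SIZE));

    spinlock_t lock; /* protects the pool's page lists */

plus dropping the blank line before the closing brace in ttm_pool.c.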

Would be good to get those cleaned up, otherwise

Reviewed-by: Dave Airlie <airlied at redhat.com>

for the series.

Dave.

