[PATCH 4/7] drm/ttm: move LRU walk defines into new internal header

Christian König christian.koenig at amd.com
Mon Aug 19 11:03:48 UTC 2024


On 06.08.24 at 10:29, Thomas Hellström wrote:
> Hi, Christian.
>
> On Thu, 2024-07-11 at 14:01 +0200, Christian König wrote:
>> On 10.07.24 at 20:19, Matthew Brost wrote:
>>> On Wed, Jul 10, 2024 at 02:42:58PM +0200, Christian König wrote:
>>>> That is something drivers really shouldn't mess with.
>>>>
>>> Thomas uses this in Xe to implement a shrinker [1]. Seems to need
>>> to
>>> remain available for drivers.
>> No, that is exactly what I am trying to prevent with that.
>>
>> This is an internal detail of TTM and drivers should never use it
>> directly.
> That driver-facing LRU walker is a direct response/solution to this
> comment that you made in the first shrinker series:
>
> https://lore.kernel.org/linux-mm/b7491378-defd-4f1c-31e2-29e4c77e2d67@amd.com/T/#ma918844aa8a6efe8768fdcda0c6590d5c93850c9

Ah, yeah, that was about how we should be avoiding middle-layer design.

But a function which returns the next candidate for eviction and a 
function which calls a callback for eviction is exactly the opposite.

> That is also mentioned in the cover letter of the recent shrinker
> series, and this walker has been around in that shrinker series for
> more than half a year now so if you think it's not the correct driver
> facing API IMHO that should be addressed by a review comment in that
> series rather than in posting a conflicting patch?

I actually outlined that in the review comments for the patch series. 
E.g. a walker function with a callback is basically a middle layer.

What I outlined in the link above is that a function which returns the 
next eviction candidate is a better approach than a callback.
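
To make the two API shapes concrete, here is a minimal sketch in plain
C. All names (toy_bo, toy_lru, toy_lru_walk, toy_lru_next_candidate,
toy_evict_cb) are hypothetical and are not TTM interfaces; locking and
reference counting are left out entirely. In the first shape the helper
owns the loop and calls back into the caller; in the second the helper
only hands out the next candidate and the caller decides what to do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object; not a TTM type. */
struct toy_bo {
	long size;
	int busy;	/* non-zero: cannot be evicted right now */
	int evicted;
};

/* Hypothetical LRU: an array ordered least recently used first. */
struct toy_lru {
	struct toy_bo *bos;
	size_t count;
};

/* Shape 1: walker with callback. The helper owns the loop. */
typedef long (*toy_process_bo)(struct toy_bo *bo, void *priv);

long toy_lru_walk(struct toy_lru *lru, toy_process_bo process,
		  void *priv, long target)
{
	long progress = 0;
	size_t i;

	assert(lru && process);
	for (i = 0; i < lru->count && progress < target; i++) {
		long ret = process(&lru->bos[i], priv);

		if (ret < 0)
			continue; /* this toy treats every error as "skipped" */
		progress += ret;
	}
	return progress;
}

/* Shape 2: return the next candidate. The caller owns the loop. */
struct toy_bo *toy_lru_next_candidate(struct toy_lru *lru)
{
	size_t i;

	assert(lru);
	for (i = 0; i < lru->count; i++)
		if (!lru->bos[i].busy && !lru->bos[i].evicted)
			return &lru->bos[i];
	return NULL;
}

/* Example callback for shape 1: evict the BO and report its size. */
long toy_evict_cb(struct toy_bo *bo, void *priv)
{
	(void)priv;
	if (bo->busy || bo->evicted)
		return -1;
	bo->evicted = 1;
	return bo->size;
}
```

With shape 2 the caller would mark the returned BO evicted itself and
simply ask again for the next one.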

> So assuming that we still want the driver to register the shrinker,
> IMO that helper abstracts away all the nasty locking and pitfalls for a
> driver-registered shrinker, and is similar in structure to for example
> the pagewalk helper in mm/pagewalk.c.
>
> An alternative that could be tried as a driver-facing API is to provide
> a for_each_bo_in_lru_lock() macro where the driver open-codes
> "process_bo()" inside the for loop but I tried this and found it quite
> fragile since the driver might exit the loop without performing the
> necessary cleanup.

The point is that the shrinker should *never* need to have context. E.g. 
a walker which allows going over multiple BOs for eviction is exactly 
the wrong approach for that.

The shrinker should always evict exactly one BO, and the next 
invocation of the shrinker should not depend on the result of the 
previous one.
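
As a rough illustration of such a context-free step, consider the
following sketch. The names toy_bo, toy_lru and toy_shrink_one are
hypothetical and this is not TTM or shrinker code; locking is omitted.
Each call rescans from the LRU head and evicts at most one BO, so no
cursor or other state survives between invocations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object; not a TTM type. */
struct toy_bo {
	long size;
	int busy;	/* non-zero: cannot be evicted right now */
	int evicted;
};

/* Hypothetical LRU: an array ordered least recently used first. */
struct toy_lru {
	struct toy_bo *bos;
	size_t count;
};

/*
 * Evict at most one BO and return the bytes freed, or 0 if nothing
 * could be evicted. The scan always starts at the LRU head.
 */
long toy_shrink_one(struct toy_lru *lru)
{
	size_t i;

	assert(lru);
	for (i = 0; i < lru->count; i++) {
		struct toy_bo *bo = &lru->bos[i];

		if (bo->busy || bo->evicted)
			continue;
		bo->evicted = 1;
		return bo->size;
	}
	return 0;
}
```

Because the scan restarts every time, a BO that was busy during one
call is picked up by a later call as soon as it becomes idle.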

Or am I missing something vital here?

Regards,
Christian.

>
> However, using scoped_guard() from linux/cleanup.h, that could
> probably be mitigated to some extent, but I still think that a walker
> helper like this is the safer choice, and given the complexity of a for_
> macro involving scoped_guard(), I think the walker helper is the
> easiest-to-maintain solution moving forward.
>
> But open to suggestions.
>
> Thanks
> Thomas
>
>
>> Regards,
>> Christian.
>>
>>> Matt
>>>
>>> [1]
>>> https://patchwork.freedesktop.org/patch/602165/?series=131815&rev=6
>>>
>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>> ---
>>>>    drivers/gpu/drm/ttm/ttm_bo.c      |  1 +
>>>>    drivers/gpu/drm/ttm/ttm_bo_util.c |  2 +
>>>>    drivers/gpu/drm/ttm/ttm_bo_util.h | 67
>>>> +++++++++++++++++++++++++++++++
>>>>    include/drm/ttm/ttm_bo.h          | 35 ----------------
>>>>    4 files changed, 70 insertions(+), 35 deletions(-)
>>>>    create mode 100644 drivers/gpu/drm/ttm/ttm_bo_util.h
>>>>
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> index 0131ec802066..41bee8696e69 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> @@ -45,6 +45,7 @@
>>>>    #include <linux/dma-resv.h>
>>>>    
>>>>    #include "ttm_module.h"
>>>> +#include "ttm_bo_util.h"
>>>>    
>>>>    static void ttm_bo_mem_space_debug(struct ttm_buffer_object
>>>> *bo,
>>>>    					struct ttm_placement
>>>> *placement)
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c
>>>> b/drivers/gpu/drm/ttm/ttm_bo_util.c
>>>> index 3c07f4712d5c..03e28e3d0d03 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
>>>> @@ -37,6 +37,8 @@
>>>>    
>>>>    #include <drm/drm_cache.h>
>>>>    
>>>> +#include "ttm_bo_util.h"
>>>> +
>>>>    struct ttm_transfer_obj {
>>>>    	struct ttm_buffer_object base;
>>>>    	struct ttm_buffer_object *bo;
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.h
>>>> b/drivers/gpu/drm/ttm/ttm_bo_util.h
>>>> new file mode 100644
>>>> index 000000000000..c19b50809208
>>>> --- /dev/null
>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.h
>>>> @@ -0,0 +1,67 @@
>>>> +/* SPDX-License-Identifier: GPL-2.0 OR MIT */
>>>> +/***************************************************************
>>>> ***********
>>>> + * Copyright 2024 Advanced Micro Devices, Inc.
>>>> + *
>>>> + * Permission is hereby granted, free of charge, to any person
>>>> obtaining a
>>>> + * copy of this software and associated documentation files (the
>>>> "Software"),
>>>> + * to deal in the Software without restriction, including
>>>> without limitation
>>>> + * the rights to use, copy, modify, merge, publish, distribute,
>>>> sublicense,
>>>> + * and/or sell copies of the Software, and to permit persons to
>>>> whom the
>>>> + * Software is furnished to do so, subject to the following
>>>> conditions:
>>>> + *
>>>> + * The above copyright notice and this permission notice shall
>>>> be included in
>>>> + * all copies or substantial portions of the Software.
>>>> + *
>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
>>>> KIND, EXPRESS OR
>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>>>> MERCHANTABILITY,
>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
>>>> EVENT SHALL
>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM,
>>>> DAMAGES OR
>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
>>>> OTHERWISE,
>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR
>>>> THE USE OR
>>>> + * OTHER DEALINGS IN THE SOFTWARE.
>>>> + *
>>>> +
>>>> *****************************************************************
>>>> *********/
>>>> +#ifndef _TTM_BO_UTIL_H_
>>>> +#define _TTM_BO_UTIL_H_
>>>> +
>>>> +struct ww_acquire_ctx;
>>>> +
>>>> +struct ttm_buffer_object;
>>>> +struct ttm_operation_ctx;
>>>> +struct ttm_lru_walk;
>>>> +
>>>> +/** struct ttm_lru_walk_ops - Operations for a LRU walk. */
>>>> +struct ttm_lru_walk_ops {
>>>> +	/**
>>>> +	 * process_bo - Process this bo.
>>>> +	 * @walk: struct ttm_lru_walk describing the walk.
>>>> +	 * @bo: A locked and referenced buffer object.
>>>> +	 *
>>>> +	 * Return: Negative error code on error, User-defined
>>>> positive value
>>>> +	 * (typically, but not always, size of the processed bo)
>>>> on success.
>>>> +	 * On success, the returned values are summed by the
>>>> walk and the
>>>> +	 * walk exits when its target is met.
>>>> +	 * 0 also indicates success, -EBUSY means this bo was
>>>> skipped.
>>>> +	 */
>>>> +	s64 (*process_bo)(struct ttm_lru_walk *walk,
>>>> +			  struct ttm_buffer_object *bo);
>>>> +};
>>>> +
>>>> +/**
>>>> + * struct ttm_lru_walk - Structure describing a LRU walk.
>>>> + */
>>>> +struct ttm_lru_walk {
>>>> +	/** @ops: Pointer to the ops structure. */
>>>> +	const struct ttm_lru_walk_ops *ops;
>>>> +	/** @ctx: Pointer to the struct ttm_operation_ctx. */
>>>> +	struct ttm_operation_ctx *ctx;
>>>> +	/** @ticket: The struct ww_acquire_ctx if any. */
>>>> +	struct ww_acquire_ctx *ticket;
>>>> +	/** @tryock_only: Only use trylock for locking. */
>>>> +	bool trylock_only;
>>>> +};
>>>> +
>>>> +s64 ttm_lru_walk_for_evict(struct ttm_lru_walk *walk, struct
>>>> ttm_device *bdev,
>>>> +			   struct ttm_resource_manager *man, s64
>>>> target);
>>>> +
>>>> +#endif
>>>> diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
>>>> index d1a732d56259..5f7c967222a2 100644
>>>> --- a/include/drm/ttm/ttm_bo.h
>>>> +++ b/include/drm/ttm/ttm_bo.h
>>>> @@ -194,41 +194,6 @@ struct ttm_operation_ctx {
>>>>    	uint64_t bytes_moved;
>>>>    };
>>>>    
>>>> -struct ttm_lru_walk;
>>>> -
>>>> -/** struct ttm_lru_walk_ops - Operations for a LRU walk. */
>>>> -struct ttm_lru_walk_ops {
>>>> -	/**
>>>> -	 * process_bo - Process this bo.
>>>> -	 * @walk: struct ttm_lru_walk describing the walk.
>>>> -	 * @bo: A locked and referenced buffer object.
>>>> -	 *
>>>> -	 * Return: Negative error code on error, User-defined
>>>> positive value
>>>> -	 * (typically, but not always, size of the processed bo)
>>>> on success.
>>>> -	 * On success, the returned values are summed by the
>>>> walk and the
>>>> -	 * walk exits when its target is met.
>>>> -	 * 0 also indicates success, -EBUSY means this bo was
>>>> skipped.
>>>> -	 */
>>>> -	s64 (*process_bo)(struct ttm_lru_walk *walk, struct
>>>> ttm_buffer_object *bo);
>>>> -};
>>>> -
>>>> -/**
>>>> - * struct ttm_lru_walk - Structure describing a LRU walk.
>>>> - */
>>>> -struct ttm_lru_walk {
>>>> -	/** @ops: Pointer to the ops structure. */
>>>> -	const struct ttm_lru_walk_ops *ops;
>>>> -	/** @ctx: Pointer to the struct ttm_operation_ctx. */
>>>> -	struct ttm_operation_ctx *ctx;
>>>> -	/** @ticket: The struct ww_acquire_ctx if any. */
>>>> -	struct ww_acquire_ctx *ticket;
>>>> -	/** @tryock_only: Only use trylock for locking. */
>>>> -	bool trylock_only;
>>>> -};
>>>> -
>>>> -s64 ttm_lru_walk_for_evict(struct ttm_lru_walk *walk, struct
>>>> ttm_device *bdev,
>>>> -			   struct ttm_resource_manager *man, s64
>>>> target);
>>>> -
>>>>    /**
>>>>     * ttm_bo_get - reference a struct ttm_buffer_object
>>>>     *
>>>> -- 
>>>> 2.34.1
>>>>


