[PATCH 4/7] drm/ttm: move LRU walk defines into new internal header
Christian König
christian.koenig at amd.com
Mon Aug 19 11:03:48 UTC 2024
Am 06.08.24 um 10:29 schrieb Thomas Hellström:
> Hi, Christian.
>
> On Thu, 2024-07-11 at 14:01 +0200, Christian König wrote:
>> Am 10.07.24 um 20:19 schrieb Matthew Brost:
>>> On Wed, Jul 10, 2024 at 02:42:58PM +0200, Christian König wrote:
>>>> That is something drivers really shouldn't mess with.
>>>>
>>> Thomas uses this in Xe to implement a shrinker [1]. Seems to need
>>> to
>>> remain available for drivers.
>> No, that is exactly what I try to prevent with that.
>>
>> This is an internally thing of TTM and drivers should never use it
>> directly.
> That driver-facing LRU walker is a direct response/solution to this
> comment that you made in the first shrinker series:
>
> https://lore.kernel.org/linux-mm/b7491378-defd-4f1c-31e2-29e4c77e2d67@amd.com/T/#ma918844aa8a6efe8768fdcda0c6590d5c93850c9
Ah, yeah that was about how we are should be avoiding middle layer design.
But a function which returns the next candidate for eviction and a
function which calls a callback for eviction is exactly the opposite.
> That is also mentioned in the cover letter of the recent shrinker
> series, and this walker has been around in that shrinker series for
> more than half a year now so if you think it's not the correct driver
> facing API IMHO that should be addressed by a review comment in that
> series rather than in posting a conflicting patch?
I actually outlined that in the review comments for the patch series.
E.g. a walker function with a callback is basically a middle layer.
What outlined in the link above is that a function which returns the
next eviction candidate is a better approach than a callback.
> So assuming that we still want the driver to register the shrinker,
> IMO that helper abstracts away all the nasty locking and pitfalls for a
> driver-registered shrinker, and is similar in structure to for example
> the pagewalk helper in mm/pagewalk.c.
>
> An alternative that could be tried as a driver-facing API is to provide
> a for_each_bo_in_lru_lock() macro where the driver open-codes
> "process_bo()" inside the for loop but I tried this and found it quite
> fragile since the driver might exit the loop without performing the
> necessary cleanup.
The point is that the shrinker should *never* need to have context. E.g.
a walker which allows going over multiple BOs for eviction is exactly
the wrong approach for that.
The shrinker should evict always only exactly one BO and the next
invocation of a shrinker should not depend on the result of the previous
one.
Or am I missing something vital here?
Regards,
Christian.
>
> However with using scoped_guard() in linux/cleanup.h that could
> probably be mitigated to some exent, but I still think that a walker
> helper like this is the safer choice and given the complexity of a for_
> macro involving scoped_guard(), I think the walker helper is the
> easiest-to-maintain solution moving forward.
>
> But open to suggestions.
>
> Thanks
> Thomas
>
>
>> Regards,
>> Christian.
>>
>>> Matt
>>>
>>> [1]
>>> https://patchwork.freedesktop.org/patch/602165/?series=131815&rev=6
>>>
>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>> ---
>>>> drivers/gpu/drm/ttm/ttm_bo.c | 1 +
>>>> drivers/gpu/drm/ttm/ttm_bo_util.c | 2 +
>>>> drivers/gpu/drm/ttm/ttm_bo_util.h | 67
>>>> +++++++++++++++++++++++++++++++
>>>> include/drm/ttm/ttm_bo.h | 35 ----------------
>>>> 4 files changed, 70 insertions(+), 35 deletions(-)
>>>> create mode 100644 drivers/gpu/drm/ttm/ttm_bo_util.h
>>>>
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> index 0131ec802066..41bee8696e69 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> @@ -45,6 +45,7 @@
>>>> #include <linux/dma-resv.h>
>>>>
>>>> #include "ttm_module.h"
>>>> +#include "ttm_bo_util.h"
>>>>
>>>> static void ttm_bo_mem_space_debug(struct ttm_buffer_object
>>>> *bo,
>>>> struct ttm_placement
>>>> *placement)
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c
>>>> b/drivers/gpu/drm/ttm/ttm_bo_util.c
>>>> index 3c07f4712d5c..03e28e3d0d03 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
>>>> @@ -37,6 +37,8 @@
>>>>
>>>> #include <drm/drm_cache.h>
>>>>
>>>> +#include "ttm_bo_util.h"
>>>> +
>>>> struct ttm_transfer_obj {
>>>> struct ttm_buffer_object base;
>>>> struct ttm_buffer_object *bo;
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.h
>>>> b/drivers/gpu/drm/ttm/ttm_bo_util.h
>>>> new file mode 100644
>>>> index 000000000000..c19b50809208
>>>> --- /dev/null
>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.h
>>>> @@ -0,0 +1,67 @@
>>>> +/* SPDX-License-Identifier: GPL-2.0 OR MIT */
>>>> +/***************************************************************
>>>> ***********
>>>> + * Copyright 2024 Advanced Micro Devices, Inc.
>>>> + *
>>>> + * Permission is hereby granted, free of charge, to any person
>>>> obtaining a
>>>> + * copy of this software and associated documentation files (the
>>>> "Software"),
>>>> + * to deal in the Software without restriction, including
>>>> without limitation
>>>> + * the rights to use, copy, modify, merge, publish, distribute,
>>>> sublicense,
>>>> + * and/or sell copies of the Software, and to permit persons to
>>>> whom the
>>>> + * Software is furnished to do so, subject to the following
>>>> conditions:
>>>> + *
>>>> + * The above copyright notice and this permission notice shall
>>>> be included in
>>>> + * all copies or substantial portions of the Software.
>>>> + *
>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
>>>> KIND, EXPRESS OR
>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>>>> MERCHANTABILITY,
>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
>>>> EVENT SHALL
>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM,
>>>> DAMAGES OR
>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
>>>> OTHERWISE,
>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR
>>>> THE USE OR
>>>> + * OTHER DEALINGS IN THE SOFTWARE.
>>>> + *
>>>> +
>>>> *****************************************************************
>>>> *********/
>>>> +#ifndef _TTM_BO_UTIL_H_
>>>> +#define _TTM_BO_UTIL_H_
>>>> +
>>>> +struct ww_acquire_ctx;
>>>> +
>>>> +struct ttm_buffer_object;
>>>> +struct ttm_operation_ctx;
>>>> +struct ttm_lru_walk;
>>>> +
>>>> +/** struct ttm_lru_walk_ops - Operations for a LRU walk. */
>>>> +struct ttm_lru_walk_ops {
>>>> + /**
>>>> + * process_bo - Process this bo.
>>>> + * @walk: struct ttm_lru_walk describing the walk.
>>>> + * @bo: A locked and referenced buffer object.
>>>> + *
>>>> + * Return: Negative error code on error, User-defined
>>>> positive value
>>>> + * (typically, but not always, size of the processed bo)
>>>> on success.
>>>> + * On success, the returned values are summed by the
>>>> walk and the
>>>> + * walk exits when its target is met.
>>>> + * 0 also indicates success, -EBUSY means this bo was
>>>> skipped.
>>>> + */
>>>> + s64 (*process_bo)(struct ttm_lru_walk *walk,
>>>> + struct ttm_buffer_object *bo);
>>>> +};
>>>> +
>>>> +/**
>>>> + * struct ttm_lru_walk - Structure describing a LRU walk.
>>>> + */
>>>> +struct ttm_lru_walk {
>>>> + /** @ops: Pointer to the ops structure. */
>>>> + const struct ttm_lru_walk_ops *ops;
>>>> + /** @ctx: Pointer to the struct ttm_operation_ctx. */
>>>> + struct ttm_operation_ctx *ctx;
>>>> + /** @ticket: The struct ww_acquire_ctx if any. */
>>>> + struct ww_acquire_ctx *ticket;
>>>> + /** @tryock_only: Only use trylock for locking. */
>>>> + bool trylock_only;
>>>> +};
>>>> +
>>>> +s64 ttm_lru_walk_for_evict(struct ttm_lru_walk *walk, struct
>>>> ttm_device *bdev,
>>>> + struct ttm_resource_manager *man, s64
>>>> target);
>>>> +
>>>> +#endif
>>>> diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
>>>> index d1a732d56259..5f7c967222a2 100644
>>>> --- a/include/drm/ttm/ttm_bo.h
>>>> +++ b/include/drm/ttm/ttm_bo.h
>>>> @@ -194,41 +194,6 @@ struct ttm_operation_ctx {
>>>> uint64_t bytes_moved;
>>>> };
>>>>
>>>> -struct ttm_lru_walk;
>>>> -
>>>> -/** struct ttm_lru_walk_ops - Operations for a LRU walk. */
>>>> -struct ttm_lru_walk_ops {
>>>> - /**
>>>> - * process_bo - Process this bo.
>>>> - * @walk: struct ttm_lru_walk describing the walk.
>>>> - * @bo: A locked and referenced buffer object.
>>>> - *
>>>> - * Return: Negative error code on error, User-defined
>>>> positive value
>>>> - * (typically, but not always, size of the processed bo)
>>>> on success.
>>>> - * On success, the returned values are summed by the
>>>> walk and the
>>>> - * walk exits when its target is met.
>>>> - * 0 also indicates success, -EBUSY means this bo was
>>>> skipped.
>>>> - */
>>>> - s64 (*process_bo)(struct ttm_lru_walk *walk, struct
>>>> ttm_buffer_object *bo);
>>>> -};
>>>> -
>>>> -/**
>>>> - * struct ttm_lru_walk - Structure describing a LRU walk.
>>>> - */
>>>> -struct ttm_lru_walk {
>>>> - /** @ops: Pointer to the ops structure. */
>>>> - const struct ttm_lru_walk_ops *ops;
>>>> - /** @ctx: Pointer to the struct ttm_operation_ctx. */
>>>> - struct ttm_operation_ctx *ctx;
>>>> - /** @ticket: The struct ww_acquire_ctx if any. */
>>>> - struct ww_acquire_ctx *ticket;
>>>> - /** @tryock_only: Only use trylock for locking. */
>>>> - bool trylock_only;
>>>> -};
>>>> -
>>>> -s64 ttm_lru_walk_for_evict(struct ttm_lru_walk *walk, struct
>>>> ttm_device *bdev,
>>>> - struct ttm_resource_manager *man, s64
>>>> target);
>>>> -
>>>> /**
>>>> * ttm_bo_get - reference a struct ttm_buffer_object
>>>> *
>>>> --
>>>> 2.34.1
>>>>
More information about the dri-devel
mailing list