[RFC 0/3] FW guard class

Tue Jun 18 18:08:05 UTC 2024


On 18.06.2024 03:16, Matthew Brost wrote:
> On Mon, Jun 17, 2024 at 07:54:41PM -0500, Lucas De Marchi wrote:
>> On Mon, Jun 17, 2024 at 11:30:41PM GMT, Matthew Brost wrote:
>>> On Mon, Jun 17, 2024 at 09:24:42PM +0200, Michal Wajdeczko wrote:
>>>>
>>>>
>>>> On 17.06.2024 20:00, Rodrigo Vivi wrote:
>>>>> On Mon, Jun 17, 2024 at 05:24:24PM +0000, Matthew Brost wrote:
>>>>>> On Mon, Jun 17, 2024 at 04:34:27PM +0200, Michal Wajdeczko wrote:
>>>>>>> There is support for 'classes' with constructor and destructor
>>>>>>> semantics that can be used for any scope-based resource management,
>>>>>>> like device force-wake management.
>>>>>>>
>>>>>>> Add necessary definitions explicitly, since existing macros from
>>>>>>> linux/cleanup.h can't deal with our specific requirements yet.
>>>>>>>
>>>>>>> This should allow us to use:
>>>>>>>
>>>>>>> 	scoped_guard(xe_fw, fw, XE_FW_GT)
>>>>>>> 		foo();
>>>>>>> or
>>>>>>> 	CLASS(xe_fw, var)(fw, XE_FW_GT);
>>>>>>>
>>>>>>> without any concern of leaking the force-wake references.
>>>>>>>
>>>>>>> Note: this is preliminary code as right now it's unclear how to
>>>>>>> correctly handle errors from the force-wake functions.
>>>>>>>
>>>>>>
>>>>>> I personally don't like this at all. IMO it obfuscates the code with
>>>>>> little real benefit. This is just an opinion though, others' opinions may
>>>>>> differ from mine.
>>>>
>>>> except that it is more robust than hand-crafted code that is error prone,
>>>> like this snippet from wedged_mode_set():
>>>>
>>>> 	xe_pm_runtime_get(xe);
>>>> 	for_each_gt(gt, xe, id) {
>>>> 		ret = xe_guc_ads(...);
>>>> 		if (ret) {
>>>> 			xe_gt_err(gt, "...");
>>>> 			return -EIO;
>>>> 		}
>>>> 	}
>>>> 	xe_pm_runtime_put(xe);
>>>>
>>>> and thanks to PM guard class we could avoid such mistakes for free:
>>>>
>>>> 	scoped_guard(xe_pm, xe) {
>>>> 		for_each_gt(gt, xe, id) {
>>>> 			ret = xe_guc_ads(...);
>>>> 			if (ret) {
>>>> 				xe_gt_err(gt, "...");
>>>> 				return -EIO;
>>>
>>> Just responding with a question here - haven't looked at the rest of the
>>> comments.
>>>
>>> How is this not still a bug? Looking at scoped_guard, it appears to be a
>>> magic macro around a for loop which acquires / releases a lock or, in your
>>> proposed case, a PM or FW ref. Doesn't the 'return -EIO' skip the release
>>> step? I see coding patterns like above in the kernel [1] so I do assume
>>
>> with __attribute__((cleanup)), the compiler guarantees that
>> it's executed when the variable goes out of scope. What you are probably
>> missing is the use of CLASS() declaring a variable inside the for, which
>> uses attribute cleanup:
>>
>> 	for (CLASS(_name, scope)(args),
>> 	     ...
>>
>> GCC's doc:
>>
>> https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html
>>
>> 	The cleanup attribute runs a function when the variable goes out
>> 	of scope. This attribute can only be applied to auto function
>> 	scope variables; it may not be applied to parameters or
>> 	variables with static storage duration. The function must take
>> 	one parameter, a pointer to a type compatible with the variable.
>> 	The return value of the function (if any) is ignored.
>>
>> 	When multiple variables in the same scope have cleanup
>> 	attributes, at exit from the scope their associated cleanup
>> 	functions are run in reverse order of definition (last defined,
>> 	first cleanup).
>>
>> 	If -fexceptions is enabled, then cleanup_function is run during
>> 	the stack unwinding that happens during the processing of the
>> 	exception. Note that the cleanup attribute does not allow the
>> 	exception to be caught, only to perform an action. It is
>> 	undefined what happens if cleanup_function does not return
>> 	normally.
>>
>> This was only possible with the recent change in the kernel raising
>> the minimum C std to gnu11 (uapi is still c90 for compatibility):
>>
>> 	commit e8c07082a810fbb9db303a2b66b66b8d7e588b53
>> 	Author: Arnd Bergmann <arnd at arndb.de>
>> 	Date:   Tue Mar 8 22:56:14 2022 +0100
>>
>> 	    Kbuild: move to -std=gnu11
>>
>> 	    During a patch discussion, Linus brought up the option of changing
>> 	    the C standard version from gnu89 to gnu99, which allows using variable
>> 	    declaration inside of a for() loop. While the C99, C11 and later standards
>> 	    introduce many other features, most of these are already available in
>> 	    gnu89 as GNU extensions as well.
>>
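
As a minimal userspace sketch of the mechanism described above (not Xe
code: struct fake_dev, fake_pm_get(), fake_pm_put() and FAKE_PM_GUARD()
are hypothetical names made up for illustration), the early return below
should still drop the reference, because the cleanup handler runs when
the guard variable goes out of scope:

```c
#include <errno.h>

/* Hypothetical stand-in for a device with a runtime-PM refcount. */
struct fake_dev {
	int pm_refs;
};

static void fake_pm_get(struct fake_dev *d) { d->pm_refs++; }
static void fake_pm_put(struct fake_dev *d) { d->pm_refs--; }

/* Cleanup handler: gets a pointer to the guarded variable itself. */
static void fake_pm_guard_exit(struct fake_dev **p)
{
	if (*p)
		fake_pm_put(*p);
}

/* Take the reference at declaration, drop it at scope exit. */
#define FAKE_PM_GUARD(name, dev) \
	struct fake_dev *name __attribute__((cleanup(fake_pm_guard_exit))) = \
		(fake_pm_get(dev), (dev))

/* Mirrors the loop shape discussed in this thread; fail_at < 0 means no failure. */
static int configure_all(struct fake_dev *dev, int ngt, int fail_at)
{
	FAKE_PM_GUARD(guard, dev);

	for (int id = 0; id < ngt; id++) {
		if (id == fail_at)
			return -EIO;	/* guard still releases the reference */
	}
	return 0;
}
```

Calling configure_all() with and without a failing index leaves pm_refs
balanced in both cases, which is the property the open-coded get/put
version loses on its error path.
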
> 
> Thanks for the reference, will check it out.
> 
>>> this works, just confused how it works.
>>>
>>> With that, any code which isn't easily understandable IMO is a negative
>>> ROI as it just creates confusion in the long run / makes problems harder to
>>> understand. Again this is just my opinion.
>>
>> I think that is mainly about getting used to the pattern. I think we
> 
> Agree, once the pattern is understood, it's a fairly easy change.
> 
>> just have to be careful not to overshoot on trying to use everywhere.
> 
> Agree about overshooting. I think if we agree we want to use cleanup.h
> semantics in Xe, we should start by converting kernel core components
> (spin locks, mutexes, rwsem, kfree, etc...) to use these semantics
> before adding our own ones. Also with this, I think if we use cleanup.h
> semantics, we use these semantics everywhere we can (e.g. directly
> grabbing a spin lock is akin to open coding and not allowed).
> 
> Once we convert Xe core components to use these semantics, then we start
> adding Xe-specific ones.
> 
> Thoughts?

IMO if we decide to promote the new cleanup mechanism for core primitives
like mutexes/spinlocks, then we should already have wrappers for
Xe-specific stuff like PM and FW handy, to avoid mixing patterns and to
allow using them together where applicable/desired:

	scoped_guard(xe_pm, xe)
		scoped_guard(xe_fw, fw, XE_FW_GT)
			scoped_guard(spinlock, &lock)
				foo();

And even if we merely allow this new cleanup mechanism, without enforcing
it immediately, it could still be beneficial to introduce PM/FW support,
so that we can start closing any gaps we may have in these areas and do
that in small steps whenever we make changes around problematic code.
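
To make the nesting above concrete, here is a rough userspace sketch
(again not Xe code: FAKE_GUARD(), record_release() and release_log are
hypothetical names used only for illustration) showing that nested
guards built on __attribute__((cleanup)) are released innermost first:

```c
/* Records the order in which the fake guards are released. */
static char release_log[4];
static int release_pos;

/* Cleanup handler: gets a pointer to the guarded variable (a tag char). */
static void record_release(const char *tag)
{
	if (release_pos < (int)sizeof(release_log) - 1)
		release_log[release_pos++] = *tag;
}

#define FAKE_GUARD(name, tag) \
	const char name __attribute__((cleanup(record_release))) = (tag)

static void nested_scopes(void)
{
	FAKE_GUARD(pm, 'P');			/* outermost, like the PM guard */
	{
		FAKE_GUARD(fw, 'F');		/* then the FW guard */
		{
			FAKE_GUARD(lock, 'L');	/* innermost, like the spinlock */
			/* foo() would run here with all three held */
		}
	}
}
```

After nested_scopes() returns, release_log reads "LFP": lock first, then
FW, then PM, i.e. the reverse of the acquisition order.
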

> 
>> For example, I don't know why there's already a second use in a separate
>> thread when we are still discussing it on this one.
>>
>> A very positive thing is that this is not xe's own invention and comes
> 
> I agree, it is a fairly new core thing which seems to be widely
> endorsed, which is a positive.
> 
>> from core kernel, maybe from the hottest path that is the scheduling and
>> locking. So I very much disagree with arguments raised here about
>> a) this is an alien thing and b) performance will be severely impacted
>>
>> I've used __attribute__((cleanup)) in several userspace projects in the
>> past and it does help avoiding problems on the error path that is
>> usually not very well tested (and xe's track record on error path is not
> 
> For sure our error paths are not great but it isn't like this solves all
> of those problems magically.
> 
>> very good either: those were the main issues being submitted in drm-xe-fixes
>> for the last release). So if we have a way to improve (and that I've already seen
>> being used successfully), I prefer failing on trying than on repeating
>> the same mistakes.  In kmod my only regret is that I didn't start it
>> earlier, during the bootstrap of the project.
> 
> This merged a little over a year ago, so quite new. Xe is still fairly small
> and I don't think it would be too painful to switch over to these
> semantics. If we want to do this, I'd say we do this asap.

Thanks for reconsidering your initial strong position.

> 
> Matt
> 
>>
>>
>> Lucas De Marchi
>>
>>
>>>
>>> Matt
>>>
>>> [1] https://elixir.bootlin.com/linux/latest/source/drivers/iio/imu/bmi323/bmi323_core.c#L1544
>>>
>>>> 			}
>>>> 		}
>>>> 	}
>>>>
>>>>>
>>>>> Well, on the positive side, it is not adding a driver only thing like
>>>>> i915's with_runtime_pm() macro.
>>>>>
>>>>> But I'm also not sure if I like the overall idea anyway:
>>>>>
>>>>> - I don't like adding C++isms in pure C code. Especially something not
>>>>> so standard and common that will increase the ramp-up time for newcomers.
>>>>
>>>> does it mean that the use of other guard patterns seen elsewhere in the
>>>> tree is now prohibited in the Xe driver? like:
>>>>
>>>> 	scoped_guard(mutex, &lock)
>>>> 		foo();
>>>>
>>>> 	scoped_guard(spinlock, &lock)
>>>> 		foo();
>>>> 	...
>>>>
>>>>> - It looks like an extra overhead on object creation/destruction.
>>>>
>>>> from the cleanup.h doc it sounds like there is none:
>>>>
>>>>  "And through the magic of value-propagation and dead-code-elimination,
>>>> it eliminates the actual cleanup call and compiles into:"
>>>>
>>>>
>>>>> - It looks not flexible for handling different cases... like forcewake for
>>>>> instance where we might want to ignore the ack timeout in some cases.
>>>>
>>>> there is scoped_cond_guard() that will likely be able to deal with it,
>>>> but I guess we first need to clean up the existing force_wake API, as the
>>>> expected flow is not clear and there are different approaches in the
>>>> driver to dealing with errors
>>>>
>>>>>
>>>>>>
>>>>>> Matt
>>>>>>
>>>>>>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
>>>>>>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
>>>>>>>
>>>>>>> Michal Wajdeczko (3):
>>>>>>>   drm/xe: Introduce force-wake guard class
>>>>>>>   drm/xe: Use new FW guard in xe_mocs.c
>>>>>>>   drm/xe: Use new FW guard in xe_pat.c
>>>>>>>
>>>>>>>  drivers/gpu/drm/xe/xe_force_wake.h       | 48 +++++++++++++++++++
>>>>>>>  drivers/gpu/drm/xe/xe_force_wake_types.h | 12 +++++
>>>>>>>  drivers/gpu/drm/xe/xe_mocs.c             | 12 +----
>>>>>>>  drivers/gpu/drm/xe/xe_pat.c              | 60 ++++++++----------------
>>>>>>>  4 files changed, 82 insertions(+), 50 deletions(-)
>>>>>>>
>>>>>>> --
>>>>>>> 2.43.0
>>>>>>>