[RFC 0/3] FW guard class

Tue Jun 18 18:44:36 UTC 2024

On Tue, Jun 18, 2024 at 08:08:05PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 18.06.2024 03:16, Matthew Brost wrote:
> > On Mon, Jun 17, 2024 at 07:54:41PM -0500, Lucas De Marchi wrote:
> >> On Mon, Jun 17, 2024 at 11:30:41PM GMT, Matthew Brost wrote:
> >>> On Mon, Jun 17, 2024 at 09:24:42PM +0200, Michal Wajdeczko wrote:
> >>>>
> >>>>
> >>>> On 17.06.2024 20:00, Rodrigo Vivi wrote:
> >>>>> On Mon, Jun 17, 2024 at 05:24:24PM +0000, Matthew Brost wrote:
> >>>>>> On Mon, Jun 17, 2024 at 04:34:27PM +0200, Michal Wajdeczko wrote:
> >>>>>>> There is support for 'classes' with constructor and destructor
> >>>>>>> semantics that can be used for any scope-based resource management,
> >>>>>>> like device force-wake management.
> >>>>>>>
> >>>>>>> Add necessary definitions explicitly, since existing macros from
> >>>>>>> linux/cleanup.h can't deal with our specific requirements yet.
> >>>>>>>
> >>>>>>> This should allow us to use:
> >>>>>>>
> >>>>>>> 	scoped_guard(xe_fw, fw, XE_FW_GT)
> >>>>>>> 		foo();
> >>>>>>> or
> >>>>>>> 	CLASS(xe_fw, var)(fw, XE_FW_GT);
> >>>>>>>
> >>>>>>> without any concern of leaking the force-wake references.
> >>>>>>>
> >>>>>>> Note: this is preliminary code as right now it's unclear how to
> >>>>>>> correctly handle errors from the force-wake functions.
> >>>>>>>
> >>>>>>
> >>>>>> I'm personally don't like this at all. IMO it obfuscate the code with
> >>>>>> little real benefit. This is just an opinion though, others opinions may
> >>>>>> differ from mine.
> >>>>
> >>>> except that is more robust than hand-crafted code that is error prone,
> >>>> like this snippet from wedged_mode_set():
> >>>>
> >>>> 	xe_pm_runtime_get(xe);
> >>>> 	for_each_gt(gt, xe, id) {
> >>>> 		ret = xe_guc_ads(...);
> >>>> 		if (ret) {
> >>>> 			xe_gt_err(gt, "...");
> >>>> 			return -EIO;
> >>>> 		}
> >>>> 	}
> >>>> 	xe_pm_runtime_put(xe);
> >>>>
> >>>> and thanks to PM guard class we could avoid such mistakes for free:
> >>>>
> >>>> 	scoped_guard(xe_pm, xe) {
> >>>> 		for_each_gt(gt, xe, id) {
> >>>> 			ret = xe_guc_ads(...);
> >>>> 			if (ret) {
> >>>> 				xe_gt_err(gt, "...");
> >>>> 				return -EIO;
> >>>
> >>> Just responding with a question here - haven't looked at the rest of the
> >>> comments.
> >>>
> >>> How is this not still a bug? Looking at scoped_guard, it appears to be a
> >>> magic macro for loop which acquires / releases a lock or in your
> >>> purposed case a PM or FW ref. Doesn't the 'return -EIO' skip the release
> >>> step? I see coding patterns like above in the kernel [1] so I do assume
> >>
> >> with __attribute__((cleanup)), the compiler guarantees that
> >> it's executed when the variable goes out of scope. What you are probably
> >> missing is the use of CLASS() declaring a variable inside the for, which
> >> uses attribute cleanup:
> >>
> >> 	for (CLASS(_name, scope)(args),
> >> 	     ...
> >>
> >> GCC's doc:
> >>
> >> https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html
> >>
> >> 	The cleanup attribute runs a function when the variable goes out
> >> 	of scope. This attribute can only be applied to auto function
> >> 	scope variables; it may not be applied to parameters or
> >> 	variables with static storage duration. The function must take
> >> 	one parameter, a pointer to a type compatible with the variable.
> >> 	The return value of the function (if any) is ignored.
> >>
> >> 	When multiple variables in the same scope have cleanup
> >> 	attributes, at exit from the scope their associated cleanup
> >> 	functions are run in reverse order of definition (last defined,
> >> 	first cleanup).
> >>
> >> 	If -fexceptions is enabled, then cleanup_function is run during
> >> 	the stack unwinding that happens during the processing of the
> >> 	exception. Note that the cleanup attribute does not allow the
> >> 	exception to be caught, only to perform an action. It is
> >> 	undefined what happens if cleanup_function does not return
> >> 	normally.
> >>
> >> This was only possible with the recent change in the kernel raising
> >> the minimum C std to gnu11 (uapi is still c90 for compatibility):
> >>
> >> 	commit e8c07082a810fbb9db303a2b66b66b8d7e588b53
> >> 	Author: Arnd Bergmann <arnd at arndb.de>
> >> 	Date:   Tue Mar 8 22:56:14 2022 +0100
> >>
> >> 	    Kbuild: move to -std=gnu11
> >>
> >> 	    During a patch discussion, Linus brought up the option of changing
> >> 	    the C standard version from gnu89 to gnu99, which allows using variable
> >> 	    declaration inside of a for() loop. While the C99, C11 and later standards
> >> 	    introduce many other features, most of these are already available in
> >> 	    gnu89 as GNU extensions as well.
> >>
> > 
> > Thanks for the reference, will checkout.
> > 
> >>> this works, just confused how it works.
> >>>
> >>> With that, any code which isn't easily understandable IMO is a negative
> >>> ROI as it just creates confusion in the long / makes problems harder to
> >>> understand. Again this is just my opinion.
> >>
> >> I think that is mainly about getting used to the pattern. I think we
> > 
> > Agree once pattern is understood, fairly easy change.
> > 
> >> just have to be careful not to overshoot on trying to use everywhere.
> > 
> > Agree about overshooting. I think we agree we want to use cleanup.h
> > semantics in Xe we should start by converting kernel core components
> > (spin locks, mutexes, rwsem, kfree, etc...) to use these semantics
> > before adding our own ones. Also with this, I think if we use cleanup.h
> > semantics, we use these semantics everywhere we can (e.g. directly
> > grabbing a spin lock is akin to open coding and not allowed).
> > 
> > Once we convert Xe core componets to use these semantics then we start
> > adding Xe specific ones.
> > 
> > Thoughts?
> 
> IMO if we decide to promote new cleanup mechanism for core primitives
> like mutex/spinlocks then we should already have wrappers for Xe
> specific stuff like PM and FW handy to avoid mixing patterns and allow
> use them together if applicable/desired:
> 
> 	scoped_guard(xe_pm, xe)
> 		scoped_guard(xe_fw, fw, XE_FW_GT)
> 			scoped_guard(spinlock, &lock)
> 				foo();
> 
> and even if we just allow, without immediate enforcing this new cleanup
> mechanism, then it still could be beneficial to introduce PM/FW support
> to start closing any gaps we may have in any of these areas and to do
> that in small steps whenever we make changes around problematic code

I believe that mixing styles (like goto/unwind vs. RAII) within a single
code base will only lead to overall confusion. Personally, I strongly
oppose adopting RAII semantics halfway in Xe. In my opinion, we should
either fully commit to this approach or not at all. There will likely
always be exceptions where RAII doesn't quite fit, although it's unclear
how often that will occur. Therefore, a quick RFC to convert some core
components to RAII could be helpful in visualizing the impact on the
code base.`

Additionally, there's the question of how far we should extend this
approach. Should we create RAII helpers for get/put operations of local
objects in functions, etc.? Is this even an acceptable usage?

Another consideration is whether the larger DRM community is open to
this change (i.e., can we begin using RAII semantics in DRM code)? A
cursory search on Linux cross-referencer doesn't show any RAII usage in
DRM code.

> 
> > 
> >> For example, I don't know why there's already a second use in a separate
> >> thread when we are still discussing it on this one.
> >>
> >> A very positive thing is that this is not xe's own invention and comes
> > 
> > I agree, it is core fairly new core thing which seems to widely
> > endorsed which is a positive.
> > 
> >> from core kernel, maybe from the hottest path that is the scheduling and
> >> locking. So I very much disagree with arguments raised here about
> >> a) this is an alien thing and b) performance will be severely impacted
> >>
> >> I've used __attribute__((cleanup)) in several userspace projects in the
> >> past and it does help avoiding problems on the error path that is
> >> usually not very well tested (and xe's track record on error path is not
> > 
> > For sure our error paths are not great but it isn't like this solves our
> > of those problems magically.
> > 
> >> very good either: those were the main issues being submitted in drm-xe-fixes
> >> for the last release). So if we have a way to improve (and that I've already seen
> >> being used successfully), I prefer failing on trying than on repeating
> >> the same mistakes.  In kmod my only regret is that I didn't start it
> >> earlier, during the bootstrap of the project.
> > 
> > This merged a little over a year, so quite new. Xe is still fairly small
> > and I don't think it would be to painful to switch over to these
> > semantics. If we want do this, I'd say we do this asap. 
> 
> thanks for reconsidering your initial strong position
> 

After reading up on this a bit more, I'm coming around to it. However, I
can't say I'm completely sold on it. Additionally, since I'm not a
maintainer of Xe, ultimately it's not up to me either.

Matt

> > 
> > Matt
> > 
> >>
> >>
> >> Lucas De Marchi
> >>
> >>
> >>>
> >>> Matt
> >>>
> >>> [1] https://elixir.bootlin.com/linux/latest/source/drivers/iio/imu/bmi323/bmi323_core.c#L1544
> >>>
> >>>> 			}
> >>>> 		}
> >>>> 	}
> >>>>
> >>>>>
> >>>>> Well, on the positive side, it is not adding a driver only thing like
> >>>>> i915's with_runtime_pm() macro.
> >>>>>
> >>>>> But I'm also not sure if I like the overall idea anyway:
> >>>>>
> >>>>> - I don't like adding C++isms in a pure C code. Specially something not
> >>>>> so standard and common that will decrease the ramp-up time for newcomers.
> >>>>
> >>>> does it mean that the use of other guard patterns seen elsewhere in the
> >>>> tree is now prohibited on the Xe driver ? like:
> >>>>
> >>>> 	scoped_guard(mutex, &lock)
> >>>> 		foo();
> >>>>
> >>>> 	scoped_guard(spinlock, &lock)
> >>>> 		foo();
> >>>> 	...
> >>>>
> >>>>> - It looks like and extra overhead on the object creation destruction.
> >>>>
> >>>> from cleanup.h doc is sounds there is none:
> >>>>
> >>>>  "And through the magic of value-propagation and dead-code-elimination,
> >>>> it eliminates the actual cleanup call and compiles into:"
> >>>>
> >>>>
> >>>>> - It looks not flexible for handling different cases... like forcewake for
> >>>>> instance where we might want to ignore the ack timeout in some cases.
> >>>>
> >>>> there is scoped_cond_guard() that likely will be able to deal with it,
> >>>> but I guess we first need to cleanup existing force_wake api as expected
> >>>> flow is not clear and there are different approaches in the driver how
> >>>> to deal with errors
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Matt
> >>>>>>
> >>>>>>> Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> >>>>>>> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> >>>>>>>
> >>>>>>> Michal Wajdeczko (3):
> >>>>>>>   drm/xe: Introduce force-wake guard class
> >>>>>>>   drm/xe: Use new FW guard in xe_mocs.c
> >>>>>>>   drm/xe: Use new FW guard in xe_pat.c
> >>>>>>>
> >>>>>>>  drivers/gpu/drm/xe/xe_force_wake.h       | 48 +++++++++++++++++++
> >>>>>>>  drivers/gpu/drm/xe/xe_force_wake_types.h | 12 +++++
> >>>>>>>  drivers/gpu/drm/xe/xe_mocs.c             | 12 +----
> >>>>>>>  drivers/gpu/drm/xe/xe_pat.c              | 60 ++++++++----------------
> >>>>>>>  4 files changed, 82 insertions(+), 50 deletions(-)
> >>>>>>>
> >>>>>>> --
> >>>>>>> 2.43.0
> >>>>>>>