[Intel-gfx] [RFC 00/12] locking: Separate lock tracepoints from lockdep/lock_stat (v1)

Thu Feb 10 09:13:53 UTC 2022

On Wed, Feb 09, 2022 at 04:32:58PM -0800, Namhyung Kim wrote:
> On Wed, Feb 9, 2022 at 1:09 AM Peter Zijlstra <peterz at infradead.org> wrote:
> >
> > On Tue, Feb 08, 2022 at 10:41:56AM -0800, Namhyung Kim wrote:
> >
> > > Eventually I'm mostly interested in the contended locks only and I
> > > want to reduce the overhead in the fast path.  By moving that, it'd be
> > > easy to track contended locks with timing by using two tracepoints.
> >
> > So why not put in two new tracepoints and call it a day?
> >
> > Why muck about with all that lockdep stuff just to preserve the name
> > (and in the process continue to blow up data structures etc..). This
> > leaves distros in a bind, will they enable this config and provide
> > tracepoints while bloating the data structures and destroying things
> > like lockref (which relies on sizeof(spinlock_t)), or not provide this
> > at all.
> 
> If it's only lockref, is it possible to change it to use arch_spinlock_t
> so that it can remain in 4 bytes?  It'd be really nice if we can keep
> spin lock size, but it'd be easier to carry the name with it for
> analysis IMHO.

It's just vile and disgusting to blow up the lock size for convenience
like this.

And no, there's more of that around. A lot of effort has been spend to
make sure spinlocks are 32bit and we're not going to give that up for
something as daft as this.

Just think harder on the analysis side. Like said; I'm thinking the
caller IP should be good enough most of the time.