[PATCH v2 4/8] x86, lib: Add WBNOINVD helper functions

Sean Christopherson seanjc at google.com
Mon May 19 16:30:34 UTC 2025


On Sat, May 17, 2025, Ingo Molnar wrote:
> 
> * Sean Christopherson <seanjc at google.com> wrote:
> 
> > From: Kevin Loughlin <kevinloughlin at google.com>
> > 
> > In line with WBINVD usage, add WBNOINVD helper functions.  Fall back to
> > WBINVD (via alternative()) if WBNOINVD isn't supported, as WBINVD provides
> > a superset of functionality, just more slowly.
> > 
> > Note, alternative() ensures compatibility with early boot code as needed.
> > 
> > Signed-off-by: Kevin Loughlin <kevinloughlin at google.com>
> > Reviewed-by: Tom Lendacky <thomas.lendacky at amd.com>
> > [sean: massage changelog and comments, use ASM_WBNOINVD and _ASM_BYTES]
> > Reviewed-by: Kai Huang <kai.huang at intel.com>
> > Signed-off-by: Sean Christopherson <seanjc at google.com>
> > ---
> >  arch/x86/include/asm/smp.h           |  6 ++++++
> >  arch/x86/include/asm/special_insns.h | 19 ++++++++++++++++++-
> >  arch/x86/lib/cache-smp.c             | 11 +++++++++++
> >  3 files changed, 35 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
> > index 028f126018c9..e08f1ae25401 100644
> > --- a/arch/x86/include/asm/smp.h
> > +++ b/arch/x86/include/asm/smp.h
> > @@ -113,6 +113,7 @@ void native_play_dead(void);
> >  void play_dead_common(void);
> >  void wbinvd_on_cpu(int cpu);
> >  void wbinvd_on_all_cpus(void);
> > +void wbnoinvd_on_all_cpus(void);
> >  
> >  void smp_kick_mwait_play_dead(void);
> >  void __noreturn mwait_play_dead(unsigned int eax_hint);
> > @@ -153,6 +154,11 @@ static inline void wbinvd_on_all_cpus(void)
> >  	wbinvd();
> >  }
> >  
> > +static inline void wbnoinvd_on_all_cpus(void)
> > +{
> > +	wbnoinvd();
> > +}
> > +
> >  static inline struct cpumask *cpu_llc_shared_mask(int cpu)
> >  {
> >  	return (struct cpumask *)cpumask_of(0);
> > diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
> > index 6266d6b9e0b8..46b3961e3e4b 100644
> > --- a/arch/x86/include/asm/special_insns.h
> > +++ b/arch/x86/include/asm/special_insns.h
> > @@ -117,7 +117,24 @@ static inline void wrpkru(u32 pkru)
> >  
> >  static __always_inline void wbinvd(void)
> >  {
> > -	asm volatile("wbinvd": : :"memory");
> > +	asm volatile("wbinvd" : : : "memory");
> > +}
> > +
> > +/* Instruction encoding provided for binutils backwards compatibility. */
> > +#define ASM_WBNOINVD _ASM_BYTES(0xf3,0x0f,0x09)
> > +
> > +/*
> > + * Cheaper version of wbinvd(). Call when caches need to be written back but
> > + * not invalidated.
> > + */
> > +static __always_inline void wbnoinvd(void)
> > +{
> > +	/*
> > +	 * If WBNOINVD is unavailable, fall back to the compatible but
> > +	 * more destructive WBINVD (which still writes the caches back
> > +	 * but also invalidates them).

While poking around the SDM and APM to figure out a decent comment, I realized
this comment is somewhat misleading since WBNOINVD is itself backwards compatible:
it's simply WBINVD with a REP prefix, which CPUs without WBNOINVD ignore.  I still
think it's a good idea to use alternative(), e.g. so that explicitly disabling
WBNOINVD in the event of a hardware issue works as expected.

> > +	 */
> > +	alternative("wbinvd", ASM_WBNOINVD, X86_FEATURE_WBNOINVD);
> >  }
> 
> Would be nice here to use the opportunity and document both WBINVD and 
> WBNOINVD a bit more comprehensively, to point out that WBINVD writes 
> back and flushes the caches (and point out which level of caches this 
> affects typically),

Due to memory encryption technologies, I don't think there is a "typical" behavior
these days.  E.g. I'm pretty sure CPUs that support MKTME or SEV+ invalidate all
caches on a package, but I wouldn't classify that as the typical behavior since
there are likely still a huge number of CPUs in the wild that don't poke into the
lower level caches of other CPUs.

> and to point out that the 'invalidate' part of the WBNOINVD name is a
> misnomer, as it doesn't invalidate anything, it only writes back dirty
> cachelines.

I wouldn't call it a misnomer, the NO part makes it semantically accurate.  I
actually think the mnemonic was well chosen, as it helps capture the relationships
and behaviors of INVD, WBINVD, and WBNOINVD.
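
To make the relationship between the three instructions concrete, here is a
minimal userspace sketch.  It is a toy model only: the three line states, the
struct toy_line type, and the toy_*() helpers are hypothetical names made up
for illustration, and real hardware tracks far more state than this.

```c
#include <assert.h>
#include <stddef.h>

/* Toy cache-line states; real coherency protocols (e.g. MESI) are richer. */
enum line_state { LINE_INVALID, LINE_CLEAN, LINE_MODIFIED };

/* Hypothetical model of one cache line and the memory that backs it. */
struct toy_line {
	enum line_state state;
	int cached;	/* value held in the modeled cache */
	int memory;	/* value held in the modeled main memory */
};

/* INVD: invalidate without writing back, i.e. modified data is dropped. */
void toy_invd(struct toy_line *lines, size_t n)
{
	assert(lines || n == 0);
	for (size_t i = 0; i < n; i++)
		lines[i].state = LINE_INVALID;
}

/* WBNOINVD: write back modified lines, but leave them cached and clean. */
void toy_wbnoinvd(struct toy_line *lines, size_t n)
{
	assert(lines || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (lines[i].state == LINE_MODIFIED) {
			lines[i].memory = lines[i].cached;
			lines[i].state = LINE_CLEAN;
		}
	}
}

/* WBINVD: write back modified lines, then invalidate everything. */
void toy_wbinvd(struct toy_line *lines, size_t n)
{
	toy_wbnoinvd(lines, n);
	toy_invd(lines, n);
}
```

In this model WBINVD is literally WBNOINVD followed by INVD, which is the
relationship the mnemonics are meant to capture.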

How about this?

diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 6266d6b9e0b8..f2240c4ac0ea 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -115,9 +115,39 @@ static inline void wrpkru(u32 pkru)
 }
 #endif
 
+/*
+ * Write back all modified lines in all levels of cache associated with this
+ * logical processor to main memory, and then invalidate all caches.  Depending
+ * on the micro-architecture, WBINVD (and WBNOINVD below) may or may not affect
+ * lower level caches associated with another logical processor that shares any
+ * level of this processor’s cache hierarchy.
+ *
+ * Note, AMD CPUs enumerate the behavior of WB{NO}INVD with respect to other
+ * logical, non-originating processors in CPUID 0x8000001D.EAX[N:0].
+ */
 static __always_inline void wbinvd(void)
 {
-       asm volatile("wbinvd": : :"memory");
+       asm volatile("wbinvd" : : : "memory");
+}
+
+/* Instruction encoding provided for binutils backwards compatibility. */
+#define ASM_WBNOINVD _ASM_BYTES(0xf3,0x0f,0x09)
+
+/*
+ * Write back all modified lines in all levels of cache associated with this
+ * logical processor to main memory, but do NOT explicitly invalidate caches,
+ * i.e. leave all/most cache lines in the hierarchy in non-modified state.
+ */
+static __always_inline void wbnoinvd(void)
+{
+       /*
+        * Explicitly encode WBINVD if X86_FEATURE_WBNOINVD is unavailable even
+        * though WBNOINVD is backwards compatible (it's simply WBINVD with an
+        * ignored REP prefix), to guarantee that WBNOINVD isn't used if it
+        * needs to be avoided for any reason.  For all supported usage in the
+        * kernel, WBINVD is functionally a superset of WBNOINVD.
+        */
+       alternative("wbinvd", ASM_WBNOINVD, X86_FEATURE_WBNOINVD);
 }
 
 static inline unsigned long __read_cr4(void)
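
For illustration only, the encoding relationship described in the wbnoinvd()
comment can be sketched in plain userspace C.  This does not execute either
instruction (both are privileged); it only models which bytes end up in the
instruction stream.  The opcode bytes come from the ASM_WBNOINVD definition in
the patch; pick_wb_encoding() and is_rep_prefixed_wbinvd() are hypothetical
helpers, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Opcode bytes: WBINVD is 0F 09; WBNOINVD adds an F3 (REP) prefix. */
static const uint8_t wbinvd_bytes[]   = { 0x0f, 0x09 };
static const uint8_t wbnoinvd_bytes[] = { 0xf3, 0x0f, 0x09 };

/*
 * Hypothetical stand-in for alternative("wbinvd", ASM_WBNOINVD,
 * X86_FEATURE_WBNOINVD): copy the selected encoding into buf and return its
 * length.  The kernel's alternative() instead patches the instruction in
 * place; this only models the selection.
 */
size_t pick_wb_encoding(bool has_wbnoinvd, uint8_t *buf, size_t len)
{
	const uint8_t *src = has_wbnoinvd ? wbnoinvd_bytes : wbinvd_bytes;
	size_t n = has_wbnoinvd ? sizeof(wbnoinvd_bytes) : sizeof(wbinvd_bytes);

	assert(buf && len >= n);
	memcpy(buf, src, n);
	return n;
}

/* True if insn is exactly WBINVD preceded by one F3 prefix, i.e. WBNOINVD. */
bool is_rep_prefixed_wbinvd(const uint8_t *insn, size_t n)
{
	return n == sizeof(wbnoinvd_bytes) && insn[0] == 0xf3 &&
	       memcmp(insn + 1, wbinvd_bytes, sizeof(wbinvd_bytes)) == 0;
}
```

With the feature flag clear, the sketch emits the bare two-byte WBINVD, which
is the point of using alternative(): the prefixed form never reaches the
instruction stream when WBNOINVD is disabled.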

