[PATCH i-g-t 2/3] lib/xe_spin: Fix xe_spin_wait_started()

Thu Aug 1 05:42:54 UTC 2024

On Wed, Jul 31, 2024 at 08:06:52AM +0200, Zbigniew Kempczyński wrote:
> On Tue, Jul 30, 2024 at 08:35:03PM -0700, Lucas De Marchi wrote:
> > Use READ_ONCE() to guarantee we are actually reading the memory written
> > by the GPU and compiler can't optimize it away.
> > 
> > Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
> > ---
> >  lib/xe/xe_spin.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/lib/xe/xe_spin.c b/lib/xe/xe_spin.c
> > index 3adacc3a8..43a7a43c2 100644
> > --- a/lib/xe/xe_spin.c
> > +++ b/lib/xe/xe_spin.c
> > @@ -166,7 +166,7 @@ void xe_spin_init(struct xe_spin *spin, struct xe_spin_opts *opts)
> >   */
> >  bool xe_spin_started(struct xe_spin *spin)
> >  {
> > -	return spin->start != 0;
> > +	return READ_ONCE(spin->start) != 0;
> >  }
> >  
> >  /**
> > -- 
> > 2.43.0
> > 
> 
> Hmm, function is exported and would expect compiler will always
> read the memory before return. This is part of .so and I don't
> think this will be inlined to the caller. Where do you observe
> gpu memory write is not reflected to the cpu?
> 
> --
> Zbigniew
> 

I wondered about local inlining the function or optimizing which
will prevent from looping the read (endless loop after single read)
and I got this when code is optimized but not position independent.
I mean -O2 produces:

xe_spin_wait_started:
	cmp     DWORD PTR [rdi], 1
	jne     .L6
        ret
.L6:
        jmp     .L6

Only -fpic prevents from this optimization.

As Jonathan gave you r-b I'm not going to block this series,
it is just not necessary imo in this particular case.

--
Zbigniew