[PATCH i-g-t 2/3] lib/xe_spin: Fix xe_spin_wait_started()
Lucas De Marchi
lucas.demarchi at intel.com
Thu Aug 8 06:00:13 UTC 2024
On Thu, Aug 01, 2024 at 07:42:54AM GMT, Zbigniew Kempczyński wrote:
>On Wed, Jul 31, 2024 at 08:06:52AM +0200, Zbigniew Kempczyński wrote:
>> On Tue, Jul 30, 2024 at 08:35:03PM -0700, Lucas De Marchi wrote:
>> > Use READ_ONCE() to guarantee we are actually reading the memory written
>> > by the GPU and compiler can't optimize it away.
>> >
>> > Signed-off-by: Lucas De Marchi <lucas.demarchi at intel.com>
>> > ---
>> > lib/xe/xe_spin.c | 2 +-
>> > 1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/lib/xe/xe_spin.c b/lib/xe/xe_spin.c
>> > index 3adacc3a8..43a7a43c2 100644
>> > --- a/lib/xe/xe_spin.c
>> > +++ b/lib/xe/xe_spin.c
>> > @@ -166,7 +166,7 @@ void xe_spin_init(struct xe_spin *spin, struct xe_spin_opts *opts)
>> > */
>> > bool xe_spin_started(struct xe_spin *spin)
>> > {
>> > - return spin->start != 0;
>> > + return READ_ONCE(spin->start) != 0;
>> > }
>> >
>> > /**
>> > --
>> > 2.43.0
>> >
>>
>> Hmm, function is exported and would expect compiler will always
>> read the memory before return. This is part of .so and I don't
right... I was thinking the whole thing would be statically linked to
the tests. Now I see we actually have a libigt.so
>> think this will be inlined to the caller. Where do you observe
>> gpu memory write is not reflected to the cpu?
I didn't. This was just my braindump while trying to understand the
xe_spin and going out on vacations. Now I'm back...
>>
>> --
>> Zbigniew
>>
>
>I wondered about local inlining the function or optimizing which
>will prevent from looping the read (endless loop after single read)
>and I got this when code is optimized but not position independent.
>I mean -O2 produces:
>
>xe_spin_wait_started:
> cmp DWORD PTR [rdi], 1
> jne .L6
> ret
>.L6:
> jmp .L6
I was actually thinking about xe_spin.o and calls to
xe_spin_wait_started() in the same compilation unit. Could that
function could be inlined and folded in others like
xe_cork_wait_started(). Disassembling the build I get:
void xe_cork_wait_started(struct xe_cork *cork)
{
d1bc0: f3 0f 1e fa endbr64
xe_spin_wait_started(cork->spin);
d1bc4: 48 8b 3f mov (%rdi),%rdi
d1bc7: e9 b4 15 f6 ff jmp 33180 <xe_spin_wait_started at plt>
d1bcc: 0f 1f 40 00 nopl 0x0(%rax)
As you said, apparently no, as the function is exported. And xe_spin_wait_started
does do the right loop.
>
>Only -fpic prevents from this optimization.
>
>As Jonathan gave you r-b I'm not going to block this series,
>it is just not necessary imo in this particular case.
Yeah, I don't think this is needed at all. I will drop these 2
patches and keep only the first one.
Thanks
Lucas De Marchi
>
>--
>Zbigniew
More information about the igt-dev
mailing list