[Intel-gfx] [PATCH 1/5] drm/i915: Mark up unlocked update of i915_request.hwsp_seqno
Chris Wilson
chris at chris-wilson.co.uk
Mon Mar 9 16:04:27 UTC 2020
Quoting Mika Kuoppala (2020-03-09 15:21:31)
> Chris Wilson <chris at chris-wilson.co.uk> writes:
>
> > Quoting Mika Kuoppala (2020-03-09 14:03:01)
> >> Chris Wilson <chris at chris-wilson.co.uk> writes:
> >>
> >> > During i915_request_retire() we decouple the i915_request.hwsp_seqno
> >> > from the intel_timeline so that it may be freed before the request is
> >> > released. However, we need to warn the compiler that the pointer may
> >> > update under its nose.
> >> >
> >> > [ 171.438899] BUG: KCSAN: data-race in i915_request_await_dma_fence [i915] / i915_request_retire [i915]
> >> > [ 171.438920]
> >> > [ 171.438932] write to 0xffff8881e7e28ce0 of 8 bytes by task 148 on cpu 2:
> >> > [ 171.439174] i915_request_retire+0x1ea/0x660 [i915]
> >> > [ 171.439408] retire_requests+0x7a/0xd0 [i915]
> >> > [ 171.439640] engine_retire+0xa1/0xe0 [i915]
> >> > [ 171.439657] process_one_work+0x3b1/0x690
> >> > [ 171.439671] worker_thread+0x80/0x670
> >> > [ 171.439685] kthread+0x19a/0x1e0
> >> > [ 171.439701] ret_from_fork+0x1f/0x30
> >> > [ 171.439721]
> >> > [ 171.439739] read to 0xffff8881e7e28ce0 of 8 bytes by task 696 on cpu 1:
> >> > [ 171.439990] i915_request_await_dma_fence+0x162/0x520 [i915]
> >> > [ 171.440230] i915_request_await_object+0x2fe/0x470 [i915]
> >> > [ 171.440467] i915_gem_do_execbuffer+0x45dc/0x4c20 [i915]
> >> > [ 171.440704] i915_gem_execbuffer2_ioctl+0x2c3/0x580 [i915]
> >> > [ 171.440722] drm_ioctl_kernel+0xe4/0x120
> >> > [ 171.440736] drm_ioctl+0x297/0x4c7
> >> > [ 171.440750] ksys_ioctl+0x89/0xb0
> >> > [ 171.440766] __x64_sys_ioctl+0x42/0x60
> >> > [ 171.440788] do_syscall_64+0x6e/0x2c0
> >> > [ 171.440802] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >> >
> >> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >> > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> >> > ---
> >> > drivers/gpu/drm/i915/i915_request.h | 7 +++++--
> >> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> >> > index d4bae16b4785..6020d5b2a3df 100644
> >> > --- a/drivers/gpu/drm/i915/i915_request.h
> >> > +++ b/drivers/gpu/drm/i915/i915_request.h
> >> > @@ -396,7 +396,9 @@ static inline bool i915_seqno_passed(u32 seq1, u32 seq2)
> >> >
> >> > static inline u32 __hwsp_seqno(const struct i915_request *rq)
> >> > {
> >> > - return READ_ONCE(*rq->hwsp_seqno);
> >> > + const u32 *hwsp = READ_ONCE(rq->hwsp_seqno);
> >> > +
> >> > + return READ_ONCE(*hwsp);
> >>
> >> This is good enough for decouple. But good enough for hardware
> >> might be different thing.
> >>
> >> I am paranoid enough to wanting an rmb(), before the final
> >> read once.
> >
> > What? [That pointer is nothing to do with HW; it's a pointer to a
> > pointer to HW.]
>
> But you do read the value through the pointer to hardware.
>
> CPU:
> rmb(); READ_ONCE(*hwsp);
>
> GPU:
> WRITE_ONCE(*hwsp, seqno), wmb(), interrupt -> cpu.
>
> Thus on waking up, you would be guaranteed to see the
> value gpu intended upon.
The bspec gives us the guarantee that we see the correct value as the
GPU takes care of the cacheline invalidation on writing. We haven't had
reason not to believe that yet, all our issues so far have been the
arrival of the interrupt vs update of the seqno. (Well the whole design
of the request is that we don't really care how long it takes, just that
once a request is complete it stays completed.)
-Chris
More information about the Intel-gfx
mailing list