[Intel-gfx] [PATCH] drm/i915/execlists: Poison the CSB after use
Chris Wilson
chris at chris-wilson.co.uk
Tue Oct 30 09:37:15 UTC 2018
Quoting Mika Kuoppala (2018-10-30 09:31:56)
> Chris Wilson <chris at chris-wilson.co.uk> writes:
>
> > After reading the event status from the CSB, write back 0 (an invalid
> > value) so we can detect if the HW should signal a new event without
> > writing the event in the future.
> >
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=108315
> > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> > ---
> > drivers/gpu/drm/i915/intel_lrc.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> > index 22b57b8926fc..126efe20d2d6 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -910,6 +910,9 @@ static void process_csb(struct intel_engine_cs *engine)
> > execlists->active);
> >
> > status = buf[2 * head];
> > + GEM_BUG_ON(!status);
>
> Assuming we still have a timing issue in here, how about
> we poll a little until status != 0 and then continue with a warning?
If there's any race condition here, we definitely do not want to paper
over it.
> We could recover by finding the 'bit late' status, instead of
> oopsing out.
Oopsing out tells us where the problem is very concisely.
> > + GEM_DEBUG_EXEC(WRITE_ONCE(*(u32 *)(buf + 2 * head), 0));
>
> What I am afraid here is that we change the timing and cache dynamics
> for our debug builds so that we bury the pesky thing.
That too is a result.
> Perhaps I am wandering too far but let's consider for the csb loop:
>
> read head,tail;
> rmb();
>
> for_each_csb() {
> 64 bit read
> 64 bit write to zero it, unconditionally
> act_on_it()
> }
>
> Too heavy?
Too papery - shouts that we don't know what we or the hw is doing. We
want to pretend that we know what we are doing at least.
-Chris
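
For readers following the thread, here is a minimal userspace sketch of the poison-after-read idea discussed above. It is not the i915 code: the names csb_ring, CSB_ENTRIES and process_ring are hypothetical, the ring size is arbitrary, and a plain error return stands in for GEM_BUG_ON. It only illustrates how writing 0 back after each read lets the consumer notice an entry that was signalled but never written.

```c
#include <assert.h>
#include <stdint.h>

#define CSB_ENTRIES 6 /* hypothetical ring size, chosen for illustration */

/* Hypothetical stand-in for the CSB: two u32 per entry, status word first. */
struct csb_ring {
	volatile uint32_t buf[2 * CSB_ENTRIES];
	unsigned int head; /* index of the last entry consumed */
};

/*
 * Consume every entry after head, up to and including tail.
 * Returns the number of entries processed, or -1 as soon as an
 * entry still holds the poison value (0), i.e. the producer moved
 * tail without the status write being visible to us.
 */
static int process_ring(struct csb_ring *ring, unsigned int tail)
{
	int processed = 0;

	while (ring->head != tail) {
		uint32_t status;

		ring->head = (ring->head + 1) % CSB_ENTRIES;
		status = ring->buf[2 * ring->head];
		if (status == 0)
			return -1; /* the patch would hit GEM_BUG_ON here */

		/* Poison the slot so a stale value is never mistaken for new. */
		ring->buf[2 * ring->head] = 0;
		processed++;
	}

	return processed;
}
```

Called with head at the last slot and tail at 1, with non-zero status words in entries 0 and 1, the function processes two entries and leaves both status words zeroed; a later call with tail advanced to an entry that was never written returns -1 instead of acting on stale data.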