[Intel-gfx] [PATCH v3] drm/i915/pmu: Reconstruct active state on starting busy-stats
Chris Wilson
chris at chris-wilson.co.uk
Fri Jan 12 13:03:06 UTC 2018
Quoting Tvrtko Ursulin (2018-01-12 11:43:11)
>
> On 12/01/2018 10:35, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-01-12 10:30:26)
> >>
> >>
> >> On 12/01/2018 09:51, Chris Wilson wrote:
> >>> Quoting Tvrtko Ursulin (2018-01-12 09:40:40)
> >>>> So submit side doesn't work in either case, unless I am missing
> >>>> something. Would need the pair of port manipulation and context_in to be
> >>>> atomic.
> >>>
> >>> Sure, there is a small window with the result that we either never turn
> >>> off the stats, or turn them off one too early (and then recover if the
> >>> my understanding of the underflow protection holds). The same problem as
> >>> existed before the reconstruction, except now the window is much
> >>> smaller. I'm not that scared by this (it should not explode, nor result
> >>> in hopelessly wrong stats) so it can wait until stats enabling doesn't
> >>> need to peek at execlists. I think you will need to postpone enabling
> >>> until the next context-switch if we wanted to avoid the atomics; except
> >>> that poses a problem with the igt expecting busy-start to be accurate. A
> >>> dilemma for later.
> >>
> >> My analysis was partially incorrect, yes, there is underflow protection
> >> already.
> >>
> >> But I don't see that there is a race window where we end up with
> >> permanent 100% busyness before the reconstruction patch. Where do you
> >> see that?
> >>
> >> The worst I see without the reconstruction is to miss the accounting of
> >> the batch currently in progress when stats get enabled. Which is a much
> >> less serious, totally ignorable event.
> >
> > One is observable via pmu, the other not. If we fail to turn off
> > busy-stats accounting, nothing is lost except for a few wasted cycles.
> > Except on the next pmu, it starts from the previous cs instead of the
> > enabling -- but that is a problem that also exists with the second user.
>
> I don't follow, I am talking about permanent 100%. Let me copy what I
> wrote earlier:
>
> port0 context complete
> context_out - not enabled, no-op
> stats enable - port0 busy, active = 1
> port0 clear
> submit
> context_in - active = 2 !!! BAD
> port0 set
>
> That leads to permanent 100% until busy stats are disabled. I think that
> is hugely less desirable than just failing to account for the currently
> running batch, which was the case before reconstruction on enable, and
> is 99.9% only a problem for IGTs.
>
> Do you think there is a flaw in this analysis or something else?
No, I don't think it's a huge problem, an improbable race for which the
quick-and-dirty cure may worse than the disease:
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 25360ce0353f..62d9ee9d45a6 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1980,16 +1980,22 @@ intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance)
*/
int intel_enable_engine_stats(struct intel_engine_cs *engine)
{
+ struct intel_engine_execlists *execlists = &engine->execlists;
unsigned long flags;
+ int err = 0;
if (!intel_engine_supports_stats(engine))
return -ENODEV;
+ tasklet_disable(&execlists->tasklet);
spin_lock_irqsave(&engine->stats.lock, flags);
- if (engine->stats.enabled == ~0)
- goto busy;
+
+ if (unlikely(engine->stats.enabled == ~0)) {
+ err = -EBUSY;
+ goto unlock;
+ }
+
if (engine->stats.enabled++ == 0) {
- struct intel_engine_execlists *execlists = &engine->execlists;
const struct execlist_port *port = execlists->port;
unsigned int num_ports = execlists_num_ports(execlists);
@@ -2004,14 +2010,12 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
if (engine->stats.active)
engine->stats.start = engine->stats.enabled_at;
}
- spin_unlock_irqrestore(&engine->stats.lock, flags);
- return 0;
-
-busy:
+unlock:
spin_unlock_irqrestore(&engine->stats.lock, flags);
+ tasklet_enable(&execlists->tasklet);
- return -EBUSY;
+ return err;
}
static ktime_t __intel_engine_get_busy_time(struct intel_engine_cs *engine)
using the tasklet control as a spinlock. Whereas the previous race did
not take much effort to observe.
-Chris
More information about the Intel-gfx
mailing list