[Intel-gfx] [PATCH v3] drm/i915/pmu: Reconstruct active state on starting busy-stats
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Fri Jan 12 13:19:30 UTC 2018
On 12/01/2018 13:03, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-01-12 11:43:11)
>>
>> On 12/01/2018 10:35, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-01-12 10:30:26)
>>>>
>>>>
>>>> On 12/01/2018 09:51, Chris Wilson wrote:
>>>>> Quoting Tvrtko Ursulin (2018-01-12 09:40:40)
>>>>>> So submit side doesn't work in either case, unless I am missing
>>>>>> something. Would need the pair of port manipulation and context_in to be
>>>>>> atomic.
>>>>>
>>>>> Sure, there is a small window with the result that we either never turn
>>>>> off the stats, or turn them off one too early (and then recover, if my
>>>>> understanding of the underflow protection holds). The same problem as
>>>>> existed before the reconstruction, except now the window is much
>>>>> smaller. I'm not that scared by this (it should not explode, nor result
>>>>> in hopelessly wrong stats) so it can wait until stats enabling doesn't
>>>>> need to peek at execlists. I think you will need to postpone enabling
>>>>> until the next context-switch if we wanted to avoid the atomics; except
>>>>> that poses a problem with the igt expecting busy-start to be accurate. A
>>>>> dilemma for later.
>>>>
>>>> My analysis was partially incorrect, yes, there is underflow protection
>>>> already.
>>>>
>>>> But I don't see that there is a race window where we end up with
>>>> permanent 100% busyness before the reconstruction patch. Where do you
>>>> see that?
>>>>
>>>> The worst I see without the reconstruction is to miss the accounting of
>>>> the batch currently in progress when stats get enabled. Which is a much
>>>> less serious, totally ignorable event.
>>>
>>> One is observable via pmu, the other not. If we fail to turn off
>>> busy-stats accounting, nothing is lost except for a few wasted cycles.
>>> Except on the next pmu, it starts from the previous cs instead of the
>>> enabling -- but that is a problem that also exists with the second user.
>>
>> I don't follow, I am talking about permanent 100%. Let me copy what I
>> wrote earlier:
>>
>> port0 context complete
>> context_out - not enabled, no-op
>> stats enable - port0 busy, active = 1
>> port0 clear
>> submit
>> context_in - active = 2 !!! BAD
>> port0 set
>>
>> That leads to permanent 100% until busy stats are disabled. I think that
>> is far less desirable than just failing to account for the currently
>> running batch, which was the case before reconstruction on enable, and
>> is 99.9% only a problem for IGTs.
>>
>> Do you think there is a flaw in this analysis or something else?
>
> No, I don't think it's a huge problem, an improbable race for which the
> quick-and-dirty cure may be worse than the disease:
>
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 25360ce0353f..62d9ee9d45a6 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1980,16 +1980,22 @@ intel_engine_lookup_user(struct drm_i915_private *i915, u8 class, u8 instance)
> */
> int intel_enable_engine_stats(struct intel_engine_cs *engine)
> {
> + struct intel_engine_execlists *execlists = &engine->execlists;
> unsigned long flags;
> + int err = 0;
>
> if (!intel_engine_supports_stats(engine))
> return -ENODEV;
>
> + tasklet_disable(&execlists->tasklet);
> spin_lock_irqsave(&engine->stats.lock, flags);
> - if (engine->stats.enabled == ~0)
> - goto busy;
> +
> + if (unlikely(engine->stats.enabled == ~0)) {
> + err = -EBUSY;
> + goto unlock;
> + }
> +
> if (engine->stats.enabled++ == 0) {
> - struct intel_engine_execlists *execlists = &engine->execlists;
> const struct execlist_port *port = execlists->port;
> unsigned int num_ports = execlists_num_ports(execlists);
>
> @@ -2004,14 +2010,12 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine)
> if (engine->stats.active)
> engine->stats.start = engine->stats.enabled_at;
> }
> - spin_unlock_irqrestore(&engine->stats.lock, flags);
>
> - return 0;
> -
> -busy:
> +unlock:
> spin_unlock_irqrestore(&engine->stats.lock, flags);
> + tasklet_enable(&execlists->tasklet);
>
> - return -EBUSY;
> + return err;
> }
>
> static ktime_t __intel_engine_get_busy_time(struct intel_engine_cs *engine)
>
> using the tasklet control as a spinlock. Whereas the previous race did
> not take much effort to observe.
By "the previous race", are you referring to the missed currently-running
batch, or to the permanent 100% with active-state reconstruction?
Are you against reverting the reconstruction? Even if the race is
unlikely, I see it as very severe, while the missed current batch is more
likely but not at all severe. So I would really be in favour of the
revert. Unfortunately I did not spot this issue during review. :(
Regards,
Tvrtko