[igt-dev] [PATCH i-g-t 1/3] tests/perf_pmu: Tighten busy measurement

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Thu Feb 1 16:58:21 UTC 2018


On 01/02/2018 16:39, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-02-01 16:26:58)
>>
>> On 01/02/2018 12:57, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-02-01 12:47:44)
>>>> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>>
>>>> In cases where we manually terminate the busy batch, we always want to
>>>> sample busyness while the batch is running, just before we terminate
>>>> it, and not the other way around. This way we make the window in which
>>>> unwanted idleness can get sampled smaller.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>>> ---
>>>>    tests/perf_pmu.c | 28 +++++++++++++---------------
>>>>    1 file changed, 13 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/tests/perf_pmu.c b/tests/perf_pmu.c
>>>> index 2f7d33414a53..bf16e5e8b1f9 100644
>>>> --- a/tests/perf_pmu.c
>>>> +++ b/tests/perf_pmu.c
>>>> @@ -146,10 +146,9 @@ single(int gem_fd, const struct intel_execution_engine2 *e, bool busy)
>>>>                   spin = NULL;
>>>>    
>>>>           slept = measured_usleep(batch_duration_ns / 1000);
>>>> -       igt_spin_batch_end(spin);
>>>> -
>>>>           val = pmu_read_single(fd);
>>>>    
>>>> +       igt_spin_batch_end(spin);
>>>
>>> But that's the wrong way round as we are measuring busyness, and the
>>> sampling should terminate as soon as the spin-batch ends, before we even
>>> read the PMU sample? For the timer sampler, it's lost in the noise.
>>>
>>> So the idea was to cancel the busyness asap so that the sampler stops
>>> updating before we even have to cross into the kernel for the PMU read.
>>
>> I don't follow the problem statement. This is how the code used to look
>> in many places:
>>
>>          slept = measured_usleep(batch_duration_ns / 1000);
>>          igt_spin_batch_end(spin);
>>
>>          pmu_read_multi(fd[0], num_engines, val);
>>
>> The problem here is that there is a nondeterministic time gap, depending
>> on test execution speed, between requesting the batch to end and reading
>> the counter. This can add a random amount of idleness to the read value.
> 
> But we are not measuring idleness, but busyness. The batch will end a

Yep, and that's why I removed some idleness from the busyness! :) Plus it 
removes the time required to signal batch end from it as well.

> few tens of nanoseconds after the write hits memory. The desire is that
> time is zero so that the sleep exactly corresponds with the interval the
> batch was spinning (busy).
> 
>> And there is another source of nondeterminism in how long it takes for
>> the batch end request to get picked up by the GPU. If that is slower in
>> some cases, the counter will drift from the expected value in the other
>> direction, overestimating busyness relative to the sleep duration.
>>
>> Attempted improvement was simply to reverse the last two lines, so we
>> read the counter when we know it is busy (after the sleep), and then
>> request batch termination.
>>
>> This only leaves the scheduling delay between end of sleep and counter
>> read, which is smaller than end of sleep - batch end - counter read.
>>
>> These tests are not testing for edge conditions, just that the busy
>> engines are reported as busy, in various combinations, so that sounded
>> like a reasonable change.
>>
>> I hope I did not get confused here, it wouldn't be the first time in
>> these tests...
> 
> Aiui, the PMU measures busyness, which is the duration of the spin-batch
> (or since the first enabling of the PMU event). The choice is between a
> context switch into the kernel to stop the counter, or a single write to
> memory to stop the batch.
> 
> My bet is the write to memory will turn off the counter within 100ns,
> worst case including the interrupt processing. The PMU event stopping
> I estimate at 100ns best case (including the kernel context switch, and
> don't ask how KPTI perturbs that switch as I don't know off hand how
> much worse it will get).

There is no PMU event stopping in the changed bits, so that is not part of 
the measurement either before or after.
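
For reference, a minimal sketch of what a read-only PMU sample could look 
like; this is an assumption about the helper, not a quote from perf_pmu.c, 
and it assumes the event fd was opened without PERF_FORMAT_GROUP and with 
PERF_FORMAT_TOTAL_TIME_ENABLED in the read format:

        static uint64_t pmu_read_single(int fd)
        {
                uint64_t data[2];

                /*
                 * A plain read() only samples the counter; it does not
                 * stop or disable the event, so there is nothing to
                 * "stop" on this side.
                 */
                igt_assert_eq(read(fd, data, sizeof(data)), sizeof(data));

                return data[0]; /* data[1] would be TIME_ENABLED */
        }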

> Now, the write to memory is asynchronous to the PMU event stopping. So
> if you write first, whichever is processed first (the kernel context
> switch + event stop, or the CS interrupt) causes the busy counter to
> cease. So best of both worlds?

I am just moving the two sample read-outs closer together in time. The two 
samples are the measure of how long we slept and the measure of PMU 
busyness. There should be as little time as possible between the two 
readouts. I don't see how ending the spinner, or anything else really, 
belongs in this sandwich.
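
To illustrate, the sequence in single() after the patch ends up roughly 
like this (paraphrasing the diff above, with the surrounding code omitted):

        /* Sample one: how long we actually slept. */
        slept = measured_usleep(batch_duration_ns / 1000);

        /* Sample two: PMU busyness, read immediately after the sleep,
         * while the spinner is still known to be running.
         */
        val = pmu_read_single(fd);

        /* Only now ask the spinner to terminate. */
        igt_spin_batch_end(spin);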

Regards,

Tvrtko

