[igt-dev] [PATCH i-g-t 1/6] test/perf: Drop caches when closing perf stream

Umesh Nerlige Ramappa umesh.nerlige.ramappa at intel.com
Wed Mar 4 17:51:56 UTC 2020


On Wed, Mar 04, 2020 at 10:45:55AM +0200, Lionel Landwerlin wrote:
>On 04/03/2020 00:57, Umesh Nerlige Ramappa wrote:
>>On Tue, Mar 03, 2020 at 02:38:08PM -0800, Umesh Nerlige Ramappa wrote:
>>>Running ./build/tests/perf will run all the perf subtests in a sequence.
>>>When running tests in a sequence, subsequent tests may not run with a
>>>clean slate. For resources that are lazily released, drop caches in
>>>__perf_close.
>>
>>Hi Lionel, Chris,
>>
>>I noticed an issue on TGL when running the entire suite of perf
>>tests. In my setup, the polling test was failing because invalid
>>reports were seen at the beginning of the OA buffer. The issue is
>>more prominent with the newly added subtests that call perf_open
>>and perf_close a couple of times (blocking-with-interrupt, say).
>>
>>In some runs, the second test sees a bunch of unlanded reports at
>>the beginning of the OA buffer. Assuming that we already wait for
>>the NOA config with a noa_wait bo, I tried to look into this further.
>>
>>free_oa_buffer is called to free the oa_buffer bo, and the driver
>>defers this work. If a test starts before this free completes, we
>>see the issue. To test this theory, I commented out free_oa_buffer
>>entirely, and the tests pass without any issues, since new GTT
>>memory is allocated each time.
>>
>>I suspect something is missing between the deferred free and the
>>allocation of a new OA buffer for the subsequent test. Maybe the TLBs
>>are not being dropped? I imagine the OA unit might write valid reports
>>to wherever its TLBs point while the CPU looks for them elsewhere
>>(until the free completes). Just a theory though. Let me know what
>>you think.
>>
>>For now igt_drop_caches_set(DROP_FREED) is what is helping and hence 
>>this patch.
>
>
>Hey Umesh,
>
>
>I guess this could be fixed by this commit:
>
>
>commit 4b4e973d5eb89244b67d3223b60f752d0479f253
>Author: Chris Wilson <chris at chris-wilson.co.uk>
>Date:   Mon Mar 2 08:57:57 2020 +0000
>
>    drm/i915/perf: Reintroduce wait on OA configuration completion
>
>If you can give this commit a try or rebase on drm-tip it would be 
>great to confirm.

I thought this commit was ensuring that the noa_wait is executed 
completely before we enable the OA buffer captures. That still does not 
explain why the issue goes away for me when I comment out 
free_oa_buffer.
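
To make the sequence I am describing concrete, here is a toy model in
plain C. It is not driver or IGT code: every name in it (toy_dev,
toy_perf_close, toy_drop_freed, toy_perf_open_is_clean) is made up for
illustration, and it only captures the ordering I suspect, not the actual
cause. A close queues a deferred free, and the next open is "clean" only
if no free is still outstanding.

```c
#include <stdbool.h>

/* Toy model only: these names are hypothetical and do not exist in
 * i915 or IGT. It models ordering, not real hardware behaviour. */
struct toy_dev {
	int pending_frees;	/* OA buffer frees queued but not completed */
};

/* Close a stream. When do_free is true, queue a deferred free of its
 * OA buffer; do_free == false models commenting out the free. */
static void toy_perf_close(struct toy_dev *dev, bool do_free)
{
	if (do_free)
		dev->pending_frees++;
}

/* Stand-in for flushing deferred frees before the next test runs
 * (the role dropping freed objects seems to play in my runs). */
static void toy_drop_freed(struct toy_dev *dev)
{
	dev->pending_frees = 0;
}

/* Returns true if a stream opened now would start with no deferred
 * free still outstanding. */
static bool toy_perf_open_is_clean(const struct toy_dev *dev)
{
	return dev->pending_frees == 0;
}
```

In this model, an open right after toy_perf_close(dev, true) is not
clean until toy_drop_freed() runs, while toy_perf_close(dev, false)
never leaves anything pending. That matches both observations: the drop
helps, and so does removing the free. The noa_wait does not appear in
the model at all, which is why I do not see how that commit alone would
explain it.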

Thanks,
Umesh

>
>Otherwise we might need more digging to figure what's going on.
>
>
>Thanks,
>
>
>-Lionel
>
>
>>
>>Thanks,
>>Umesh
>>
>>>
>>>Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa at intel.com>
>>>---
>>>tests/perf.c | 7 ++++++-
>>>1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>>diff --git a/tests/perf.c b/tests/perf.c
>>>index 5e818030..189c6aa1 100644
>>>--- a/tests/perf.c
>>>+++ b/tests/perf.c
>>>@@ -244,6 +244,12 @@ __perf_close(int fd)
>>>        close(pm_fd);
>>>        pm_fd = -1;
>>>    }
>>>+
>>>+    /* When running tests in a sequence, subsequent tests may not run with a
>>>+     * clean slate. For resources that are lazily released, cleanup here.
>>>+     */
>>>+    if (drm_fd >= 0 && !getgid() && !getuid())
>>>+        gem_quiescent_gpu(drm_fd);
>>>}
>>>
>>>static int
>>>@@ -3993,7 +3999,6 @@ test_rc6_disable(void)
>>>    igt_assert_eq(n_events_end - n_events_start, 0);
>>>
>>>    __perf_close(stream_fd);
>>>-    gem_quiescent_gpu(drm_fd);
>>>
>>>    n_events_start = rc6_residency_ms();
>>>    nanosleep(&(struct timespec){ .tv_sec = 1, .tv_nsec = 0 }, NULL);
>>>-- 
>>>2.20.1
>>>
>
>

