[igt-dev] [PATCH i-g-t v3 2/2] tests/i915/perf_pmu: Add second store dword for looping
Kamil Konieczny
kamil.konieczny at linux.intel.com
Wed Jul 20 12:08:08 UTC 2022
Hi Zbigniew,
On 2022-07-20 at 13:43:54 +0200, Zbigniew Kempczyński wrote:
> On Wed, Jul 20, 2022 at 01:20:55PM +0200, Kamil Konieczny wrote:
> > Hi Zbigniew,
> >
> > On 2022-07-19 at 13:51:41 +0200, Zbigniew Kempczyński wrote:
> > > On Broadwell/Braswell we sometimes observe that the semaphore wait
> > > doesn't start. It's hard to say what the main reason for this
> > > behavior is - it happens rarely (every 2-3 months). To find out
> > > whether this is a delayed write, let's add an additional store dword
> > > for the loop condition. Also add some assertions to catch unexpected
> > > sw/hw behavior.
> > >
> > > Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski at intel.com>
> > > ---
> > >
> > > v2: Add timeout to break the loop
> > > ---
> > > tests/i915/perf_pmu.c | 42 +++++++++++++++++++++++++++++++++---------
> > > 1 file changed, 33 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/tests/i915/perf_pmu.c b/tests/i915/perf_pmu.c
> > > index 39e9fc5fef..9b951df125 100644
> > > --- a/tests/i915/perf_pmu.c
> > > +++ b/tests/i915/perf_pmu.c
> > > @@ -812,6 +812,12 @@ create_sema(int gem_fd, uint64_t ahnd,
> > > 0,
> > > 1,
> > >
> > > + /* Separate write for loop */
> > > + MI_STORE_DWORD_IMM,
> > > + 0,
> > > + 0,
> > > + 2,
> > > +
> >
> > imho this should be after the semaphore; I see no point in adding
> > this right after the first write and then waiting on map[1] == 2.
> > Maybe it is worth adding this after the semaphore and waiting on
> > map[0] == 1? Or, when one write fails, we could run a second round
> > with two consecutive writes?
>
> According to the loop comparison (map[1] != 2) we would never unblock
> the semaphore. The bug I observe on BDW/BSW leads me to the conclusion
> that the semaphore read may come from scratch, so the semaphore wait
> would never happen. I want to have two separate writes, one for the
> semaphore and a second for the CPU loop. I suspect a delayed write,
> but we have other cases where this behavior is not observed. So at the
> very least I want to have more data if the same issue reoccurs.
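Ok, so the protocol as I now read the patch is (a sketch; the release
write happens later in the caller):

	/*
	 *   GPU batch                       CPU (__sema_busy)
	 *   ---------                       -----------------
	 *   map[0] = 1   (semaphore)
	 *   map[1] = 2   (loop marker)      spin until map[1] == 2
	 *   wait until map[0] != 1          sample busy/sema counters
	 *                                   map[0] = 0   (releases wait)
	 */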
>
> >
> > > /* Wait until the semaphore value is set to 0 [by caller] */
> >
> > I do not see "map[0] = 0" in the code?
>
> *map = 0; __sync_synchronize();
Yes, I missed this, as it happens after
gem_close(gem_fd, obj.handle).
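For reference, the release side later in __sema_busy() is:

	/* The batch polls until map[0] != 1, so clearing it lets the
	 * semaphore wait complete and the batch retire. */
	*map = 0;
	__sync_synchronize();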
> > > MI_SEMAPHORE_WAIT | MI_SEMAPHORE_POLL | MI_SEMAPHORE_SAD_NEQ_SDD,
> > > 1,
> > > @@ -827,17 +833,22 @@ create_sema(int gem_fd, uint64_t ahnd,
> > > handle = gem_create(gem_fd, 4096);
> > > *poffset = get_offset(ahnd, handle, 4096, 0);
> > >
> > > - memset(reloc, 0, 2 * sizeof(*reloc));
> > > + memset(reloc, 0, 3 * sizeof(*reloc));
> > > reloc[0].target_handle = handle;
> > > reloc[0].offset = 64 + 1 * sizeof(uint32_t);
> > > reloc[1].target_handle = handle;
> > > - reloc[1].offset = 64 + 6 * sizeof(uint32_t);
> > > + reloc[1].offset = 64 + 5 * sizeof(uint32_t);
> > > + reloc[1].delta = 4;
> > > + reloc[2].target_handle = handle;
> > > + reloc[2].offset = 64 + 10 * sizeof(uint32_t);
> > >
> > > if (ahnd) {
> > > cs[1] = *poffset;
> > > cs[2] = *poffset >> 32;
> > > - cs[6] = *poffset;
> > > - cs[7] = *poffset >> 32;
> > > + cs[5] = *poffset + 4;
> > > + cs[6] = (*poffset + 4) >> 32;
> > > + cs[10] = *poffset;
> > > + cs[11] = *poffset >> 32;
> > > }
> > >
> > > gem_write(gem_fd, handle, 64, cs, sizeof(cs));
> > > @@ -856,10 +867,10 @@ __sema_busy(int gem_fd, uint64_t ahnd, int pmu, const intel_ctx_t *ctx,
> > > };
> > > uint64_t total, sema, busy;
> > > uint64_t start[2], val[2];
> > > - struct drm_i915_gem_relocation_entry reloc[2];
> > > + struct drm_i915_gem_relocation_entry reloc[3];
> > > struct drm_i915_gem_exec_object2 obj = {
> > > .handle = create_sema(gem_fd, ahnd, reloc, &obj.offset),
> > > - .relocation_count = !ahnd ? 2 : 0,
> > > + .relocation_count = !ahnd ? 3 : 0,
> > > .relocs_ptr = to_user_pointer(reloc),
> > > .flags = !ahnd ? 0 : EXEC_OBJECT_PINNED,
> > > };
> > > @@ -872,19 +883,32 @@ __sema_busy(int gem_fd, uint64_t ahnd, int pmu, const intel_ctx_t *ctx,
> > > };
> > > igt_spin_t *spin;
> > > uint32_t *map;
> > > + struct timespec tv = {};
> > > + int timeout = 3;
> > >
> > > /* Time spent being busy includes time waiting on semaphores */
> > > igt_assert(busy_pct >= sema_pct);
> > >
> > > gem_quiescent_gpu(gem_fd);
> > >
> > > - map = gem_mmap__device_coherent(gem_fd, obj.handle, 0, 4096, PROT_WRITE);
> > > + map = gem_mmap__device_coherent(gem_fd, obj.handle, 0, 4096, PROT_READ | PROT_WRITE);
> > > gem_execbuf(gem_fd, &eb);
> > > + igt_assert(gem_bo_busy(gem_fd, obj.handle));
> >
> > This assert here prevents your warnings below from happening; maybe
> > just store this as a boolean for later?
>
> Execbuf makes the object busy, so under normal conditions it would
> never assert. If it asserts here, the wait has already completed,
> which means the semaphore wait didn't happen.
But that is exactly what you wrote above about reading from scratch,
so why not remember it here and dump the debug output later?
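Something like this is what I have in mind (a rough sketch, untested;
busy_after_execbuf is just a name I made up):

	bool busy_after_execbuf;

	gem_execbuf(gem_fd, &eb);
	/* Record instead of asserting, so the diagnostics below
	 * still run when the semaphore wait never started. */
	busy_after_execbuf = gem_bo_busy(gem_fd, obj.handle);
	...
	if (!busy_after_execbuf)
		igt_warn("bo not busy after execbuf, map[0] = %u, map[1] = %u\n",
			 map[0], map[1]);
	igt_assert(busy_after_execbuf);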
> > > spin = igt_spin_new(gem_fd, .ahnd = ahnd, .ctx = ctx, .engine = e->flags);
> > >
> > > - /* Wait until the batch is executed and the semaphore is busy-waiting */
> > > - while (!READ_ONCE(*map) && gem_bo_busy(gem_fd, obj.handle))
> > > + /*
> > > + * Wait until the batch is executed and the semaphore is busy-waiting.
> > > + * Also stop on timeout.
> > > + */
> > > + igt_nsec_elapsed(&tv);
> > > + while (READ_ONCE(map[1]) != 2 && gem_bo_busy(gem_fd, obj.handle) &&
> > > + igt_seconds_elapsed(&tv) < timeout)
> > > ;
> > > + if (igt_seconds_elapsed(&tv) >= timeout)
> > > + igt_warn("timeout: [%u/%u]\n", igt_seconds_elapsed(&tv), timeout);
Hmm, maybe use igt_debug() here?
> > > + if (!gem_bo_busy(gem_fd, obj.handle))
> > > + igt_warn("Bo != busy, map[0] = %u, map[1] = %u\n", map[0], map[1]);
This will not warn if the assert above fires first.
--
Kamil
> > > + igt_assert(map[0] == 1);
> > > igt_assert(gem_bo_busy(gem_fd, obj.handle));
> > > gem_close(gem_fd, obj.handle);
> > >
> > > --
> > > 2.34.1
> > >