[Intel-gfx] [PATCH i-g-t v4] benchmarks/gem_wsim: Command submission workload simulator

Thu Apr 20 15:11:28 UTC 2017

On Thu, Apr 20, 2017 at 03:34:56PM +0100, Tvrtko Ursulin wrote:
> 
> On 20/04/2017 15:23, Chris Wilson wrote:
> >On Thu, Apr 20, 2017 at 01:29:11PM +0100, Tvrtko Ursulin wrote:
> >>+static void
> >>+alloc_step_batch(struct workload *wrk, struct w_step *w, struct w_step_eb *b,
> >>+		 enum intel_engine_id engine, unsigned int flags)
> >>+{
> >>+	unsigned int bb_i, j = 0;
> >>+
> >>+	b->obj[j].handle = gem_create(fd, 4096);
> >>+	b->obj[j].flags = EXEC_OBJECT_WRITE;
> >>+	j++;
> >>+
> >>+	if (flags & SEQNO) {
> >>+		b->obj[j].handle = wrk->status_page_handle;
> >>+		j++;
> >>+	}
> >>+
> >>+	bb_i = j++;
> >>+	b->bb_sz = get_bb_sz(w->duration.max);
> >>+	b->bb_handle = b->obj[bb_i].handle = gem_create(fd, b->bb_sz);
> >>+	terminate_bb(w, b, engine, flags);
> >>+
> >>+	igt_assert(w->dependency <= 0);
> >>+	if (w->dependency) {
> >>+		int dep_idx = w->idx + w->dependency;
> >>+
> >>+		igt_assert(dep_idx >= 0 && dep_idx < wrk->nr_steps);
> >>+		igt_assert(wrk->steps[dep_idx].type == BATCH);
> >>+
> >>+		b->obj[j].handle = b->obj[bb_i].handle;
> >>+		bb_i = j;
> >>+		b->obj[j - 1].handle = wrk->steps[dep_idx].b[0].obj[0].handle;
> >>+		j++;
> >>+
> >>+		if (wrk->steps[dep_idx].b[1].obj[0].handle) {
> >>+			b->obj[j].handle = b->obj[bb_i].handle;
> >>+			bb_i = j;
> >>+			b->obj[j - 1].handle =
> >>+					wrk->steps[dep_idx].b[1].obj[0].handle;
> >>+			j++;
> >>+		}
> >>+	}
> >>+
> >>+	if (flags & SEQNO) {
> >>+		b->reloc.presumed_offset = -1;
> >
> >So as I understand it, you are caching the execbuf/obj/reloc for the
> >workload and then may reissue later with different seqno on different
> >rings? In which case we have a problem as the kernel will write back the >
> >
> >updated offsets to b->reloc.presumed_offset and b->obj[].offset and in
> >future passes they will match and the seqno write will go into the wrong
> >slot (if it swaps rings).
> >
> >You either want to reset presumed_offset=-1 each time, or better for all
> >concerned write the correct address alongside the seqno (which also
> >enables NORELOC).
> >
> >Delta incoming.
> 
> Only the seqno changes, but each engine has its own eb/obj/reloc. So
> I think there is no problem. Or is there still?

bb_handle is per engine as well. Ugh. No, that seems like you are
self-consistent, you just need to remove the -1 and your code is NORELOC
correct.

I wouldn't go so far as having entirely separate batches for each engine
though :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre