[Intel-gfx] [PATCH i-g-t v3] benchmarks/gem_wsim: Command submission workload simulator
Chris Wilson
chris at chris-wilson.co.uk
Fri Apr 7 09:51:04 UTC 2017
On Fri, Apr 07, 2017 at 09:53:05AM +0100, Tvrtko Ursulin wrote:
>
> On 06/04/2017 09:55, Chris Wilson wrote:
> >On Thu, Apr 06, 2017 at 09:18:36AM +0100, Tvrtko Ursulin wrote:
>
> [snip]
[snip]
> >>>>+ if (swap_vcs && engine == VCS1)
> >>>>+ engine = VCS2;
> >>>>+ else if (swap_vcs && engine == VCS2)
> >>>>+ engine = VCS1;
> >>>>+ w->eb.flags = eb_engine_map[engine];
> >>>>+ w->eb.flags |= I915_EXEC_HANDLE_LUT;
> >>>>+ if (!seqnos)
> >>>>+ w->eb.flags |= I915_EXEC_NO_RELOC;
> >>>
> >>>Doesn't look too hard to get the relocation right. Forcing relocations
> >>>between batches is probably a good one to check (just to say don't do
> >>>that)
> >>
> >>I am not following here? You are saying don't do relocations at all?
> >>How do I make sure things stay fixed and even how to find out where
> >>they are in the first pass?
> >
> >Depending on the workload, it may be informative to also do comparisons
> >between NORELOC and always RELOC. Personally I would make sure we were
> >using NORELOC as this should be a simulator/example.
>
> How do I use NORELOC? I mean, I have to know where the objects will
> be pinned, or be able to pin them first and know they will remain
> put. What am I not understanding here?
It will be assigned an address on first execution. Can I quote the spiel
I wrote for i915_gem_execbuffer.c and see if that answers how to use
NORELOC:
* Reserving resources for the execbuf is the most complicated phase. We
* neither want to have to migrate the object in the address space, nor do
* we want to have to update any relocations pointing to this object. Ideally,
* we want to leave the object where it is and for all the existing relocations
* to match. If the object is given a new address, or if userspace thinks the
* object is elsewhere, we have to parse all the relocation entries and update
* the addresses. Userspace can set the I915_EXEC_NO_RELOC flag to hint that
* all the target addresses in all of its objects match the value in the
* relocation entries and that they all match the presumed offsets given by the
* list of execbuffer objects. Using this knowledge, we know that if we haven't
* moved any buffers, all the relocation entries are valid and we can skip
* the update. (If userspace is wrong, the likely outcome is an impromptu GPU
* hang.) The requirements for using I915_EXEC_NO_RELOC are:
*
* The addresses written in the objects must match the corresponding
* reloc.presumed_offset which in turn must match the corresponding
* execobject.offset.
*
* Any render targets written to in the batch must be flagged with
* EXEC_OBJECT_WRITE.
*
* To avoid stalling, execobject.offset should match the current
* address of that object within the active context.
*
Does that make sense? What questions remain unanswered?
Hmm, I usually sum it up as
batch[reloc.offset] == reloc.presumed_offset + reloc.delta;
and
execobj.offset == reloc.presumed_offset
must be true at the time of execbuf. Note that upon relocation,
batch[reloc.offset], reloc.presumed_offset and execobj.offset are
updated. This is important to remember if you are prerecording the
reloc/execobj arrays, and not feeding back the results of execbuf
between phases.
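As a rough illustration of those two conditions, here is a minimal sketch of a userspace-side sanity check. The struct and function names (sketch_reloc, sketch_execobj, noreloc_valid, sketch_relocate) are hypothetical stand-ins, not the real drm_i915_gem_relocation_entry / drm_i915_gem_exec_object2 layouts, and it assumes 64-bit addresses in the batch and a reloc target that is an index into the execobj array (as with HANDLE_LUT).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the uAPI structs (hypothetical layout). */
struct sketch_reloc {
	uint32_t target;          /* index into the execobj array */
	uint32_t delta;           /* added to the target's address */
	uint64_t offset;          /* byte offset in the batch holding the address */
	uint64_t presumed_offset; /* where userspace believes the target is */
};

struct sketch_execobj {
	uint64_t offset;          /* address passed in for this object */
};

/* True if every relocation satisfies both conditions quoted above. */
static bool noreloc_valid(const void *batch, size_t batch_len,
			  const struct sketch_reloc *relocs, size_t nreloc,
			  const struct sketch_execobj *objs, size_t nobj)
{
	for (size_t i = 0; i < nreloc; i++) {
		const struct sketch_reloc *r = &relocs[i];
		uint64_t written;

		if (r->target >= nobj || batch_len < sizeof(written) ||
		    r->offset > batch_len - sizeof(written))
			return false;

		/* batch[reloc.offset] == reloc.presumed_offset + reloc.delta */
		memcpy(&written, (const char *)batch + r->offset, sizeof(written));
		if (written != r->presumed_offset + r->delta)
			return false;

		/* execobj.offset == reloc.presumed_offset */
		if (objs[r->target].offset != r->presumed_offset)
			return false;
	}
	return true;
}

/*
 * Mimic what a relocation does to the three values: all of them are
 * updated together. The caller must already have validated r.
 */
static void sketch_relocate(void *batch, struct sketch_reloc *r,
			    struct sketch_execobj *objs, uint64_t new_addr)
{
	uint64_t value = new_addr + r->delta;

	memcpy((char *)batch + r->offset, &value, sizeof(value));
	r->presumed_offset = new_addr;
	objs[r->target].offset = new_addr;
}
```

A prerecorded copy of the reloc array taken before sketch_relocate() would fail noreloc_valid() afterwards, which is the feedback hazard mentioned above.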
> But in general is this correctly implementing your idea for queue
> depth estimation?
From my rough checklist:
* writes engine->next_seqno++ after each op (in this case end of batch)
* qlen[engine] = engine->next_seqno - *engine->current_seqno;
Design looks right. Implementation requires checking... I'll be back.
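For reference, a minimal sketch of that bookkeeping, not the gem_wsim implementation. The names (sketch_engine, sketch_submit, sketch_complete, sketch_qlen) are hypothetical, and it assumes each batch writes the pre-incremented seqno so that the subtraction reads zero on an idle engine. Unsigned arithmetic keeps the difference correct across wraparound.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_engine {
	uint32_t next_seqno;              /* last seqno handed to a batch */
	volatile uint32_t *current_seqno; /* last seqno written on completion */
};

/* Submit one batch: returns the seqno it will write when it finishes. */
static uint32_t sketch_submit(struct sketch_engine *e)
{
	return ++e->next_seqno;
}

/* Stand-in for the GPU writing the seqno at the end of the batch. */
static void sketch_complete(struct sketch_engine *e, uint32_t seqno)
{
	*e->current_seqno = seqno;
}

/* qlen[engine] = engine->next_seqno - *engine->current_seqno */
static uint32_t sketch_qlen(const struct sketch_engine *e)
{
	return e->next_seqno - *e->current_seqno;
}
```

With two batches submitted and none completed sketch_qlen() reads 2, and it drops back to 0 once the second seqno has been written.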
-Chris
--
Chris Wilson, Intel Open Source Technology Centre