[Intel-gfx] [PATCH 2/2] drm/i915: allow sync points within batches
Jesse Barnes
jbarnes at virtuousgeek.org
Wed Sep 3 21:01:49 CEST 2014
On Wed, 3 Sep 2014 17:08:53 +0100
Chris Wilson <chris at chris-wilson.co.uk> wrote:
> On Wed, Sep 03, 2014 at 08:41:06AM -0700, Jesse Barnes wrote:
> > On Wed, 3 Sep 2014 08:01:55 +0100
> > Chris Wilson <chris at chris-wilson.co.uk> wrote:
> >
> > > These commands are illegal/invalid inside the object, only valid inside
> > > the ring.
> >
> > Hm, we ought to be able to write to non-privileged space with
> > STORE_DWORD, but that does mean moving to context-specific pages in
> > process space, or at least adding them to our existing scheme.
>
> The per-process context page also doesn't exist generically. I certainly
> hope that userspace can't overwrite the hws! Imagine if we were using
> that for interrupt status reads, or seqno tracking...
Yeah, I'm thinking of an additional hws that's per-context and
userspace-mappable. It could come in handy for userspace-only sync stuff.
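Something like the below is what I'm picturing on the client side.
Purely a sketch: the per-context page, its offset, and the helper are
all hypothetical, and I'm writing the gen4-7 4-dword MI_STORE_DWORD_IMM
encoding from memory (gen8+ uses the second dword for the upper address
bits instead of MBZ):

#include <stdint.h>

/* header | (dword length - 2); the address would resolve through the
 * context's ppgtt, so nothing privileged gets touched */
#define MI_STORE_DWORD_IMM_GEN4 ((0x20 << 23) | 2)

static uint32_t *
emit_sync_write(uint32_t *cs, uint32_t ctx_page_offset, uint32_t value)
{
        *cs++ = MI_STORE_DWORD_IMM_GEN4;
        *cs++ = 0;               /* MBZ on gen6/7, upper address bits on gen8+ */
        *cs++ = ctx_page_offset; /* dword slot in the hypothetical per-context page */
        *cs++ = value;           /* cookie userspace polls for later */
        return cs;
}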
>
> > I haven't tried MI_USER_INTERRUPT from a batch; if we can't do it from
> > a non-privileged batch, that nixes one of the other neat features we
> > could have (fine-grained intra-batch userspace synchronization).
>
> I don't understand how writing the operation into the batch is
> beneficial vs writing into the ring, unless you intended to use
> something more fine-grained than the batch seqno. You want to get
> interrupts from inside batches? Rather than continue the existing scheme
> of splitting up batches between fences?
Yeah, the whole idea here was to avoid flushing batches in order to
emit fences, both to avoid overhead and to give userspace more rope.
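Roughly, within a single batch it would look like this (again just a
hypothetical sketch, reusing the emit_sync_write() helper from the
earlier example; the actual 3D commands are elided):

#define MI_BATCH_BUFFER_END (0x0a << 23)

static uint32_t *
emit_two_pass_batch(uint32_t *cs, uint32_t ctx_page_offset)
{
        /* ... 3D commands for the first render pass ... */
        cs = emit_sync_write(cs, ctx_page_offset, 1); /* fence point 1 */

        /* ... 3D commands for the second render pass ... */
        cs = emit_sync_write(cs, ctx_page_offset, 2); /* fence point 2 */

        *cs++ = MI_BATCH_BUFFER_END;
        return cs;
}

No flush or batch split between the two fence points; the kernel only
ever sees one execbuf.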
>
> I definitely think we should think twice before allowing userspace to
> arbitrarily generate interrupts.
>
> > > > +        return 0;
> > > > +}
> > > >
> > > >  static int
> > > >  relocate_entry_cpu(struct drm_i915_gem_object *obj,
> > > > @@ -349,7 +411,8 @@ relocate_entry_gtt(struct drm_i915_gem_object *obj,
> > > >  static int
> > > >  i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> > > >                                     struct eb_vmas *eb,
> > > > -                                   struct drm_i915_gem_relocation_entry *reloc)
> > > > +                                   struct drm_i915_gem_relocation_entry *reloc,
> > > > +                                   struct intel_context *ctx)
> > >
> > > Hmm. That's a nuisance. But no, you only use it to automatically create
> > > a fence, not to patch the batch, so you can just use an object-flag.
> > >
> > > This fits neatly into requests.
> >
> > Most definitely. What do you think of the potential upside in the DDX
> > for this, assuming we get dword writes from batches working?
>
> Negative. You now have relocation overhead, you still need to split
> batches to keep the gpu busy, do ring switches, and switch contexts
> between clients, so I don't feel a need for fences from inside a batch.
>
> Getting a seqno and a hws in the client would be nice, but if it continues
> to require kernel polling, no thanks; I'll just stick to approximately
> tracking the active state of surfaces, using the heavier accurate queries
> sparingly.
>
> About the only thing I could see as being useful is that it would allow
> you to reuse a batch buffer multiple times, rather than overallocate a
> whole page and keep a pool of such pages.
>
> Am I missing something?
No, I think that's about right. The need for reloc processing is a
definite downside to this approach, but that could be solved with a new
interface, or by just allowing userspace to map/manage a hws. The
downside there is that the resulting fences wouldn't be shareable. But
requiring a flush for that is probably fine.
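With a mapped page the userspace-only wait would just be a polling
read, e.g. (sketch only, assuming the slot layout from the earlier
example; sharing across processes would still have to bounce through
the kernel):

#include <stdint.h>
#include <sched.h>

static void
wait_for_sync(volatile uint32_t *slot, uint32_t value)
{
        /* seqno-style comparison so wraparound is tolerated */
        while ((int32_t)(*slot - value) < 0)
                sched_yield(); /* busy-ish wait; ok for short intra-frame syncs */
}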
--
Jesse Barnes, Intel Open Source Technology Center