[Intel-gfx] [RFC] drm/i915: Emit to ringbuffer directly
Chris Wilson
chris at chris-wilson.co.uk
Fri Sep 9 13:40:51 UTC 2016
On Fri, Sep 09, 2016 at 09:32:50AM +0100, Tvrtko Ursulin wrote:
>
> On 08/09/16 17:40, Chris Wilson wrote:
> >On Thu, Sep 08, 2016 at 04:12:55PM +0100, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> >>
> >>This removes the usage of intel_ring_emit in favour of
> >>directly writing to the ring buffer.
> >
> >I have the same patch! But I called it out, for historical reasons.
>
> Yes I know we talked about it in the past but I did not think you
> will find time to actually write it amongst all the other things.
>
> >Oh, except mine uses out[0]...out[N] because gcc prefers that over
> >*out++ = ...
>
> It copes just fine with the latter here, for example:
>
> *rbuf++ = cmd;
> *rbuf++ = I915_GEM_HWS_SCRATCH_ADDR | MI_FLUSH_DW_USE_GTT;
> *rbuf++ = 0; /* upper addr */
> *rbuf++ = 0; /* value */
>
> Is:
>
>  3e9:  89 10                   mov    %edx,(%rax)
>  3eb:  c7 40 04 04 01 00 00    movl   $0x104,0x4(%rax)
>  3f2:  c7 40 08 00 00 00 00    movl   $0x0,0x8(%rax)
>  3f9:  c7 40 0c 00 00 00 00    movl   $0x0,0xc(%rax)
Great. The last time we discussed this, when we were looking at
constructing batchbuffers in userspace, gcc was still generating two
instructions (the store through *ptr followed by ptr++) rather than
emitting a mov to a fixed offset for each store in that sequence.
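For reference, the two styles under discussion can be sketched roughly as
below. This is only an illustration: the function names and
EX_SCRATCH_ADDR_FLAGS are hypothetical placeholders (the value 0x104 is
simply the immediate visible in the quoted disassembly), not the driver's
actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placeholder; 0x104 is the immediate seen in the quoted
 * disassembly, not a definition taken from the driver headers. */
#define EX_SCRATCH_ADDR_FLAGS 0x104u

/* Style 1: post-increment stores, as in the quoted snippet.
 * Returns the advanced write pointer. */
uint32_t *ex_emit_postinc(uint32_t *rbuf, uint32_t cmd)
{
	assert(rbuf != NULL);

	*rbuf++ = cmd;
	*rbuf++ = EX_SCRATCH_ADDR_FLAGS;
	*rbuf++ = 0; /* upper addr */
	*rbuf++ = 0; /* value */

	return rbuf;
}

/* Style 2: indexed stores at fixed offsets from the base pointer.
 * Returns the pointer just past the last dword written. */
uint32_t *ex_emit_indexed(uint32_t *out, uint32_t cmd)
{
	assert(out != NULL);

	out[0] = cmd;
	out[1] = EX_SCRATCH_ADDR_FLAGS;
	out[2] = 0; /* upper addr */
	out[3] = 0; /* value */

	return out + 4;
}
```

Both functions write the same four dwords; whether they compile to the
same stores depends on the compiler version and flags, which can be
checked by comparing the output of something like gcc -O2 -S.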
> >plus an earlier
> >
> > drivers/gpu/drm/i915/i915_gem_request.c | 26 ++---
> > drivers/gpu/drm/i915/intel_lrc.c | 121 ++++++++---------------
> > drivers/gpu/drm/i915/intel_ringbuffer.c | 168 +++++++++++---------------------
> > drivers/gpu/drm/i915/intel_ringbuffer.h | 10 +-
> > 4 files changed, 112 insertions(+), 213 deletions(-)
> >
> >since I wanted parts of it for emitting timelines.
>
> Ok what do you want to do?
I have plans to use that particular patch soon, but updating
intel_ring_begin() itself is a long way down my list. Given that you have
a patch ready, let's keep going. I'm just curious as to what I did
differently to trim off the extra lines (probably intel_ring_advance()).
The other thing I did was to relax the restriction that we only emit in
qword-aligned packets (by fixing up the tail for qword alignment when
sealing the request). Also, I would rather the function be expressed as
operating on the request; i915_gem_request_emit() was my choice.
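To make the tail fixup concrete, here is a minimal sketch of the idea,
assuming a simplified ring with a byte-offset tail and no wraparound
handling. struct ex_ring, ex_request_emit(), ex_request_seal() and EX_NOOP
are hypothetical names for illustration only; they are not the patch
itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical padding dword used to round the tail up to a qword. */
#define EX_NOOP 0u

/* Simplified stand-in for a ring: no wraparound, no space waiting. */
struct ex_ring {
	uint32_t *vaddr; /* start of the ring memory */
	uint32_t tail;   /* write offset in bytes, always dword aligned */
	uint32_t size;   /* ring size in bytes, a multiple of 8 */
};

/* Reserve num_dwords dwords (odd counts allowed) and return a pointer
 * the caller fills in directly. */
uint32_t *ex_request_emit(struct ex_ring *ring, unsigned int num_dwords)
{
	uint32_t *out;

	assert(ring->tail % sizeof(uint32_t) == 0);
	assert(ring->tail + num_dwords * sizeof(uint32_t) <= ring->size);

	out = ring->vaddr + ring->tail / sizeof(uint32_t);
	ring->tail += num_dwords * sizeof(uint32_t);

	return out;
}

/* On sealing, pad with a single no-op dword if the tail is not
 * qword (8 byte) aligned. */
void ex_request_seal(struct ex_ring *ring)
{
	if (ring->tail & 7) {
		uint32_t *out = ex_request_emit(ring, 1);

		out[0] = EX_NOOP;
	}

	assert((ring->tail & 7) == 0);
}
```

With this shape, individual emitters no longer need to pad each packet to
an even number of dwords; the alignment is paid for at most once per
request.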
-Chris
--
Chris Wilson, Intel Open Source Technology Centre