[Intel-gfx] [PATCH 51/59] drm/i915: Add *_ring_begin() to request allocation

Mon Mar 23 02:10:30 PDT 2015

On Fri, Mar 20, 2015 at 04:09:55PM +0000, John Harrison wrote:
> On 20/03/2015 15:30, Chris Wilson wrote:
> >On Fri, Mar 20, 2015 at 04:23:45PM +0100, Daniel Vetter wrote:
> >>On Thu, Mar 19, 2015 at 12:30:56PM +0000, John.C.Harrison at Intel.com wrote:
> >>>diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >>>index 6f198df..c7dcabd 100644
> >>>--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> >>>+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> >>>@@ -2205,21 +2205,27 @@ int intel_ring_alloc_request_extras(struct drm_i915_gem_request *request)
> >>>  	return 0;
> >>>  }
> >>>-int intel_ring_reserved_space_reserve(struct intel_ringbuffer *ringbuf, int size)
> >>>+int legacy_ring_reserved_space_reserve(struct drm_i915_gem_request *request)
> >>>  {
> >>>-	/* NB: Until request management is fully tidied up and the OLR is
> >>>-	 * removed, there are too many ways for get false hits on this
> >>>-	 * anti-recursion check! */
> >>>-	/*WARN_ON(ringbuf->reserved_size);*/
> >>>+	/*
> >>>+	 * The first call merely notes the reserve request and is common for
> >>>+	 * all back ends. The subsequent localised _begin() call actually
> >>>+	 * ensures that the reservation is available. Without the begin, if
> >>>+	 * the request creator immediately submitted the request without
> >>>+	 * adding any commands to it then there might not actually be
> >>>+	 * sufficient room for the submission commands.
> >>>+	 */
> >>>+	intel_ring_reserved_space_reserve(request->ringbuf, MIN_SPACE_FOR_ADD_REQUEST);
> >>>+
> >>>+	return intel_ring_begin(request, 0);
> >>This feels a bit convoluted tbh, and would fall aparat if we start adding
> >>sanity checks to _begin/advance functions. Can't we instead directly call
> >>ring_wait_for_space? This forgoes the intel_wrap_ring_buffer call, but
> >>otoh we just need to factor that into our estimates. Wrapping the ring for
> >>the entire reservartion right away is
> >>a) way too much - we only wrap individual ring_being calls anyway
> >>b) not doing any good since all the intermediate intel_ring_emit calls
> >>might very-well push us into a wrap anyway.
> >>
> >>In the end we just need to increase our reservation with the biggest
> >>intel_ring_begin we have in the add_request code - that's the worst-case
> >>of ring space we might waste due to wrapping.
>
> I don't see why sanity checks in begin/advance would be a problem. This
> _begin() call is only required in the situation where no-one actually adds
> any commands to the ring before calling i915_add_request() and then is only
> necessary because i915_add_request() is not allowed to call _begin() any
> more because it is not allowed to fail. There is no magic going on behind
> the scenes. It is exactly the same as calling '_begin(RESERVED_SIZE)'. All
> subsequent code must still do it's own _begin() call before adding commands
> to the ring. Any sanity checks would ignore the reserved size and complain
> if a _begin() is too small for the commands actually written. It should all
> just work :). Plus it is nice and simple and easy to maintain.
> 
> Also, I don't see why it is better to do the wrap as late as possible. If a
> wrap is going to be required at some point then it doesn't really matter
> when it is done - early or late, the (small) hit still has to be taken.
> Given that the buffer size is orders of magnitude larger than the size of an
> individual request, there is no possibility of ever having to wrap twice. So
> I would say that keeping the code nice and simple outweighs any advantage to
> wrapping a few calls later.

Ok two examples:

1. We have 100 bytes add request tail that we reserve, but only 80 bytes
left before wrapping. Your reservation code will wrap.

Let's also assume we have 160 bytes of execbuf stuff in 20bytes block. If
we would not have wrapped right away we'd save those 80 bytes of wrapping.
Eager wrapping just wasted space.

2. Again 100 bytes of reservation, 160 bytes of execbuf. But this time
around assume we have 200 bytes left before wrap. Your reservation
doesn't wrap since there's still 100 bytes left. But after the execbuf
there's only 40 bytes left, requiring a wrap. Eager wrapping didn't help
at all for correct reservation. Which means we _have_ to include the
worst-case loss for wrapping into our reservation, and more important the
reservation check _must_ handle wrapping correctly. Since that's the case
where things will fall appart if the reservation is just a bit too small.

Hence my recommendation to only reserve the space and not bother with
wrapping (it's a net loss) and also make sure we catch any kind of
overflow due to wrapping.

We could instead try to keep wrapping into account for the reserved space,
but I think that's more fragile. It's also more wasteful since generally
we don't need a 100 byte contiguous block but just a few dwords for each
individual command. So effective wrapping overhead is a lot less than
100-4.

Finally imo shrugging the double-wrap off isn't a good idea since emitting
a lot of crap by accident is a very likely bug. And with the guc
scheduling mode I've heard that the rings will be just 4k ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch