[Intel-gfx] [PATCH] drm/i915: Reserve space improvements
Tomas Elf
tomas.elf at intel.com
Tue Jun 30 09:15:29 PDT 2015
On 30/06/2015 16:53, John Harrison wrote:
> On 30/06/2015 15:43, Tomas Elf wrote:
>> On 30/06/2015 12:40, John.C.Harrison at Intel.com wrote:
>>> From: John Harrison <John.C.Harrison at Intel.com>
>>>
>>> An earlier patch was added to reserve space in the ring buffer for the
>>> commands issued during 'add_request()'. The initial version was
>>> pessimistic in the way it handled buffer wrapping and would cause
>>> premature wraps and thus waste ring space.
>>>
>>> This patch updates the code to better handle the wrap case. It no
>>> longer enforces that the space being asked for and the reserved space
>>> are a single contiguous block. Instead, it allows the reserve to be on
>>> the far end of a wrap operation. It still guarantees that the space is
>>> available so when the wrap occurs, no wait will happen. Thus the wrap
>>> cannot fail which is the whole point of the exercise.
>>>
>>> Also fixed a merge failure with some comments from the original patch.
>>>
>>> v2: Incorporated suggestion by David Gordon to move the wrap code
>>> inside the prepare function and thus allow a single combined
>>> wait_for_space() call rather than doing one before the wrap and
>>> another after. This also makes the prepare code much simpler and
>>> easier to follow.
>>>
>>> v3: Fix for 'effective_size' vs 'size' during ring buffer remainder
>>> calculations (spotted by Tomas Elf).
>>>
>>> For: VIZ-5115
>>> CC: Daniel Vetter <daniel at ffwll.ch>
>>> Signed-off-by: John Harrison <John.C.Harrison at Intel.com>
>>> ---
>>> drivers/gpu/drm/i915/intel_lrc.c | 73
>>> +++++++++++++-------------
>>> drivers/gpu/drm/i915/intel_ringbuffer.c | 90
>>> +++++++++++++++++++--------------
>>> drivers/gpu/drm/i915/intel_ringbuffer.h | 4 +-
>>> 3 files changed, 90 insertions(+), 77 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c
>>> b/drivers/gpu/drm/i915/intel_lrc.c
>>> index b36e064..7d9515d 100644
>>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>>> @@ -663,12 +663,12 @@ static int logical_ring_wait_for_space(struct
>>> drm_i915_gem_request *req,
>>> unsigned space;
>>> int ret;
>>>
>>> - /* The whole point of reserving space is to not wait! */
>>> - WARN_ON(ringbuf->reserved_in_use);
>>> -
>>> if (intel_ring_space(ringbuf) >= bytes)
>>> return 0;
>>>
>>> + /* The whole point of reserving space is to not wait! */
>>> + WARN_ON(ringbuf->reserved_in_use);
>>> +
>>> list_for_each_entry(target, &ring->request_list, list) {
>>> /*
>>> * The request queue is per-engine, so can contain requests
>>> @@ -718,22 +718,11 @@ intel_logical_ring_advance_and_submit(struct
>>> drm_i915_gem_request *request)
>>> execlists_context_queue(request);
>>> }
>>>
>>> -static int logical_ring_wrap_buffer(struct drm_i915_gem_request *req)
>>> +static void __wrap_ring_buffer(struct intel_ringbuffer *ringbuf)
>>> {
>>> - struct intel_ringbuffer *ringbuf = req->ringbuf;
>>> uint32_t __iomem *virt;
>>> int rem = ringbuf->size - ringbuf->tail;
>>>
>>> - /* Can't wrap if space has already been reserved! */
>>> - WARN_ON(ringbuf->reserved_in_use);
>>> -
>>> - if (ringbuf->space < rem) {
>>> - int ret = logical_ring_wait_for_space(req, rem);
>>> -
>>> - if (ret)
>>> - return ret;
>>> - }
>>> -
>>> virt = ringbuf->virtual_start + ringbuf->tail;
>>> rem /= 4;
>>> while (rem--)
>>> @@ -741,40 +730,50 @@ static int logical_ring_wrap_buffer(struct
>>> drm_i915_gem_request *req)
>>>
>>> ringbuf->tail = 0;
>>> intel_ring_update_space(ringbuf);
>>> -
>>> - return 0;
>>> }
>>>
>>> static int logical_ring_prepare(struct drm_i915_gem_request *req,
>>> int bytes)
>>> {
>>> struct intel_ringbuffer *ringbuf = req->ringbuf;
>>> - int ret;
>>> -
>>> - /*
>>> - * Add on the reserved size to the request to make sure that after
>>> - * the intended commands have been emitted, there is guaranteed to
>>> - * still be enough free space to send them to the hardware.
>>> - */
>>> - if (!ringbuf->reserved_in_use)
>>> - bytes += ringbuf->reserved_size;
>>> -
>>> - if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
>>> - ret = logical_ring_wrap_buffer(req);
>>> - if (unlikely(ret))
>>> - return ret;
>>> + int remain_usable = ringbuf->effective_size - ringbuf->tail;
>>> + int remain_actual = ringbuf->size - ringbuf->tail;
>>
>> You could add a comment here (and in the legacy implementation)
>> explaining why we make the distinction between remain_usable and
>> remain_actual. Or you could add the comment when we actually use
>> remain_actual further down in the function. It's up to you.
>>
>> We first need to be pessimistic about how much space is left in the
>> ring buffer when determining the need for wrapping - therefore we use
>> remain_usable, which disregards the end-of-buffer padding - and then
>> we need to be pessimistic about how much ring space we need to wait
>> for - therefore we use remain_actual, which takes the end-of-buffer
>> padding into consideration to make sure that we're actually not
>> waiting for too little space.
>
> It's not about being pessimistic or optimistic. That implies there is
> some choice, that one version is a little bit better in one situation
> but either would do. Whereas this is about functional correctness. The
> difference between the actual ring size and the usable ring size is not
> some reserved space thing to make something go faster. This is about the
> hardware locking up if the entire buffer is used. I think 'usable' and
> 'actual' are fairly obvious names. If you want to know the details about
> why there is an 'effective_size' in the first place then look up
> 'effective_size' in the code and read the comment about i830 hangs.
You're probably right. Let's just forget about it.
>
>> If you add those comments maybe you could also rename the variables to
>> something like "remaining_space_usable" etc. since "remain_usable" is
>> more of a verb than a noun. But that's seriously nitpicking.
> Or maybe
> 'remaining_space_that_is_usable_without_causing_an_i830_to_hang_because_it_skips_the_last_two_cachelines'?
> There is such a thing as being too verbose and making the code hard to
> read.
That is indeed a very long variable name. It's 104 characters long so
that wouldn't clear checkpatch.pl :). My nitpicky suggestion was 9
characters longer. And, yeah, it's possible to get too verbose. This
driver is not even remotely close to that point as it stands today :).
Thanks,
Tomas
>
>
>> If you at least add the comments to make it crystal clear why we do it
>> this way then I'm fine.
>
> I don't see what comment could be added that would actually make things
> clearer without being a long and unnecessary description of the i830
> issue. There are two variables declared on consecutive lines that cache
> pretty simple calculations and have fairly clear names. It looks quite
> self explanatory to me! The code then tests to see if it can use the
> usable space, if not then it wraps around the actual buffer size. Again,
> seems pretty obvious as to what is going on and why - you can only use
> the usable bit, but when traversing the whole buffer you pretty
> obviously need to traverse the whole buffer not just the part that may
> or may not have been used.
>
>
>>
>> Reviewed-by: Tomas Elf <tomas.elf at intel.com>
>>
>> Thanks,
>> Tomas
>>
>>> + int ret, total_bytes, wait_bytes = 0;
>>> + bool need_wrap = false;
>>>
>>> - if(ringbuf->reserved_size) {
>>> - uint32_t size = ringbuf->reserved_size;
>>> + if (ringbuf->reserved_in_use)
>>> + total_bytes = bytes;
>>> + else
>>> + total_bytes = bytes + ringbuf->reserved_size;
>>>
>>> - intel_ring_reserved_space_cancel(ringbuf);
>>> - intel_ring_reserved_space_reserve(ringbuf, size);
>>> + if (unlikely(bytes > remain_usable)) {
>>> + /*
>>> + * Not enough space for the basic request. So need to flush
>>> + * out the remainder and then wait for base + reserved.
>>> + */
>>> + wait_bytes = remain_actual + total_bytes;
>>> + need_wrap = true;
>>> + } else {
>>> + if (unlikely(total_bytes > remain_usable)) {
>>> + /*
>>> + * The base request will fit but the reserved space
>>> + * falls off the end. So only need to to wait for the
>>> + * reserved size after flushing out the remainder.
>>> + */
>>> + wait_bytes = remain_actual + ringbuf->reserved_size;
>>> + need_wrap = true;
>>> + } else if (total_bytes > ringbuf->space) {
>>> + /* No wrapping required, just waiting. */
>>> + wait_bytes = total_bytes;
>>> }
>>> }
>>>
>>> - if (unlikely(ringbuf->space < bytes)) {
>>> - ret = logical_ring_wait_for_space(req, bytes);
>>> + if (wait_bytes) {
>>> + ret = logical_ring_wait_for_space(req, wait_bytes);
>>> if (unlikely(ret))
>>> return ret;
>>> +
>>> + if (need_wrap)
>>> + __wrap_ring_buffer(ringbuf);
>>> }
>>>
>>> return 0;
>>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> b/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> index af7c12e..e39c891 100644
>>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
>>> @@ -2121,12 +2121,12 @@ static int ring_wait_for_space(struct
>>> intel_engine_cs *ring, int n)
>>> unsigned space;
>>> int ret;
>>>
>>> - /* The whole point of reserving space is to not wait! */
>>> - WARN_ON(ringbuf->reserved_in_use);
>>> -
>>> if (intel_ring_space(ringbuf) >= n)
>>> return 0;
>>>
>>> + /* The whole point of reserving space is to not wait! */
>>> + WARN_ON(ringbuf->reserved_in_use);
>>> +
>>> list_for_each_entry(request, &ring->request_list, list) {
>>> space = __intel_ring_space(request->postfix, ringbuf->tail,
>>> ringbuf->size);
>>> @@ -2145,21 +2145,11 @@ static int ring_wait_for_space(struct
>>> intel_engine_cs *ring, int n)
>>> return 0;
>>> }
>>>
>>> -static int intel_wrap_ring_buffer(struct intel_engine_cs *ring)
>>> +static void __wrap_ring_buffer(struct intel_ringbuffer *ringbuf)
>>> {
>>> uint32_t __iomem *virt;
>>> - struct intel_ringbuffer *ringbuf = ring->buffer;
>>> int rem = ringbuf->size - ringbuf->tail;
>>>
>>> - /* Can't wrap if space has already been reserved! */
>>> - WARN_ON(ringbuf->reserved_in_use);
>>> -
>>> - if (ringbuf->space < rem) {
>>> - int ret = ring_wait_for_space(ring, rem);
>>> - if (ret)
>>> - return ret;
>>> - }
>>> -
>>> virt = ringbuf->virtual_start + ringbuf->tail;
>>> rem /= 4;
>>> while (rem--)
>>> @@ -2167,8 +2157,6 @@ static int intel_wrap_ring_buffer(struct
>>> intel_engine_cs *ring)
>>>
>>> ringbuf->tail = 0;
>>> intel_ring_update_space(ringbuf);
>>> -
>>> - return 0;
>>> }
>>>
>>> int intel_ring_idle(struct intel_engine_cs *ring)
>>> @@ -2238,9 +2226,21 @@ void intel_ring_reserved_space_use(struct
>>> intel_ringbuffer *ringbuf)
>>> void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf)
>>> {
>>> WARN_ON(!ringbuf->reserved_in_use);
>>> - WARN(ringbuf->tail > ringbuf->reserved_tail +
>>> ringbuf->reserved_size,
>>> - "request reserved size too small: %d vs %d!\n",
>>> - ringbuf->tail - ringbuf->reserved_tail,
>>> ringbuf->reserved_size);
>>> + if (ringbuf->tail > ringbuf->reserved_tail) {
>>> + WARN(ringbuf->tail > ringbuf->reserved_tail +
>>> ringbuf->reserved_size,
>>> + "request reserved size too small: %d vs %d!\n",
>>> + ringbuf->tail - ringbuf->reserved_tail,
>>> ringbuf->reserved_size);
>>> + } else {
>>> + /*
>>> + * The ring was wrapped while the reserved space was in use.
>>> + * That means that some unknown amount of the ring tail was
>>> + * no-op filled and skipped. Thus simply adding the ring size
>>> + * to the tail and doing the above space check will not work.
>>> + * Rather than attempt to track how much tail was skipped,
>>> + * it is much simpler to say that also skipping the sanity
>>> + * check every once in a while is not a big issue.
>>> + */
>>> + }
>>>
>>> ringbuf->reserved_size = 0;
>>> ringbuf->reserved_in_use = false;
>>> @@ -2249,33 +2249,45 @@ void intel_ring_reserved_space_end(struct
>>> intel_ringbuffer *ringbuf)
>>> static int __intel_ring_prepare(struct intel_engine_cs *ring, int
>>> bytes)
>>> {
>>> struct intel_ringbuffer *ringbuf = ring->buffer;
>>> - int ret;
>>> -
>>> - /*
>>> - * Add on the reserved size to the request to make sure that after
>>> - * the intended commands have been emitted, there is guaranteed to
>>> - * still be enough free space to send them to the hardware.
>>> - */
>>> - if (!ringbuf->reserved_in_use)
>>> - bytes += ringbuf->reserved_size;
>>> -
>>> - if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
>>> - ret = intel_wrap_ring_buffer(ring);
>>> - if (unlikely(ret))
>>> - return ret;
>>> + int remain_usable = ringbuf->effective_size - ringbuf->tail;
>>> + int remain_actual = ringbuf->size - ringbuf->tail;
>>> + int ret, total_bytes, wait_bytes = 0;
>>> + bool need_wrap = false;
>>>
>>> - if(ringbuf->reserved_size) {
>>> - uint32_t size = ringbuf->reserved_size;
>>> + if (ringbuf->reserved_in_use)
>>> + total_bytes = bytes;
>>> + else
>>> + total_bytes = bytes + ringbuf->reserved_size;
>>>
>>> - intel_ring_reserved_space_cancel(ringbuf);
>>> - intel_ring_reserved_space_reserve(ringbuf, size);
>>> + if (unlikely(bytes > remain_usable)) {
>>> + /*
>>> + * Not enough space for the basic request. So need to flush
>>> + * out the remainder and then wait for base + reserved.
>>> + */
>>> + wait_bytes = remain_actual + total_bytes;
>>> + need_wrap = true;
>>> + } else {
>>> + if (unlikely(total_bytes > remain_usable)) {
>>> + /*
>>> + * The base request will fit but the reserved space
>>> + * falls off the end. So only need to to wait for the
>>> + * reserved size after flushing out the remainder.
>>> + */
>>> + wait_bytes = remain_actual + ringbuf->reserved_size;
>>> + need_wrap = true;
>>> + } else if (total_bytes > ringbuf->space) {
>>> + /* No wrapping required, just waiting. */
>>> + wait_bytes = total_bytes;
>>> }
>>> }
>>>
>>> - if (unlikely(ringbuf->space < bytes)) {
>>> - ret = ring_wait_for_space(ring, bytes);
>>> + if (wait_bytes) {
>>> + ret = ring_wait_for_space(ring, wait_bytes);
>>> if (unlikely(ret))
>>> return ret;
>>> +
>>> + if (need_wrap)
>>> + __wrap_ring_buffer(ringbuf);
>>> }
>>>
>>> return 0;
>>> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> index 0e2bbc6..304cac4 100644
>>> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
>>> @@ -473,7 +473,6 @@ static inline u32 intel_ring_get_tail(struct
>>> intel_ringbuffer *ringbuf)
>>> * will always have sufficient room to do its stuff. The request
>>> creation
>>> * code calls this automatically.
>>> */
>>> -int intel_ring_reserve_space(struct drm_i915_gem_request *request);
>>> void intel_ring_reserved_space_reserve(struct intel_ringbuffer
>>> *ringbuf, int size);
>>> /* Cancel the reservation, e.g. because the request is being
>>> discarded. */
>>> void intel_ring_reserved_space_cancel(struct intel_ringbuffer
>>> *ringbuf);
>>> @@ -482,4 +481,7 @@ void intel_ring_reserved_space_use(struct
>>> intel_ringbuffer *ringbuf);
>>> /* Finish with the reserved space - for use by i915_add_request()
>>> only. */
>>> void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
>>>
>>> +/* Legacy ringbuffer specific portion of reservation code: */
>>> +int intel_ring_reserve_space(struct drm_i915_gem_request *request);
>>> +
>>> #endif /* _INTEL_RINGBUFFER_H_ */
>>>
>>
>
More information about the Intel-gfx
mailing list