[Intel-gfx] [PATCH 2/2] drm/i915: Rework order of operations in {__intel, logical}_ring_prepare()

Fri Jun 12 11:54:17 PDT 2015

On 12/06/15 19:05, Chris Wilson wrote:
> On Fri, Jun 12, 2015 at 06:09:08PM +0100, Dave Gordon wrote:
>> The original idea of preallocating the OLR was implemented in
>>
>>> 9d773091 drm/i915: Preallocate next seqno before touching the ring
>>
>> and the sequence of operations was to allocate the OLR, then wrap past
>> the end of the ring if necessary, then wait for space if necessary.
>> But subsequently intel_ring_begin() was refactored, in
>>
>>> 304d695 drm/i915: Flush outstanding requests before allocating new seqno
>>
>> to ensure that pending work that might need to be flushed used the old
>> and not the newly-allocated request. This changed the sequence to wrap
>> and/or wait, then allocate, although the comment still said
>> 	/* Preallocate the olr before touching the ring */
>> which was no longer true as intel_wrap_ring_buffer() touches the ring.
>>
>> However, with the introduction of dynamic pinning, in
>>
>>> 7ba717c drm/i915/bdw: Pin the ringbuffer backing object to GGTT on-demand
>>
>> came the possibility that the ringbuffer might not be pinned to the GTT
>> or mapped into CPU address space when intel_ring_begin() is called. It
>> gets pinned when the request is allocated, so it's now important that
>> this comes *before* anything that can write into the ringbuffer, in this
>> case intel_wrap_ring_buffer(), as this will fault if (a) the ringbuffer
>> happens not to be mapped, and (b) tail happens to be sufficiently close
>> to the end of the ring to trigger wrapping.
> 
> On the other hand, the request allocation can itself write into the
> ring. This is not the right fix, that is the elimination of olr itself
> and passing the request into intel_ring_begin. That way we can explicit
> in our ordering into ring access.
> -Chris

AFAICS, request allocation can write into the ring only if it actually
has to flush some *pre-existing* OLR. [Aside: it can actually trigger
writing into a completely different ringbuffer, but not the one we're
handling here!] The worst-case sequence is:

	i915_gem_request_alloc		finds there's no OLR
	  i915_gem_get_seqno		  finds the seqno is 0
	    i915_gem_init_seqno		    for_eash_ring do ...
	      intel_ring_idle		      but no OLR, so OK

It only works because i915_gem_request_alloc() allocates the request
early but doesn't store it in the OLR until the end.

OTOH I agree that the long-term answer is the elimination of the OLR;
this is really something of a stopgap until John H's Anti-OLR patchset
is merged.

Although, the simplification of the wait-wrap/wait-space sequence is
probably worthwhile in its own right, so if Anti-OLR gets merged first
we can put the rest of the changes on top of that. It's only code inside
the "if(!OLR)" section that would need to be removed.

.Dave.