[Intel-gfx] [PATCH 17/49] drm/i915: Implement inter-engine read-read optimisations

Mon Mar 30 07:45:04 PDT 2015

On 03/30/2015 03:09 PM, Chris Wilson wrote:
> On Mon, Mar 30, 2015 at 02:52:26PM +0100, Tvrtko Ursulin wrote:
>>> +static void
>>> +__i915_gem_request_retire__upto(struct drm_i915_gem_request *rq)
>>
>> It is a bit annoying (for readability) that it can be rq, req and request.
>
> Nonsense they are all rq and struct i915_request. Or once have been and
> so will again. /prophecy

rq is least spread in the codebase, and even the worst option of the 
three since it sounds like a queue of some sort.

>>> +err:
>>> +	for (i = 0; i < n; i++) {
>>> +		if (ret == 0) {
>>> +			int ring = requests[i]->ring->id;
>>> +			if (obj->last_read_req[ring] == requests[i])
>>> +				i915_gem_object_retire__read(obj, ring);
>>> +			if (obj->last_write_req == requests[i])
>>> +				i915_gem_object_retire__write(obj);
>>
>> Above four lines seem to be identical functionality to similar four
>> in __i915_gem_object_sync.
>
> Yes. Extracting them ended up looking worse (imo).

It would be a single function call taking object and request, how can it 
be worse? Should be more readable with a good name.

>> Also, _retire__read will do _retire__write if there is one on the
>> same ring. And here by definition they are since it is the same
>> request, no?
>
> No. It's subtle but here is the bug I pointed out from before. Once we
> drop the lock, we no longer can make assumptions about the state of obj.

You mean request might have disappeared from last_read_req but is still 
on last_write_req? But how, since if it was retired that shouldn't be 
possible.

>>> @@ -2698,16 +2747,13 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
>>>   		if (!i915_gem_request_completed(request, true))
>>>   			break;
>>>
>>> -		trace_i915_gem_request_retire(request);
>>> -
>>>   		/* We know the GPU must have read the request to have
>>>   		 * sent us the seqno + interrupt, so use the position
>>>   		 * of tail of the request to update the last known position
>>>   		 * of the GPU head.
>>>   		 */
>>>   		request->ringbuf->last_retired_head = request->postfix;
>>> -
>>> -		i915_gem_free_request(request);
>>> +		i915_gem_request_retire(request);
>>>   	}
>>
>> This loop could also use __i915_gem_request_retire__upto if it found
>> the first completed request first. Not sure how much code would that
>> save but would maube be more readable, a little bit more self
>> documenting.
>
> Actually this loop here should be pushed back to the engine (as part of
> later patches). After that transformation, using i915_gem_request_retire()
> is even clearer. But _retire__upto does become the main way in which we
> retire requests (having killed off retire_requests_ring in favour of
> explict wait/poll+retire).

That sounds good.

>> So far it all looks reasonable to me, but apart from the comments
>> above, I want to do another pass anyway.
>
> There's a few more changes afoot as well (minor ones concerning
> retire__upto and unexporting retire_requests_rig).

Ok.

Regards,

Tvrtko