[Intel-gfx] [PATCH 17/32] drm/i915: Remove the lazy_coherency parameter from request-completed?

Mon Jan 4 09:28:32 PST 2016

On 04/01/16 14:20, Chris Wilson wrote:
> On Mon, Jan 04, 2016 at 02:09:53PM +0000, Dave Gordon wrote:
>> On 04/01/16 13:02, Dave Gordon wrote:
>>> On 04/01/16 11:26, Chris Wilson wrote:
>>>> On Mon, Jan 04, 2016 at 11:16:04AM +0000, Dave Gordon wrote:
>>>>> On 14/12/15 15:11, Chris Wilson wrote:
>>>>>> On Mon, Dec 14, 2015 at 02:59:30PM +0000, Tvrtko Ursulin wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 11/12/15 11:33, Chris Wilson wrote:
>>>>>>>> Now that we have split out the seqno-barrier from the
>>>>>>>> engine->get_seqno() callback itself, we can move the users of the
>>>>>>>> seqno-barrier to the required callsites simplifying the common
>>>>>>>> code and
>>>>>>>> making the required workaround handling much more explicit.
>>>>>>>
>>>>>>> What bothers me about this patch, and the one preceding it, is that
>>>>>>> I don't see a tangible improvement for the programmer who still has
>>>>>>> to know when to read the seqno and when to "read it harder, read for
>>>>>>> real".
>>>>>>
>>>>>> In earlier patches, I called it irq_barrier.
>>>>>>
>>>>>> It's not reading it harder. It's just that there is a ordering issue
>>>>>> with receiving an interrupt and the seqno write being visible.
>>>>>>
>>>>>>> Barrier in this sense has a relation to the state of things but
>>>>>>> somehow feels too low level to me when used from the code. But to be
>>>>>>> fair I am not sure how to better define it.
>>>>>>>
>>>>>>> Would ring->get_seqno paired with ring->read_seqno perhaps make
>>>>>>> sense? Implementation for ring->read_seqno would just be a flush
>>>>>>> followed by ring->get_seqno then. Or maybe keep the barrier and add
>>>>>>> ring->read_seqno which would be ring->seqno_barrier +
>>>>>>> ring_get_seqno?
>>>>>>
>>>>>> No.
>>>>>> -Chris
>>>>>
>>>>> We could instead put the knowledge about whether and how to read
>>>>> "for real" inside the read-the-seqno function. For example:
>>>>
>>>> You do appreciate the irony that you are on the reviewer list for patches
>>>> that do that?
>>>>
>>>> http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=breadcrumbs&id=34409f2d965001d7d63f21a1c5339b07eed6af34
>>>
>>> No, I haven't got as far as that one, since it was posted over a week
>>> after the message at the head of this thread. Anyway, I still can't see
>>> in that patch anything equivalent to what I described above.
>>
>> Oh, I spotted what you meant, but it's not in /that/ patch
>> (which was a version of PATCH 15/32 "Slaughter the thundering
>> i915_wait_request herd") it's in PATCH 19/32 "Check the CPU
>> cached value of seqno after waking the waiter".
>>
>> Even so, it's not at the same level of code structure; I was
>> suggesting pushing it all the way down, because
>> __i915_wait_request() and/or i915_gem_request_completed() aren't the
>> only functions that use it.
>
> No. I am arguing that there should be precisely one piece of code
> responsible for seqno-vs-interrupt ordering. Everywhere else should not
> have to worry about that interrupts may be asserted before the HWS write
> is posted.
> -Chris

That's what /I/ said. So it should be inside the /lowest-level/ 
function, the one that hands back a h/w sequence number to other code 
that doesn't care how it was obtained; in other words, either 
intel_ring_get_seqno() or a wrapper round it that incorporates the 
check-and-flush, if you still want the option of looking only at the 
cached copy.

Whereas you put the read-barrier-read logic in __i915_wait_request() but 
that's too high a level, and I don't think that's the only caller where 
it could be advantageous to conditionally update the cached seqno from 
the HWSP; and the fact of reading the HWSP is hidden inside the call to 
i915_gem_request(), which is working at an entirely different level of 
abstraction.

.Dave.