[Intel-gfx] [PATCH 17/32] drm/i915: Remove the lazy_coherency parameter from request-completed?
david.s.gordon at intel.com
Mon Jan 4 05:02:25 PST 2016
On 04/01/16 11:26, Chris Wilson wrote:
> On Mon, Jan 04, 2016 at 11:16:04AM +0000, Dave Gordon wrote:
>> On 14/12/15 15:11, Chris Wilson wrote:
>>> On Mon, Dec 14, 2015 at 02:59:30PM +0000, Tvrtko Ursulin wrote:
>>>> On 11/12/15 11:33, Chris Wilson wrote:
>>>>> Now that we have split out the seqno-barrier from the
>>>>> engine->get_seqno() callback itself, we can move the users of the
>>>>> seqno-barrier to the required callsites simplifying the common code and
>>>>> making the required workaround handling much more explicit.
>>>> What bothers me about this patch, and the one preceding it, is that
>>>> I don't see a tangible improvement for the programmer who still has
>>>> to know when to read the seqno and when to "read it harder, read for
>>> In earlier patches, I called it irq_barrier.
>>> It's not reading it harder. It's just that there is a ordering issue
>>> with receiving an interrupt and the seqno write being visible.
>>>> Barrier in this sense has a relation to the state of things but
>>>> somehow feels too low level to me when used from the code. But to be
>>>> fair I am not sure how to better define it.
>>>> Would ring->get_seqno paired with ring->read_seqno perhaps make
>>>> sense? Implementation for ring->read_seqno would just be a flush
>>>> followed by ring->get_seqno then. Or maybe keep the barrier and add
>>>> ring->read_seqno which would be ring->seqno_barrier +
>> We could instead put the knowledge about whether and how to read
>> "for real" inside the read-the-seqno function. For example:
> You do appreciate the irony that you are on the reviewer list for patches
> that do that?
No, I haven't got as far as that one, since it was posted over a week
after the message at the head of this thread. Anyway, I still can't see
in that patch anything equivalent to what I described above.
> There is just one place that we need the extra work, after the
> interrupt. All other places only care about the current value of the
> seqno in the CPU cache.
So, we could instead have a per-engine flag which says, interrupt has
happened, next reader should expect an update (and try hard to see one)?
Flow would be:
sets flag, wakes first waiter, clears IRQs
waiter wakes up
reads seqno (read fn sees flag => coherent, clear flag if new)
checks whether (only) its request is completed
not complete: enable IRQ, sleep
seqno match: dequeue request, enable IRQ, process completion
match+: dequeue request, wake next, process completion
where 'process completion' involves updating the request state and
waking all /additional/ waiters on the same request, whereas 'wake next'
refers to threads waiting on the /next/ request.
Is that what your patch is trying to achieve? It's a bit hard to tell
with the seqno-read-optimisation and irq-dispatch changes mixed in with
the r-b-tree and all the other things in this sequence.
I think it would be easier to understand if some of the more obvious
improvements (such as 18/32, 20/32, 24/32) were pushed in earlier, so
that the code is as clear as possible before the patches that actually
change the way things work are applied. And all the reset-related
patches could be later, as that's an area with some subtlety to it.
More information about the Intel-gfx