[Intel-gfx] [PATCH 17/32] drm/i915: Remove the lazy_coherency parameter from request-completed?

Dave Gordon david.s.gordon at intel.com
Mon Jan 4 03:16:04 PST 2016


On 14/12/15 15:11, Chris Wilson wrote:
> On Mon, Dec 14, 2015 at 02:59:30PM +0000, Tvrtko Ursulin wrote:
>>
>> Hi,
>>
>> On 11/12/15 11:33, Chris Wilson wrote:
>>> Now that we have split out the seqno-barrier from the
>>> engine->get_seqno() callback itself, we can move the users of the
>>> seqno-barrier to the required callsites simplifying the common code and
>>> making the required workaround handling much more explicit.
>>
>> What bothers me about this patch, and the one preceding it, is that
>> I don't see a tangible improvement for the programmer who still has
>> to know when to read the seqno and when to "read it harder, read for
>> real".
>
> In earlier patches, I called it irq_barrier.
>
> It's not reading it harder. It's just that there is a ordering issue
> with receiving an interrupt and the seqno write being visible.
>
>> Barrier in this sense has a relation to the state of things but
>> somehow feels too low level to me when used from the code. But to be
>> fair I am not sure how to better define it.
>>
>> Would ring->get_seqno paired with ring->read_seqno perhaps make
>> sense? Implementation for ring->read_seqno would just be a flush
>> followed by ring->get_seqno then. Or maybe keep the barrier and add
>> ring->read_seqno which would be ring->seqno_barrier +
>> ring_get_seqno?
>
> No.
> -Chris

We could instead put the knowledge about whether and how to read "for 
real" inside the read-the-seqno function. For example:

	struct intel_engine_cs {
		...
		u32 last_seqno_seen;
	}

	u32 intel_ring_read_seqno(struct intel_engine_cs *engine) {
		// First try simple read
		u32 seqno = intel_ring_get_seqno(engine);

		if (seqno == engine->last_seqno_seen) {
			// Do additional flush, then try again
			engine->seqno_barrier(engine);
			seqno = intel_ring_get_seqno(engine);
		}

		engine->last_seqno_seen = seqno;
		return seqno;
	}

Then callers don't have to know anything about coherency; they can just 
assume that they will automatically get the latest value.

In the presumably common case where the value *has* been updated, and 
the cache has noted the update and invalidated the old local value, the 
first read will successfully find the updated seqno and return quickly.

Then we only do the extra work in the case where there would otherwise 
appear to be nothing to do (i.e. when we would sleep, or spin, or 
otherwise wait, if this function were to return the same value as last 
time).

.Dave.


More information about the Intel-gfx mailing list