[Intel-gfx] [PATCH 14/21] drm/i915: Only apply one barrier after a breadcrumb interrupt is posted

Wed Jun 8 09:57:26 UTC 2016

On 08/06/16 10:35, Chris Wilson wrote:
> On Mon, Jun 06, 2016 at 04:34:27PM +0100, Tvrtko Ursulin wrote:
>>
>> On 03/06/16 17:08, Chris Wilson wrote:
>>> If we flag the seqno as potentially stale upon receiving an interrupt,
>>> we can use that information to reduce the frequency that we apply the
>>> heavyweight coherent seqno read (i.e. if we wake up a chain of waiters).
>>>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> ---
>>>   drivers/gpu/drm/i915/i915_drv.h          | 15 ++++++++++++++-
>>>   drivers/gpu/drm/i915/i915_irq.c          |  1 +
>>>   drivers/gpu/drm/i915/intel_breadcrumbs.c | 16 ++++++++++------
>>>   drivers/gpu/drm/i915/intel_ringbuffer.h  |  1 +
>>>   4 files changed, 26 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index 4ddb9ff319cb..a71d08199d57 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -3935,7 +3935,20 @@ static inline bool __i915_request_irq_complete(struct drm_i915_gem_request *req)
>>>   	 * but it is easier and safer to do it every time the waiter
>>>   	 * is woken.
>>>   	 */
>>> -	if (engine->irq_seqno_barrier) {
>>> +	if (engine->irq_seqno_barrier && READ_ONCE(engine->irq_posted)) {
>>> +		/* The ordering of irq_posted versus applying the barrier
>>> +		 * is crucial. The clearing of the current irq_posted must
>>> +		 * be visible before we perform the barrier operation,
>>> +		 * such that if a subsequent interrupt arrives, irq_posted
>>> +		 * is reasserted and our task rewoken (which causes us to
>>> +		 * do another __i915_request_irq_complete() immediately
>>> +		 * and reapply the barrier). Conversely, if the clear
>>> +		 * occurs after the barrier, then an interrupt that arrived
>>> +		 * whilst we waited on the barrier would not trigger a
>>> +		 * barrier on the next pass, and the read may not see the
>>> +		 * seqno update.
>>> +		 */
>>> +		WRITE_ONCE(engine->irq_posted, false);
>>
>> Why is this not smp_store_mb ?
>
> We only require the ordering wrt to irq_seqno_barrier().
>
> How about:
>
> if (engine->irq_seqno_barrier &&
>      cmpxchg_relaxed(&engine->irq_post, 1, 0)) {
>
> Less shouty?

I think so.

>>> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
>>> index 44346de39794..0f5fe114c204 100644
>>> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
>>> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
>>> @@ -43,12 +43,18 @@ static void intel_breadcrumbs_fake_irq(unsigned long data)
>>>
>>>   static void irq_enable(struct intel_engine_cs *engine)
>>>   {
>>> +	/* Enabling the IRQ may miss the generation of the interrupt, but
>>> +	 * we still need to force the barrier before reading the seqno,
>>> +	 * just in case.
>>> +	 */
>>> +	engine->irq_posted = true;
>>
>> Should it be smp_store_mb here as well?
>
> No, this is written/read on the same callchain.

Ah true.

Regards,

Tvrtko