[Intel-gfx] [PATCH 05/20] drm/i915: Support for GuC interrupts

Fri Aug 12 15:05:09 UTC 2016

On 12/08/16 15:31, Goel, Akash wrote:
> On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote:
>>>>> +static void gen9_guc2host_events_work(struct work_struct *work)
>>>>> +{
>>>>> +    struct drm_i915_private *dev_priv =
>>>>> +        container_of(work, struct drm_i915_private, guc.events_work);
>>>>> +
>>>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>>>> +    /* Speed up work cancellation during disabling guc interrupts. */
>>>>> +    if (!dev_priv->guc.interrupts_enabled) {
>>>>> +        spin_unlock_irq(&dev_priv->irq_lock);
>>>>> +        return;
>>>>
>>>> I suppose locking for early exit is something about ensuring the worker
>>>> sees the update to dev_priv->guc.interrupts_enabled done on another
>>>> CPU?
>>>
>>> Yes locking (providing implicit barrier) will ensure that update made
>>> from another CPU is immediately visible to the worker.
>>
>> What if the disable happens after the unlock above? It would wait in
>> disable until the irq handler exits.
> Most probably it will not have to wait, as irq handler would have
> completed if work item began the execution.
> Irq handler just queues the work item, which gets scheduled later on.
>
> Using the lock is beneficial for the case where the execution of work
> item and interrupt disabling is done around the same time.

Ok maybe I am missing something.

When can the interrupt disabling happen? Will it be controlled by the 
debugfs file or is it driver load/unload and suspend/resume?

>>>>> +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
>>>>> u32 gt_iir)
>>>>> +{
>>>>> +    bool interrupts_enabled;
>>>>> +
>>>>> +    if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
>>>>> +        spin_lock(&dev_priv->irq_lock);
>>>>> +        interrupts_enabled = dev_priv->guc.interrupts_enabled;
>>>>> +        spin_unlock(&dev_priv->irq_lock);
>>>>
>>>> Not sure that taking a lock around only this read is needed.
>>>>
>>> Again same reason as above, to make sure an update made on another CPU
>>> is immediately visible to the irq handler.
>>
>> I don't get it, see above. :)
>
> Here also If interrupt disabling & ISR execution happens around the same
> time then ISR might miss the reset of 'interrupts_enabled' flag and
> queue the new work.

What if reset of interrupts_enabled happens just as the ISR releases the 
lock?

> And same applies to the case when interrupt is re-enabled, ISR might
> still see the 'interrupts_enabled' flag as false.
> It will eventually see the update though.
>
>>
>>>>> +        if (interrupts_enabled) {
>>>>> +            /* Sample the log buffer flush related bits & clear them
>>>>> +             * out now itself from the message identity register to
>>>>> +             * minimize the probability of losing a flush interrupt,
>>>>> +             * when there are back to back flush interrupts.
>>>>> +             * There can be a new flush interrupt, for different log
>>>>> +             * buffer type (like for ISR), whilst Host is handling
>>>>> +             * one (for DPC). Since same bit is used in message
>>>>> +             * register for ISR & DPC, it could happen that GuC
>>>>> +             * sets the bit for 2nd interrupt but Host clears out
>>>>> +             * the bit on handling the 1st interrupt.
>>>>> +             */
>>>>> +            u32 msg = I915_READ(SOFT_SCRATCH(15)) &
>>>>> +                    (GUC2HOST_MSG_CRASH_DUMP_POSTED |
>>>>> +                     GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>>>>> +            if (msg) {
>>>>> +                /* Clear the message bits that are handled */
>>>>> +                I915_WRITE(SOFT_SCRATCH(15),
>>>>> +                    I915_READ(SOFT_SCRATCH(15)) & ~msg);
>>>>
>>>> Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it
>>>> twice?
>>>>
>>> Thought reading it again (just before the update) is bit safer compared
>>> to reading it once, as there is a potential race problem here.
>>> GuC could also write to the SOFT_SCRATCH(15) register, set new events
>>> bit, while Host clears off the bit of handled events.
>>
>> Don't get it. If there is a race between read and write there still is,
>> don't see how a second read makes it safer.
>>
> Yes can't avoid the race completely by double reads, but can reduce the
> race window size.

There was only one thing between the two reads, and that was "if (msg)":

  +            u32 msg = I915_READ(SOFT_SCRATCH(15)) &
  +                    (GUC2HOST_MSG_CRASH_DUMP_POSTED |
  +                     GUC2HOST_MSG_FLUSH_LOG_BUFFER);

  +            if (msg) {

  +                /* Clear the message bits that are handled */
  +                I915_WRITE(SOFT_SCRATCH(15),
  +                    I915_READ(SOFT_SCRATCH(15)) & ~msg);

>
> Also I felt code looked better in current form, as macros
> GUC2HOST_MSG_CRASH_DUMP_POSTED & GUC2HOST_MSG_FLUSH_LOG_BUFFER were used
> only once.
>
> Will change as per the initial implementation.
>
>      u32 msg = I915_READ(SOFT_SCRATCH(15));
>      if (msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED |
>             GUC2HOST_MSG_FLUSH_LOG_BUFFER) {
>          msg &= ~(GUC2HOST_MSG_CRASH_DUMP_POSTED |
>               GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>          I915_WRITE(SOFT_SCRATCH(15), msg);
>      }

Or:
	u32 msg, flush;

	msg = I915_READ(SOFT_SCRATCH(15));
	flush = msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED | 
GUC2HOST_MSG_FLUSH_LOG_BUFFER);
	if (flush) {
		I915_WRITE(SOFT_SCRATCH(15), ~flush);
		... queue woker ...

?

>
>>>> Also, is the RMW outside any locks safe?
>>>>
>>>
>>> Ideally need a way to atomically do the RMW, i.e. read the register
>>> value, clear off the handled events bit and update the register with the
>>> modified value.
>>>
>>> Please kindly suggest how to address the above.
>>> Or can this be left as a TODO, when we do start handling other events
>>> also.
>>
>> From the comment in code above it sounds like a GuC fw interface
>> shortcoming - that there is a single bit for two different interrupt
>> sources, is that right?
> Yes that shortcoming is there, GUC2HOST_MSG_FLUSH_LOG_BUFFER bit is used
> for conveying the flush for ISR & DPC log buffers.
>
>> Is there any other register or something that
>> you can read to detect that the interrupt has been re-asserted while in
>> the irq handler?
>
>
>> Although I thought you said before that the GuC will
>> not do that - that it won't re-assert the interrupt before we send the
>> flush command.
> Yes that is the case, but with respect to one type of a log buffer, like
> for example unless GuC firmware receives the ack for DPC log
> buffer it won't send a new flush for DPC buffer, but if meanwhile ISR
> buffer becomes half full it will send a flush interrupt.

So the potential for losing interrupts is unavoidable it seems. :( If I 
am understanding this correctly.

Regards,

Tvrtko