[Intel-gfx] [PATCH 05/20] drm/i915: Support for GuC interrupts

Goel, Akash akash.goel at intel.com
Fri Aug 12 15:32:38 UTC 2016



On 8/12/2016 8:35 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 15:31, Goel, Akash wrote:
>> On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote:
>>>>>> +static void gen9_guc2host_events_work(struct work_struct *work)
>>>>>> +{
>>>>>> +    struct drm_i915_private *dev_priv =
>>>>>> +        container_of(work, struct drm_i915_private,
>>>>>> guc.events_work);
>>>>>> +
>>>>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>>>>> +    /* Speed up work cancellation during disabling guc
>>>>>> interrupts. */
>>>>>> +    if (!dev_priv->guc.interrupts_enabled) {
>>>>>> +        spin_unlock_irq(&dev_priv->irq_lock);
>>>>>> +        return;
>>>>>
>>>>> I suppose locking for early exit is something about ensuring the
>>>>> worker
>>>>> sees the update to dev_priv->guc.interrupts_enabled done on another
>>>>> CPU?
>>>>
>>>> Yes locking (providing implicit barrier) will ensure that update made
>>>> from another CPU is immediately visible to the worker.
>>>
>>> What if the disable happens after the unlock above? It would wait in
>>> disable until the irq handler exits.
>> Most probably it will not have to wait, as irq handler would have
>> completed if work item began the execution.
>> Irq handler just queues the work item, which gets scheduled later on.
>>
>> Using the lock is beneficial for the case where the execution of work
>> item and interrupt disabling is done around the same time.
>
> Ok maybe I am missing something.
>
> When can the interrupt disabling happen? Will it be controlled by the
> debugfs file or is it driver load/unload and suspend/resume?
>
yes disabling will happen for all the above 3 scenarios.

>>>>>> +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
>>>>>> u32 gt_iir)
>>>>>> +{
>>>>>> +    bool interrupts_enabled;
>>>>>> +
>>>>>> +    if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
>>>>>> +        spin_lock(&dev_priv->irq_lock);
>>>>>> +        interrupts_enabled = dev_priv->guc.interrupts_enabled;
>>>>>> +        spin_unlock(&dev_priv->irq_lock);
>>>>>
>>>>> Not sure that taking a lock around only this read is needed.
>>>>>
>>>> Again same reason as above, to make sure an update made on another CPU
>>>> is immediately visible to the irq handler.
>>>
>>> I don't get it, see above. :)
>>
>> Here also If interrupt disabling & ISR execution happens around the same
>> time then ISR might miss the reset of 'interrupts_enabled' flag and
>> queue the new work.
>
> What if reset of interrupts_enabled happens just as the ISR releases the
> lock?
>
Then ISR will proceed ahead and queue the work item.

Lock is useful if reset of interrupts_enabled flag just happens before 
the ISR inspects the value of that flag.
Also lock will help when interrupts_enabled flag is set again, next ISR 
will definitely see it as set.

>> And same applies to the case when interrupt is re-enabled, ISR might
>> still see the 'interrupts_enabled' flag as false.
>> It will eventually see the update though.
>>
>>>
>>>>>> +        if (interrupts_enabled) {
>>>>>> +            /* Sample the log buffer flush related bits & clear them
>>>>>> +             * out now itself from the message identity register to
>>>>>> +             * minimize the probability of losing a flush interrupt,
>>>>>> +             * when there are back to back flush interrupts.
>>>>>> +             * There can be a new flush interrupt, for different log
>>>>>> +             * buffer type (like for ISR), whilst Host is handling
>>>>>> +             * one (for DPC). Since same bit is used in message
>>>>>> +             * register for ISR & DPC, it could happen that GuC
>>>>>> +             * sets the bit for 2nd interrupt but Host clears out
>>>>>> +             * the bit on handling the 1st interrupt.
>>>>>> +             */
>>>>>> +            u32 msg = I915_READ(SOFT_SCRATCH(15)) &
>>>>>> +                    (GUC2HOST_MSG_CRASH_DUMP_POSTED |
>>>>>> +                     GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>>>>>> +            if (msg) {
>>>>>> +                /* Clear the message bits that are handled */
>>>>>> +                I915_WRITE(SOFT_SCRATCH(15),
>>>>>> +                    I915_READ(SOFT_SCRATCH(15)) & ~msg);
>>>>>
>>>>> Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it
>>>>> twice?
>>>>>
>>>> Thought reading it again (just before the update) is bit safer compared
>>>> to reading it once, as there is a potential race problem here.
>>>> GuC could also write to the SOFT_SCRATCH(15) register, set new events
>>>> bit, while Host clears off the bit of handled events.
>>>
>>> Don't get it. If there is a race between read and write there still is,
>>> don't see how a second read makes it safer.
>>>
>> Yes can't avoid the race completely by double reads, but can reduce the
>> race window size.
>
> There was only one thing between the two reads, and that was "if (msg)":
>
>  +            u32 msg = I915_READ(SOFT_SCRATCH(15)) &
>  +                    (GUC2HOST_MSG_CRASH_DUMP_POSTED |
>  +                     GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>
>  +            if (msg) {
>
>  +                /* Clear the message bits that are handled */
>  +                I915_WRITE(SOFT_SCRATCH(15),
>  +                    I915_READ(SOFT_SCRATCH(15)) & ~msg);
>
>>
>> Also I felt code looked better in current form, as macros
>> GUC2HOST_MSG_CRASH_DUMP_POSTED & GUC2HOST_MSG_FLUSH_LOG_BUFFER were used
>> only once.
>>
>> Will change as per the initial implementation.
>>
>>      u32 msg = I915_READ(SOFT_SCRATCH(15));
>>      if (msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED |
>>             GUC2HOST_MSG_FLUSH_LOG_BUFFER) {
>>          msg &= ~(GUC2HOST_MSG_CRASH_DUMP_POSTED |
>>               GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>>          I915_WRITE(SOFT_SCRATCH(15), msg);
>>      }
>
> Or:
>     u32 msg, flush;
>
>     msg = I915_READ(SOFT_SCRATCH(15));
>     flush = msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED |
> GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>     if (flush) {
>         I915_WRITE(SOFT_SCRATCH(15), ~flush);
>         ... queue woker ...
>
> ?
Thanks much, will change as per the above.

>
>>
>>>>> Also, is the RMW outside any locks safe?
>>>>>
>>>>
>>>> Ideally need a way to atomically do the RMW, i.e. read the register
>>>> value, clear off the handled events bit and update the register with
>>>> the
>>>> modified value.
>>>>
>>>> Please kindly suggest how to address the above.
>>>> Or can this be left as a TODO, when we do start handling other events
>>>> also.
>>>
>>> From the comment in code above it sounds like a GuC fw interface
>>> shortcoming - that there is a single bit for two different interrupt
>>> sources, is that right?
>> Yes that shortcoming is there, GUC2HOST_MSG_FLUSH_LOG_BUFFER bit is used
>> for conveying the flush for ISR & DPC log buffers.
>>
>>> Is there any other register or something that
>>> you can read to detect that the interrupt has been re-asserted while in
>>> the irq handler?
>>
>>
>>> Although I thought you said before that the GuC will
>>> not do that - that it won't re-assert the interrupt before we send the
>>> flush command.
>> Yes that is the case, but with respect to one type of a log buffer, like
>> for example unless GuC firmware receives the ack for DPC log
>> buffer it won't send a new flush for DPC buffer, but if meanwhile ISR
>> buffer becomes half full it will send a flush interrupt.
>
> So the potential for losing interrupts is unavoidable it seems. :( If I
> am understanding this correctly.
>
Yes unavoidable, but that's a very small window.
Have apprised GuC folks about this.

Best Regards
Akash

> Regards,
>
> Tvrtko


More information about the Intel-gfx mailing list