[Intel-gfx] [PATCH 05/20] drm/i915: Support for GuC interrupts

Fri Aug 12 14:31:52 UTC 2016

On 8/12/2016 7:01 PM, Tvrtko Ursulin wrote:
>
> On 12/08/16 14:10, Goel, Akash wrote:
>> On 8/12/2016 5:24 PM, Tvrtko Ursulin wrote:
>>>
>>> On 12/08/16 07:25, akash.goel at intel.com wrote:
>>>> From: Sagar Arun Kamble <sagar.a.kamble at intel.com>
>>>>
>>>> There are certain types of interrupts which Host can recieve from GuC.
>>>> GuC ukernel sends an interrupt to Host for certain events, like for
>>>> example retrieve/consume the logs generated by ukernel.
>>>> This patch adds support to receive interrupts from GuC but currently
>>>> enables & partially handles only the interrupt sent by GuC ukernel.
>>>> Future patches will add support for handling other interrupt types.
>>>>
>>>> v2:
>>>> - Use common low level routines for PM IER/IIR programming (Chris)
>>>> - Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
>>>> - Replace disabling of wake ref asserts with rpm get/put (Chris)
>>>>
>>>> v3:
>>>> - Update comments for more clarity. (Tvrtko)
>>>> - Remove the masking of GuC interrupt, which was kept masked till the
>>>>    start of bottom half, its not really needed as there is only a
>>>>    single instance of work item & wq is ordered. (Tvrtko)
>>>>
>>>> v4:
>>>> - Rebase.
>>>> - Rename guc_events to pm_guc_events so as to be indicative of the
>>>>    register/control block it is associated with. (Chris)
>>>> - Add handling for back to back log buffer flush interrupts.
>>>>
>>>> v5:
>>>> - Move the read & clearing of register, containing Guc2Host message
>>>>    bits, outside the irq spinlock. (Tvrtko)
>>>>
>>>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble at intel.com>
>>>> Signed-off-by: Akash Goel <akash.goel at intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/i915_drv.h            |   1 +
>>>>   drivers/gpu/drm/i915/i915_guc_submission.c |   5 ++
>>>>   drivers/gpu/drm/i915/i915_irq.c            | 100
>>>> +++++++++++++++++++++++++++--
>>>>   drivers/gpu/drm/i915/i915_reg.h            |  11 ++++
>>>>   drivers/gpu/drm/i915/intel_drv.h           |   3 +
>>>>   drivers/gpu/drm/i915/intel_guc.h           |   4 ++
>>>>   drivers/gpu/drm/i915/intel_guc_loader.c    |   4 ++
>>>>   7 files changed, 124 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>> index a608a5c..28ffac5 100644
>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>> @@ -1779,6 +1779,7 @@ struct drm_i915_private {
>>>>       u32 pm_imr;
>>>>       u32 pm_ier;
>>>>       u32 pm_rps_events;
>>>> +    u32 pm_guc_events;
>>>>       u32 pipestat_irq_mask[I915_MAX_PIPES];
>>>>
>>>>       struct i915_hotplug hotplug;
>>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> index ad3b55f..c7c679f 100644
>>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>> @@ -1071,6 +1071,8 @@ int intel_guc_suspend(struct drm_device *dev)
>>>>       if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
>>>>           return 0;
>>>>
>>>> +    gen9_disable_guc_interrupts(dev_priv);
>>>> +
>>>>       ctx = dev_priv->kernel_context;
>>>>
>>>>       data[0] = HOST2GUC_ACTION_ENTER_S_STATE;
>>>> @@ -1097,6 +1099,9 @@ int intel_guc_resume(struct drm_device *dev)
>>>>       if (guc->guc_fw.guc_fw_load_status != GUC_FIRMWARE_SUCCESS)
>>>>           return 0;
>>>>
>>>> +    if (i915.guc_log_level >= 0)
>>>> +        gen9_enable_guc_interrupts(dev_priv);
>>>> +
>>>>       ctx = dev_priv->kernel_context;
>>>>
>>>>       data[0] = HOST2GUC_ACTION_EXIT_S_STATE;
>>>> diff --git a/drivers/gpu/drm/i915/i915_irq.c
>>>> b/drivers/gpu/drm/i915/i915_irq.c
>>>> index 5f93309..5f1974f 100644
>>>> --- a/drivers/gpu/drm/i915/i915_irq.c
>>>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>>>> @@ -170,6 +170,7 @@ static void gen5_assert_iir_is_zero(struct
>>>> drm_i915_private *dev_priv,
>>>>   } while (0)
>>>>
>>>>   static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv,
>>>> u32 pm_iir);
>>>> +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
>>>> u32 pm_iir);
>>>>
>>>>   /* For display hotplug interrupt */
>>>>   static inline void
>>>> @@ -411,6 +412,38 @@ void gen6_disable_rps_interrupts(struct
>>>> drm_i915_private *dev_priv)
>>>>       gen6_reset_rps_interrupts(dev_priv);
>>>>   }
>>>>
>>>> +void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv)
>>>> +{
>>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>>> +    gen6_reset_pm_iir(dev_priv, dev_priv->pm_guc_events);
>>>> +    spin_unlock_irq(&dev_priv->irq_lock);
>>>> +}
>>>> +
>>>> +void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv)
>>>> +{
>>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>>> +    if (!dev_priv->guc.interrupts_enabled) {
>>>> +        WARN_ON_ONCE(I915_READ(gen6_pm_iir(dev_priv)) &
>>>> +                        dev_priv->pm_guc_events);
>>>> +        dev_priv->guc.interrupts_enabled = true;
>>>> +        gen6_enable_pm_irq(dev_priv, dev_priv->pm_guc_events);
>>>> +    }
>>>> +    spin_unlock_irq(&dev_priv->irq_lock);
>>>> +}
>>>> +
>>>> +void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv)
>>>> +{
>>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>>> +    dev_priv->guc.interrupts_enabled = false;
>>>> +
>>>> +    gen6_disable_pm_irq(dev_priv, dev_priv->pm_guc_events);
>>>> +
>>>> +    spin_unlock_irq(&dev_priv->irq_lock);
>>>> +    synchronize_irq(dev_priv->drm.irq);
>>>> +
>>>> +    gen9_reset_guc_interrupts(dev_priv);
>>>> +}
>>>> +
>>>>   /**
>>>>    * bdw_update_port_irq - update DE port interrupt
>>>>    * @dev_priv: driver private
>>>> @@ -1167,6 +1200,21 @@ static void gen6_pm_rps_work(struct work_struct
>>>> *work)
>>>>       mutex_unlock(&dev_priv->rps.hw_lock);
>>>>   }
>>>>
>>>> +static void gen9_guc2host_events_work(struct work_struct *work)
>>>> +{
>>>> +    struct drm_i915_private *dev_priv =
>>>> +        container_of(work, struct drm_i915_private, guc.events_work);
>>>> +
>>>> +    spin_lock_irq(&dev_priv->irq_lock);
>>>> +    /* Speed up work cancellation during disabling guc interrupts. */
>>>> +    if (!dev_priv->guc.interrupts_enabled) {
>>>> +        spin_unlock_irq(&dev_priv->irq_lock);
>>>> +        return;
>>>
>>> I suppose locking for early exit is something about ensuring the worker
>>> sees the update to dev_priv->guc.interrupts_enabled done on another CPU?
>>
>> Yes locking (providing implicit barrier) will ensure that update made
>> from another CPU is immediately visible to the worker.
>
> What if the disable happens after the unlock above? It would wait in
> disable until the irq handler exits.
Most probably it will not have to wait, as irq handler would have 
completed if work item began the execution.
Irq handler just queues the work item, which gets scheduled later on.

Using the lock is beneficial for the case where the execution of work 
item and interrupt disabling is done around the same time.

> So the same as if not bothering
> with the spinlock above, no?
>
>>> synchronize_irq there is not enough for some reason?
>>>
>> synchronize_irq would not be enough, its for a different purpose, to
>> ensure that any ongoing handling of irq completes (after the caller has
>> disabled the irq).
>>
>> As per my understanding synchronize_irq won't have an effect on the
>> worker, with respect to the moment when the update of
>> 'interrupts_enabled' flag is visible to the worker.
>>
>>>> +    }
>>>> +    spin_unlock_irq(&dev_priv->irq_lock);
>>>> +
>>>> +    /* TODO: Handle the events for which GuC interrupted host */
>>>> +}
>>>>
>>>>   /**
>>>>    * ivybridge_parity_work - Workqueue called when a parity error
>>>> interrupt
>>>> @@ -1339,11 +1387,13 @@ static irqreturn_t gen8_gt_irq_ack(struct
>>>> drm_i915_private *dev_priv,
>>>>               DRM_ERROR("The master control interrupt lied (GT3)!\n");
>>>>       }
>>>>
>>>> -    if (master_ctl & GEN8_GT_PM_IRQ) {
>>>> +    if (master_ctl & (GEN8_GT_PM_IRQ | GEN8_GT_GUC_IRQ)) {
>>>>           gt_iir[2] = I915_READ_FW(GEN8_GT_IIR(2));
>>>> -        if (gt_iir[2] & dev_priv->pm_rps_events) {
>>>> +        if (gt_iir[2] & (dev_priv->pm_rps_events |
>>>> +                 dev_priv->pm_guc_events)) {
>>>>               I915_WRITE_FW(GEN8_GT_IIR(2),
>>>> -                      gt_iir[2] & dev_priv->pm_rps_events);
>>>> +                      gt_iir[2] & (dev_priv->pm_rps_events |
>>>> +                           dev_priv->pm_guc_events));
>>>>               ret = IRQ_HANDLED;
>>>>           } else
>>>>               DRM_ERROR("The master control interrupt lied (PM)!\n");
>>>> @@ -1375,6 +1425,9 @@ static void gen8_gt_irq_handler(struct
>>>> drm_i915_private *dev_priv,
>>>>
>>>>       if (gt_iir[2] & dev_priv->pm_rps_events)
>>>>           gen6_rps_irq_handler(dev_priv, gt_iir[2]);
>>>> +
>>>> +    if (gt_iir[2] & dev_priv->pm_guc_events)
>>>> +        gen9_guc_irq_handler(dev_priv, gt_iir[2]);
>>>>   }
>>>>
>>>>   static bool bxt_port_hotplug_long_detect(enum port port, u32 val)
>>>> @@ -1621,6 +1674,41 @@ static void gen6_rps_irq_handler(struct
>>>> drm_i915_private *dev_priv, u32 pm_iir)
>>>>       }
>>>>   }
>>>>
>>>> +static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv,
>>>> u32 gt_iir)
>>>> +{
>>>> +    bool interrupts_enabled;
>>>> +
>>>> +    if (gt_iir & GEN9_GUC_TO_HOST_INT_EVENT) {
>>>> +        spin_lock(&dev_priv->irq_lock);
>>>> +        interrupts_enabled = dev_priv->guc.interrupts_enabled;
>>>> +        spin_unlock(&dev_priv->irq_lock);
>>>
>>> Not sure that taking a lock around only this read is needed.
>>>
>> Again same reason as above, to make sure an update made on another CPU
>> is immediately visible to the irq handler.
>
> I don't get it, see above. :)

Here also If interrupt disabling & ISR execution happens around the same 
time then ISR might miss the reset of 'interrupts_enabled' flag and 
queue the new work.
And same applies to the case when interrupt is re-enabled, ISR might 
still see the 'interrupts_enabled' flag as false.
It will eventually see the update though.

>
>>>> +        if (interrupts_enabled) {
>>>> +            /* Sample the log buffer flush related bits & clear them
>>>> +             * out now itself from the message identity register to
>>>> +             * minimize the probability of losing a flush interrupt,
>>>> +             * when there are back to back flush interrupts.
>>>> +             * There can be a new flush interrupt, for different log
>>>> +             * buffer type (like for ISR), whilst Host is handling
>>>> +             * one (for DPC). Since same bit is used in message
>>>> +             * register for ISR & DPC, it could happen that GuC
>>>> +             * sets the bit for 2nd interrupt but Host clears out
>>>> +             * the bit on handling the 1st interrupt.
>>>> +             */
>>>> +            u32 msg = I915_READ(SOFT_SCRATCH(15)) &
>>>> +                    (GUC2HOST_MSG_CRASH_DUMP_POSTED |
>>>> +                     GUC2HOST_MSG_FLUSH_LOG_BUFFER);
>>>> +            if (msg) {
>>>> +                /* Clear the message bits that are handled */
>>>> +                I915_WRITE(SOFT_SCRATCH(15),
>>>> +                    I915_READ(SOFT_SCRATCH(15)) & ~msg);
>>>
>>> Cache full value of SOFT_SCRATCH(15) so you don't have to mmio read it
>>> twice?
>>>
>> Thought reading it again (just before the update) is bit safer compared
>> to reading it once, as there is a potential race problem here.
>> GuC could also write to the SOFT_SCRATCH(15) register, set new events
>> bit, while Host clears off the bit of handled events.
>
> Don't get it. If there is a race between read and write there still is,
> don't see how a second read makes it safer.
>
Yes can't avoid the race completely by double reads, but can reduce the 
race window size.

Also I felt code looked better in current form, as macros
GUC2HOST_MSG_CRASH_DUMP_POSTED & GUC2HOST_MSG_FLUSH_LOG_BUFFER were used 
only once.

Will change as per the initial implementation.

	u32 msg = I915_READ(SOFT_SCRATCH(15));
	if (msg & (GUC2HOST_MSG_CRASH_DUMP_POSTED |
		   GUC2HOST_MSG_FLUSH_LOG_BUFFER) {
		msg &= ~(GUC2HOST_MSG_CRASH_DUMP_POSTED |
			 GUC2HOST_MSG_FLUSH_LOG_BUFFER);
		I915_WRITE(SOFT_SCRATCH(15), msg);
	}

>>> Also, is the RMW outside any locks safe?
>>>
>>
>> Ideally need a way to atomically do the RMW, i.e. read the register
>> value, clear off the handled events bit and update the register with the
>> modified value.
>>
>> Please kindly suggest how to address the above.
>> Or can this be left as a TODO, when we do start handling other events
>> also.
>
> From the comment in code above it sounds like a GuC fw interface
> shortcoming - that there is a single bit for two different interrupt
> sources, is that right?
Yes that shortcoming is there, GUC2HOST_MSG_FLUSH_LOG_BUFFER bit is used 
for conveying the flush for ISR & DPC log buffers.

> Is there any other register or something that
> you can read to detect that the interrupt has been re-asserted while in
> the irq handler?

> Although I thought you said before that the GuC will
> not do that - that it won't re-assert the interrupt before we send the
> flush command.
Yes that is the case, but with respect to one type of a log buffer, like 
for example unless GuC firmware receives the ack for DPC log
buffer it won't send a new flush for DPC buffer, but if meanwhile ISR 
buffer becomes half full it will send a flush interrupt.

>
>>>> +
>>>> +                /* Handle flush interrupt event in bottom half */
>>>> +                queue_work(dev_priv->wq, &dev_priv->guc.events_work);
>>>
>>> IMHO it would be nicer if the code started straight away with a final wq
>>> solution.
>>>
>>> Especially since the next patch in the series is called "Handle log
>>> buffer flush interrupt event from GuC" and the actual handling of the
>>> log buffer flush interrupt is split between this one
>>> (GUC2HOST_MSG_FLUSH_LOG_BUFFER above) and that one.
>>>
>>> So it would almost be nicer that the above chunk which handles
>>> GUC2HOST_MSG_FLUSH_LOG_BUFFER and the worker init is only added in the
>>> next patch and this one only does the generic bits.
>>>
>>
>> Fine will move the log buffer flush interrupt event related stuff to the
>> next patch and so irq handler in this patch will just be a
>> placeholder.
>
> Great thanks!
>
>>> I don't know.. I'll leave it on your conscience - if you think the split
>>> (series) can't be done any nicer or it makes sense to have it in this
>>> order then ok.
>>>
>>>> +            }
>>>
>>> Mabye:
>>>
>>>     } else
>>>
>>> And log something unexpected has happened in the !msg case?
>>>
>>> Since it won't clear the message in that case so would it keep
>>> triggering?
>>>
>>
>> Actually after enabling of GuC interrupt, there can be interrupts from
>> GuC side for some other events which are right now not handled by Host.
>>
>> But not clearing of unhandled event bits won't result in re-triggering
>> of the interrupt.
>
> Ok I suggest documenting that as a comment in code then.
>
Fine will add a comment in the else case.
> Regards,
>
> Tvrtko