[Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend
Goel, Akash
akash.goel at intel.com
Wed Aug 17 15:37:17 UTC 2016
On 8/17/2016 6:41 PM, Imre Deak wrote:
> On ke, 2016-08-17 at 18:15 +0530, Goel, Akash wrote:
>>
>> On 8/17/2016 5:11 PM, Chris Wilson wrote:
>>> On Wed, Aug 17, 2016 at 12:27:30PM +0100, Tvrtko Ursulin wrote:
>>>>
>>>> On 17/08/16 11:14, akash.goel at intel.com wrote:
>>>>> From: Akash Goel <akash.goel at intel.com>
>>>>>
>>>>> The GuC log buffer flush work item does a register access to send the ack
>>>>> to GuC and this work item, if not synced before suspend, can potentially
>>>>> get executed after the GFX device is suspended.
>>>>> The work item function uses rpm_get/rpm_put calls around the Hw access,
>>>>> this covers the runtime suspend case but for system suspend case (which can
>>>>> be done asychronously/forcefully) sync would be required as kernel can
>>>>> potentially schedule the work items even after some devices, including GFX,
>>>>> have been put to suspend.
>>>>> Also sync has to be done conditionally i.e. only for the system suspend
>>>>> case, as sync along with rpm_get/rpm_put calls can cause a deadlock for rpm
>>>>> suspend path.
>>>>>
>>>>> Cc: Imre Deak <imre.deak at intel.com>
>>>>> Signed-off-by: Akash Goel <akash.goel at intel.com>
>>>>> ---
>>>>> drivers/gpu/drm/i915/i915_drv.c | 4 ++--
>>>>> drivers/gpu/drm/i915/i915_guc_submission.c | 8 +++++++-
>>>>> drivers/gpu/drm/i915/intel_guc.h | 2 +-
>>>>> 3 files changed, 10 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>>>>> index cdee60b..2ae0ad4 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>>>> @@ -1427,7 +1427,7 @@ static int i915_drm_suspend(struct drm_device *dev)
>>>>> goto out;
>>>>> }
>>>>>
>>>>> - intel_guc_suspend(dev);
>>>>> + intel_guc_suspend(dev, false);
>>>>>
>>>>> intel_display_suspend(dev);
>>>>>
>>>>> @@ -2321,7 +2321,7 @@ static int intel_runtime_suspend(struct device *device)
>>>>> i915_gem_release_all_mmaps(dev_priv);
>>>>> mutex_unlock(&dev->struct_mutex);
>>>>>
>>>>> - intel_guc_suspend(dev);
>>>>> + intel_guc_suspend(dev, true);
>>>>>
>>>>> intel_runtime_pm_disable_interrupts(dev_priv);
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>>> index ef0c116..1af8a8b 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>>> @@ -1519,7 +1519,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
>>>>> * intel_guc_suspend() - notify GuC entering suspend state
>>>>> * @dev: drm device
>>>>> */
>>>>> -int intel_guc_suspend(struct drm_device *dev)
>>>>> +int intel_guc_suspend(struct drm_device *dev, bool rpm_suspend)
>>>>> {
>>>>> struct drm_i915_private *dev_priv = to_i915(dev);
>>>>> struct intel_guc *guc = &dev_priv->guc;
>>>>> @@ -1530,6 +1530,12 @@ int intel_guc_suspend(struct drm_device *dev)
>>>>> return 0;
>>>>>
>>>>> gen9_disable_guc_interrupts(dev_priv);
>>>>> + /* Sync is needed only for the system suspend case, runtime suspend
>>>>> + * case is covered due to rpm get/put calls used around Hw access in
>>>>> + * the work item function.
>>>>> + */
>>>>> + if (!rpm_suspend && (i915.guc_log_level >= 0))
>>>>> + flush_work(&dev_priv->guc.log.flush_work);
>>>
>>> In which case (rpm suspend) the flush_work is idle and this a noop. That
>>> you have to pass around such state suggests that you are papering over a
>>> bug?
>> In case of rpm suspend the flush_work may not be a NOOP.
>> Can use the flush_work for runtime suspend also but in spite of that
>> can't prevent the 'RPM wakelock' asserts, as the work item can get
>> executed after the rpm ref count drops to zero and before runtime
>> suspend kicks in (after autosuspend delay).
>>
>> For that you had earlier suggested to use rpm get/put in the work item
>> function, around the register access, but with that had to remove the
>> flush_work from the suspend hook, otherwise a deadlock can happen.
>> So doing the flush_work conditionally for system suspend case, as rpm
>> get/put won't cause the resume of device in that case.
>>
>> Actually I had discussed about this with Imre and as per his inputs
>> prepared this patch.
>
> There would be this alternative:
>
Thanks much for suggesting the alternate approach.
Just to confirm whether I understood everything correctly,
> in gen9_guc_irq_handler():
> WARN_ON(!intel_runtime_pm_get_if_in_use());
Used WARN, as we don't expect the device to be suspended at this
juncture, so intel_runtime_pm_get_if_in_use() should return true.
> if (!queue_work(log.flush_work))
If queue_work returns 0, then work item is already pending, so it won't
be queued hence can release the rpm ref count now only.
> intel_runtime_pm_put();
>
> and dropping the reference at the end of the work item.
This will be just like the __intel_autoenable_gt_powersave
> This would make the flush_work() a nop in case of runtime_suspend().
So can call the flush_work unconditionally.
Hope I understood it correctly.
Best regards
Akash
> --Imre
>
More information about the Intel-gfx
mailing list