[Intel-gfx] [PATCH 19/19] drm/i915: Sync against the GuC log buffer flush work item on system suspend

Goel, Akash akash.goel at intel.com
Wed Aug 17 15:37:17 UTC 2016



On 8/17/2016 6:41 PM, Imre Deak wrote:
> On ke, 2016-08-17 at 18:15 +0530, Goel, Akash wrote:
>>
>> On 8/17/2016 5:11 PM, Chris Wilson wrote:
>>> On Wed, Aug 17, 2016 at 12:27:30PM +0100, Tvrtko Ursulin wrote:
>>>>
>>>> On 17/08/16 11:14, akash.goel at intel.com wrote:
>>>>> From: Akash Goel <akash.goel at intel.com>
>>>>>
>>>>> The GuC log buffer flush work item does a register access to send the ack
>>>>> to GuC and this work item, if not synced before suspend, can potentially
>>>>> get executed after the GFX device is suspended.
>>>>> The work item function uses rpm_get/rpm_put calls around the Hw access,
>>>>> this covers the runtime suspend case but for system suspend case (which can
>>>>> be done asychronously/forcefully) sync would be required as kernel can
>>>>> potentially schedule the work items even after some devices, including GFX,
>>>>> have been put to suspend.
>>>>> Also sync has to be done conditionally i.e. only for the system suspend
>>>>> case, as sync along with rpm_get/rpm_put calls can cause a deadlock for rpm
>>>>> suspend path.
>>>>>
>>>>> Cc: Imre Deak <imre.deak at intel.com>
>>>>> Signed-off-by: Akash Goel <akash.goel at intel.com>
>>>>> ---
>>>>>  drivers/gpu/drm/i915/i915_drv.c            | 4 ++--
>>>>>  drivers/gpu/drm/i915/i915_guc_submission.c | 8 +++++++-
>>>>>  drivers/gpu/drm/i915/intel_guc.h           | 2 +-
>>>>>  3 files changed, 10 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>>>>> index cdee60b..2ae0ad4 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>>>> @@ -1427,7 +1427,7 @@ static int i915_drm_suspend(struct drm_device *dev)
>>>>>  		goto out;
>>>>>  	}
>>>>>
>>>>> -	intel_guc_suspend(dev);
>>>>> +	intel_guc_suspend(dev, false);
>>>>>
>>>>>  	intel_display_suspend(dev);
>>>>>
>>>>> @@ -2321,7 +2321,7 @@ static int intel_runtime_suspend(struct device *device)
>>>>>  	i915_gem_release_all_mmaps(dev_priv);
>>>>>  	mutex_unlock(&dev->struct_mutex);
>>>>>
>>>>> -	intel_guc_suspend(dev);
>>>>> +	intel_guc_suspend(dev, true);
>>>>>
>>>>>  	intel_runtime_pm_disable_interrupts(dev_priv);
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>>> index ef0c116..1af8a8b 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>>>>> @@ -1519,7 +1519,7 @@ void i915_guc_submission_fini(struct drm_i915_private *dev_priv)
>>>>>   * intel_guc_suspend() - notify GuC entering suspend state
>>>>>   * @dev:	drm device
>>>>>   */
>>>>> -int intel_guc_suspend(struct drm_device *dev)
>>>>> +int intel_guc_suspend(struct drm_device *dev, bool rpm_suspend)
>>>>>  {
>>>>>  	struct drm_i915_private *dev_priv = to_i915(dev);
>>>>>  	struct intel_guc *guc = &dev_priv->guc;
>>>>> @@ -1530,6 +1530,12 @@ int intel_guc_suspend(struct drm_device *dev)
>>>>>  		return 0;
>>>>>
>>>>>  	gen9_disable_guc_interrupts(dev_priv);
>>>>> +	/* Sync is needed only for the system suspend case, runtime suspend
>>>>> +	 * case is covered due to rpm get/put calls used around Hw access in
>>>>> +	 * the work item function.
>>>>> +	 */
>>>>> +	if (!rpm_suspend && (i915.guc_log_level >= 0))
>>>>> +		flush_work(&dev_priv->guc.log.flush_work);
>>>
>>> In which case (rpm suspend) the flush_work is idle and this a noop. That
>>> you have to pass around such state suggests that you are papering over a
>>> bug?
>> In case of rpm suspend the flush_work may not be a NOOP.
>> Can use the flush_work for runtime suspend also but in spite of that
>> can't prevent the 'RPM wakelock' asserts, as the work item can get
>> executed after the rpm ref count drops to zero and before runtime
>> suspend kicks in (after autosuspend delay).
>>
>> For that you had earlier suggested to use rpm get/put in the work item
>> function, around the register access, but with that had to remove the
>> flush_work from the suspend hook, otherwise a deadlock can happen.
>> So doing the flush_work conditionally for system suspend case, as rpm
>> get/put won't cause the resume of device in that case.
>>
>> Actually I had discussed about this with Imre and as per his inputs
>> prepared this patch.
>
> There would be this alternative:
>
Thanks much for suggesting the alternate approach.

Just to confirm whether I understood everything correctly,

> in gen9_guc_irq_handler():
>    WARN_ON(!intel_runtime_pm_get_if_in_use());
Used WARN, as we don't expect the device to be suspended at this 
juncture, so intel_runtime_pm_get_if_in_use() should return true.

>    if (!queue_work(log.flush_work))
If queue_work returns 0, then work item is already pending, so it won't
be queued hence can release the rpm ref count now only.
>        intel_runtime_pm_put();

>
> and dropping the reference at the end of the work item.
This will be just like the __intel_autoenable_gt_powersave

> This would make the flush_work() a nop in case of runtime_suspend().
So can call the flush_work unconditionally.

Hope I understood it correctly.

Best regards
Akash

> --Imre
>


More information about the Intel-gfx mailing list