[Intel-gfx] [PATCH v2] PCI / PM: tune down RPM suspend error message with EBUSY and EAGAIN retval
Rafael J. Wysocki
rafael.j.wysocki at intel.com
Fri Nov 27 06:44:48 PST 2015
On 11/27/2015 12:39 PM, Jani Nikula wrote:
> On Wed, 18 Nov 2015, Daniel Vetter <daniel at ffwll.ch> wrote:
>> On Wed, Nov 18, 2015 at 03:28:38PM +0200, Imre Deak wrote:
>>> On ke, 2015-11-18 at 12:56 +0200, Imre Deak wrote:
>>>> The runtime PM core doesn't treat EBUSY and EAGAIN retvals from the driver
>>>> suspend hooks as errors, but they still show up as errors in dmesg. Tune
>>>> them down.
>>>>
>>>> One problem caused by this was noticed by Daniel: the i915 driver
>>>> returns EAGAIN to signal a temporary failure to suspend and as a request
>>>> towards the RPM core for scheduling a suspend again. This is a normal
>>>> event, but the resulting error message flags a breakage during the
>>>> driver's automated testing which parses dmesg and picks up the error.
>>>>
>>>> v2:
>>>> - fix compile breake when CONFIG_PM_SLEEP=n (0-day builder)
>>>>
>>>> Reported-by: Daniel Vetter <daniel.vetter at intel.com>
>>>> Signed-off-by: Imre Deak <imre.deak at intel.com>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92992
>> Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>
>>
>> Rafael, can you please pick this up for 4.4? The spurious KERN_ERR noise
>> in dmesg is causing a lot fo spurious fail in our (very recently put into
>> place) i915 CI system.
> Rafael, ping.
Well, so I'm not sure about this one.
And the question is ->
>>>> ---
>>>> drivers/base/power/main.c | 7 +++++--
>>>> drivers/pci/pci-driver.c | 2 +-
>>>> include/linux/pm.h | 11 +++++++++--
>>>> 3 files changed, 15 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
>>>> index 1710c26..39d2090 100644
>>>> --- a/drivers/base/power/main.c
>>>> +++ b/drivers/base/power/main.c
>>>> @@ -1679,9 +1679,12 @@ int dpm_suspend_start(pm_message_t state)
>>>> }
>>>> EXPORT_SYMBOL_GPL(dpm_suspend_start);
>>>>
>>>> -void __suspend_report_result(const char *function, void *fn, int ret)
>>>> +void __suspend_report_result(const char *function, void *fn, int ret,
>>>> + bool runtime_pm)
>>>> {
>>>> - if (ret)
>>>> + if (runtime_pm && (ret == -EBUSY || ret == -EAGAIN))
>>>> + printk(KERN_DEBUG "%s(): %pF returns %d\n", function, fn, ret);
>>>> + else if (ret)
>>>> printk(KERN_ERR "%s(): %pF returns %d\n", function, fn, ret);
>>>> }
-> why you are adding overhead to this function, instead of -->
>>>> EXPORT_SYMBOL_GPL(__suspend_report_result);
>>>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>>>> index 108a311..9569572 100644
>>>> --- a/drivers/pci/pci-driver.c
>>>> +++ b/drivers/pci/pci-driver.c
>>>> @@ -1142,7 +1142,7 @@ static int pci_pm_runtime_suspend(struct device *dev)
>>>> pci_dev->state_saved = false;
>>>> pci_dev->no_d3cold = false;
>>>> error = pm->runtime_suspend(dev);
>>>> - suspend_report_result(pm->runtime_suspend, error);
>>>> + rpm_suspend_report_result(pm->runtime_suspend, error);
--> replacing the suspend_report_result() above with a direct printk()
in the if (error) block below.
Surely, suspend_report_result() was not designed with runtime PM in mind
and it was a mistake to use it here. It just seemed to do the right
thing, but it clearly doesn't.
>>>> if (error)
>>>> return error;
>>>> if (!pci_dev->d3cold_allowed)
>>>> diff --git a/include/linux/pm.h b/include/linux/pm.h
>>>> index 35d599e..54f37e3 100644
>>>> --- a/include/linux/pm.h
>>>> +++ b/include/linux/pm.h
>>>> @@ -702,11 +702,17 @@ extern int dpm_suspend_late(pm_message_t state);
>>>> extern int dpm_suspend(pm_message_t state);
>>>> extern int dpm_prepare(pm_message_t state);
>>>>
>>>> -extern void __suspend_report_result(const char *function, void *fn, int ret);
>>>> +extern void __suspend_report_result(const char *function, void *fn, int ret,
>>>> + bool runtime_pm);
>>>>
>>>> #define suspend_report_result(fn, ret) \
>>>> do { \
>>>> - __suspend_report_result(__func__, fn, ret); \
>>>> + __suspend_report_result(__func__, fn, ret, false); \
>>>> + } while (0)
>>>> +
>>>> +#define rpm_suspend_report_result(fn, ret) \
>>>> + do { \
>>>> + __suspend_report_result(__func__, fn, ret, true); \
>>>> } while (0)
>>>>
>>>> extern int device_pm_wait_for_dev(struct device *sub, struct device *dev);
>>>> @@ -744,6 +750,7 @@ static inline int dpm_suspend_start(pm_message_t state)
>>>> }
>>>>
>>>> #define suspend_report_result(fn, ret) do {} while (0)
>>>> +#define rpm_suspend_report_result(fn, ret) do {} while (0)
>>>>
>>>> static inline int device_pm_wait_for_dev(struct device *a, struct device *b)
>>>> {
BTW, if you're changing PM code, it is good to CC linux-pm too (now
done) and if you're changing PCI code, it is mandatory to CC linux-pci
and the PCI maintainer (now done too).
Thanks,
Rafael
More information about the Intel-gfx
mailing list