[Intel-gfx] [PATCH 1/7] drm/i915: Disable preemption and sleeping while using the punit sideband

Chris Wilson chris at chris-wilson.co.uk
Wed Jan 10 14:02:01 UTC 2018


Quoting Mika Kuoppala (2018-01-10 13:45:27)
> Hans de Goede <hdegoede at redhat.com> writes:
> 
> > Hi,
> >
> > On 10-01-18 13:55, Chris Wilson wrote:
> >> While we talk to the punit over its sideband, we need to prevent the cpu
> >> from sleeping in order to prevent a potential machine hang.
> >> 
> >> Note that by itself, it appears that pm_qos_update_request (via
> >> intel_idle) doesn't provide a sufficient barrier to ensure that all cores
> >> are indeed awake (out of Cstate) and that the package is awake. To do so,
> >> we need to supplement the pm_qos with a manual ping on_each_cpu.
> >> 
> >> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=109051
> >> References: https://bugs.freedesktop.org/show_bug.cgi?id=102657
> >> References: https://bugzilla.kernel.org/show_bug.cgi?id=195255
> >> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >> Cc: Mika Kuoppala <mika.kuoppala at linux.intel.com>
> >> Cc: Hans de Goede <hdegoede at redhat.com>
> >
> > Interesting, I've added similar pm_qos code in
> > drivers/i2c/busses/i2c-designware-baytrail.c quite a while ago because
> > the CPU transitioning to a higher C-state while accessing the i2c bus to
> > the pmic (if it is shared) also causes the SoC to hang.
> >
> > I could reproduce this quite easily by doing "i2cdump" on the pmic,
> > usually the system would hang within one or two i2cdump calls.
> >
> > Note IIRC this was on CHT.
> >
> > I see that you also block any pmic-i2c bus accesses while doing
> > punit access by calling iosf_mbi_punit_acquire();
> >
> > Maybe we need to move the pm_qos stuff out of
> > drivers/i2c/busses/i2c-designware-baytrail.c
> >
> > And into iosf_mbi_punit_acquire? The i2c-designware-baytrail.c
> > does its own pm_qos dance directly after calling
> > iosf_mbi_punit_acquire / before calling iosf_mbi_punit_release();
> >
> > Note the i2c-designware-baytrail.c version lacks the ping, but it
> > should probably have it too.
> 
> The ping, if it is deemed worthy, should find its way into intel_idle
> parts.

It's definitely the missing piece in the puzzle. Without the
on_each_cpu(ping), the machine hangs remain a recurring nightmare.
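
For illustration only, the sequence under discussion can be modelled in
userspace roughly as below. This is a sketch, not the patch itself: every
model_* name, MODEL_NR_CPUS and MODEL_QOS_DEFAULT are hypothetical stand-ins
for the kernel's pm_qos_update_request(), on_each_cpu(),
iosf_mbi_punit_acquire() and iosf_mbi_punit_release(), and the ordering
(qos update directly after acquire, restore directly before release) is
assumed from the description of the i2c-designware-baytrail.c code above.

```c
#include <stddef.h>
#include <string.h>

/* Userspace model: nothing here touches real hardware or kernel APIs. */
#define MODEL_NR_CPUS     4      /* hypothetical CPU count */
#define MODEL_QOS_DEFAULT (-1)   /* stand-in for "no latency constraint" */

static char model_log[256];      /* records the order of operations */
static int model_qos_latency = MODEL_QOS_DEFAULT;
static int model_pings;          /* number of CPUs that ran ping() */

static void model_trace(const char *event)
{
	strncat(model_log, event, sizeof(model_log) - strlen(model_log) - 1);
}

/* Stand-in for pm_qos_update_request(): 0 means "stay out of deep C-states". */
static void model_pm_qos_update(int latency_us)
{
	model_qos_latency = latency_us;
	model_trace(latency_us == 0 ? "qos=0;" : "qos=default;");
}

/* The callback does no work; forcing every CPU to run it is the point. */
static void ping(void *info)
{
	(void)info;
	model_pings++;
}

/* Stand-in for on_each_cpu(): run func once per CPU, then return. */
static void model_on_each_cpu(void (*func)(void *), void *info)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		func(info);
	model_trace("ping;");
}

static void model_punit_get(void)
{
	model_trace("acquire;");          /* iosf_mbi_punit_acquire() */
	model_pm_qos_update(0);           /* forbid sleeping */
	model_on_each_cpu(ping, NULL);    /* make sure every core is awake */
}

static void model_punit_put(void)
{
	model_pm_qos_update(MODEL_QOS_DEFAULT); /* allow sleeping again */
	model_trace("release;");          /* iosf_mbi_punit_release() */
}

/* One sideband access, bracketed by get/put. */
static void model_punit_access(void)
{
	model_punit_get();
	model_trace("access;");
	model_punit_put();
}
```

Calling model_punit_access() once leaves the order of operations in
model_log, which makes it easy to see that the ping lands after the qos
update and before the sideband access.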

However, the max test runtime so far with these patches has been 12h on
one j1900, not enough to be conclusive.
-Chris

