[Intel-gfx] [PATCH] drm/i915: Add intel_pcode_probe
Rodrigo Vivi
rodrigo.vivi at intel.com
Fri Aug 18 12:59:41 UTC 2023
On Fri, Aug 18, 2023 at 11:32:27AM +0530, Sundaresan, Sujaritha wrote:
>
> On 8/18/2023 11:30 AM, Gupta, Anshuman wrote:
> >
> > > -----Original Message-----
> > > From: Intel-gfx <intel-gfx-bounces at lists.freedesktop.org> On Behalf Of
> > > Sujaritha Sundaresan
> > > Sent: Friday, August 18, 2023 8:16 AM
> > > To: intel-gfx at lists.freedesktop.org
> > > Subject: [Intel-gfx] [PATCH] drm/i915: Add intel_pcode_probe
> > >
> > > Added intel_pcode_probe, promoted wait for lmem init and intel_pcode_init
> > > prior to mmio_probe during load, so that GT registers can be accessed only
> > > after this, else MCA is observed.
> > >
> > > Signed-off-by: Sujaritha Sundaresan <sujaritha.sundaresan at intel.com>
> > Both DG1 and DG2 crashed during i915_pci_probe.
> > BAT is failing.
> > Thanks,
> > Anshuman Gupta.
>
> Hi Anshuman,
>
> Yes I'm currently looking into it.
>
> Thanks,
>
> Suja
>
> > > ---
> > > drivers/gpu/drm/i915/i915_driver.c | 37 ++++++++++++++++++++++++-----
> > > drivers/gpu/drm/i915/intel_uncore.c | 12 ----------
> > > 2 files changed, 31 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_driver.c
> > > b/drivers/gpu/drm/i915/i915_driver.c
> > > index f8dbee7a5af7..92cafceaf447 100644
> > > --- a/drivers/gpu/drm/i915/i915_driver.c
> > > +++ b/drivers/gpu/drm/i915/i915_driver.c
> > > @@ -93,6 +93,7 @@
> > > #include "i915_memcpy.h"
> > > #include "i915_perf.h"
> > > #include "i915_query.h"
> > > +#include "i915_reg.h"
> > > #include "i915_suspend.h"
> > > #include "i915_switcheroo.h"
> > > #include "i915_sysfs.h"
> > > @@ -436,6 +437,32 @@ static int i915_pcode_init(struct drm_i915_private
> > > *i915)
> > > return 0;
> > > }
> > >
> > > +static int intel_pcode_probe(struct drm_i915_private *i915) {
> > > + struct intel_uncore *uncore;
> > > + int ret;
> > > +
> > > + /*
> > > + * The boot firmware initializes local memory and assesses its health.
> > > + * If memory training fails, the punit will have been instructed to
> > > + * keep the GT powered down; we won't be able to communicate
> > > with it
> > > + * and we should not continue with driver initialization.
> > > + */
> > > + if (IS_DGFX(i915) &&
> > > + !(__raw_uncore_read32(uncore, GU_CNTL) & LMEM_INIT))
> > > {
> > > + drm_err(&i915->drm, "LMEM not initialized by firmware\n");
> > > + return -ENODEV;
> > > + }
> > > +
> > > + /*
> > > + * Driver handshakes with pcode via mailbox command to know that
> > > SoC
> > > + * initialization is complete before proceeding further
> > > + */
> > > + ret = i915_pcode_init(i915);
> > > +
> > > + return ret;
> > > +}
> > > +
> > > /**
> > > * i915_driver_hw_probe - setup state requiring device access
> > > * @dev_priv: device private
> > > @@ -547,10 +574,6 @@ static int i915_driver_hw_probe(struct
> > > drm_i915_private *dev_priv)
> > >
> > > intel_opregion_setup(dev_priv);
> > >
> > > - ret = i915_pcode_init(dev_priv);
> > > - if (ret)
> > > - goto err_opregion;
> > > -
> > > /*
> > > * Fill the dram structure to get the system dram info. This will be
> > > * used for memory latency calculation.
> > > @@ -561,8 +584,6 @@ static int i915_driver_hw_probe(struct
> > > drm_i915_private *dev_priv)
> > >
> > > return 0;
> > >
> > > -err_opregion:
> > > - intel_opregion_cleanup(dev_priv);
> > > err_msi:
> > > if (pdev->msi_enabled)
> > > pci_disable_msi(pdev);
> > > @@ -778,6 +799,10 @@ int i915_driver_probe(struct pci_dev *pdev, const
> > > struct pci_device_id *ent)
> > > if (ret < 0)
> > > goto out_runtime_pm_put;
> > >
> > > + ret = intel_pcode_probe(i915);
> > > + if (ret)
> > > + goto out_tiles_cleanup;
> > > +
> > > ret = i915_driver_mmio_probe(i915);
chicken-egg problem here?!
I don't believe this could ever work. You need the MMIO space to be able
to communicate with PCODE mailbox and check the lmem init, no?!
I believe the bug is that PCODE check should come before the LMEM_INIT
check.
LMEM won't be ready before pcode state that everything was ready for
the lmem access. And on your code pcode ready check is still after
the lmem.
Cc: Aravind Iddamsetty <aravind.iddamsetty at intel.com>
who was recently raising that we had an order problem there.
> > > if (ret < 0)
> > > goto out_tiles_cleanup;
> > > diff --git a/drivers/gpu/drm/i915/intel_uncore.c
> > > b/drivers/gpu/drm/i915/intel_uncore.c
> > > index dfefad5a5fec..4a353d4adf86 100644
> > > --- a/drivers/gpu/drm/i915/intel_uncore.c
> > > +++ b/drivers/gpu/drm/i915/intel_uncore.c
> > > @@ -2658,18 +2658,6 @@ int intel_uncore_init_mmio(struct intel_uncore
> > > *uncore)
> > > if (ret)
> > > return ret;
> > >
> > > - /*
> > > - * The boot firmware initializes local memory and assesses its health.
> > > - * If memory training fails, the punit will have been instructed to
> > > - * keep the GT powered down; we won't be able to communicate
> > > with it
> > > - * and we should not continue with driver initialization.
> > > - */
> > > - if (IS_DGFX(i915) &&
> > > - !(__raw_uncore_read32(uncore, GU_CNTL) & LMEM_INIT)) {
> > > - drm_err(&i915->drm, "LMEM not initialized by firmware\n");
> > > - return -ENODEV;
> > > - }
> > > -
> > > if (GRAPHICS_VER(i915) > 5 && !intel_vgpu_active(i915))
> > > uncore->flags |= UNCORE_HAS_FORCEWAKE;
> > >
> > > --
> > > 2.41.0
More information about the Intel-gfx
mailing list