[Intel-gfx] [PATCH] drm/i915: Add intel_pcode_probe

Sundaresan, Sujaritha sujaritha.sundaresan at intel.com
Mon Aug 21 12:52:09 UTC 2023


On 8/18/2023 6:29 PM, Rodrigo Vivi wrote:
> On Fri, Aug 18, 2023 at 11:32:27AM +0530, Sundaresan, Sujaritha wrote:
>> On 8/18/2023 11:30 AM, Gupta, Anshuman wrote:
>>>> -----Original Message-----
>>>> From: Intel-gfx <intel-gfx-bounces at lists.freedesktop.org> On Behalf Of
>>>> Sujaritha Sundaresan
>>>> Sent: Friday, August 18, 2023 8:16 AM
>>>> To: intel-gfx at lists.freedesktop.org
>>>> Subject: [Intel-gfx] [PATCH] drm/i915: Add intel_pcode_probe
>>>>
>>>> Added intel_pcode_probe, promoted wait for lmem init and intel_pcode_init
>>>> prior to mmio_probe during load, so that GT registers can be accessed only
>>>> after this, else MCA is observed.
>>>>
>>>> Signed-off-by: Sujaritha Sundaresan <sujaritha.sundaresan at intel.com>
>>> Both DG1 and DG2 crashed during i915_pci_probe.
>>> BAT is failing.
>>> Thanks,
>>> Anshuman Gupta.
>> Hi Anshuman,
>>
>> Yes I'm currently looking into it.
>>
>> Thanks,
>>
>> Suja
>>
>>>> ---
>>>>    drivers/gpu/drm/i915/i915_driver.c  | 37 ++++++++++++++++++++++++-----
>>>> drivers/gpu/drm/i915/intel_uncore.c | 12 ----------
>>>>    2 files changed, 31 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_driver.c
>>>> b/drivers/gpu/drm/i915/i915_driver.c
>>>> index f8dbee7a5af7..92cafceaf447 100644
>>>> --- a/drivers/gpu/drm/i915/i915_driver.c
>>>> +++ b/drivers/gpu/drm/i915/i915_driver.c
>>>> @@ -93,6 +93,7 @@
>>>>    #include "i915_memcpy.h"
>>>>    #include "i915_perf.h"
>>>>    #include "i915_query.h"
>>>> +#include "i915_reg.h"
>>>>    #include "i915_suspend.h"
>>>>    #include "i915_switcheroo.h"
>>>>    #include "i915_sysfs.h"
>>>> @@ -436,6 +437,32 @@ static int i915_pcode_init(struct drm_i915_private
>>>> *i915)
>>>>    	return 0;
>>>>    }
>>>>
>>>> +static int intel_pcode_probe(struct drm_i915_private *i915) {
>>>> +	struct intel_uncore *uncore;
>>>> +	int ret;
>>>> +
>>>> +	/*
>>>> +	 * The boot firmware initializes local memory and assesses its health.
>>>> +	 * If memory training fails, the punit will have been instructed to
>>>> +	 * keep the GT powered down; we won't be able to communicate
>>>> with it
>>>> +	 * and we should not continue with driver initialization.
>>>> +	 */
>>>> +	if (IS_DGFX(i915) &&
>>>> +		!(__raw_uncore_read32(uncore, GU_CNTL) & LMEM_INIT))
>>>> {
>>>> +		drm_err(&i915->drm, "LMEM not initialized by firmware\n");
>>>> +		return -ENODEV;
>>>> +	}
>>>> +
>>>> +	/*
>>>> +	 * Driver handshakes with pcode via mailbox command to know that
>>>> SoC
>>>> +	 * initialization is complete before proceeding further
>>>> +	 */
>>>> +	ret = i915_pcode_init(i915);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>>    /**
>>>>     * i915_driver_hw_probe - setup state requiring device access
>>>>     * @dev_priv: device private
>>>> @@ -547,10 +574,6 @@ static int i915_driver_hw_probe(struct
>>>> drm_i915_private *dev_priv)
>>>>
>>>>    	intel_opregion_setup(dev_priv);
>>>>
>>>> -	ret = i915_pcode_init(dev_priv);
>>>> -	if (ret)
>>>> -		goto err_opregion;
>>>> -
>>>>    	/*
>>>>    	 * Fill the dram structure to get the system dram info. This will be
>>>>    	 * used for memory latency calculation.
>>>> @@ -561,8 +584,6 @@ static int i915_driver_hw_probe(struct
>>>> drm_i915_private *dev_priv)
>>>>
>>>>    	return 0;
>>>>
>>>> -err_opregion:
>>>> -	intel_opregion_cleanup(dev_priv);
>>>>    err_msi:
>>>>    	if (pdev->msi_enabled)
>>>>    		pci_disable_msi(pdev);
>>>> @@ -778,6 +799,10 @@ int i915_driver_probe(struct pci_dev *pdev, const
>>>> struct pci_device_id *ent)
>>>>    	if (ret < 0)
>>>>    		goto out_runtime_pm_put;
>>>>
>>>> +	ret = intel_pcode_probe(i915);
>>>> +	if (ret)
>>>> +		goto out_tiles_cleanup;
>>>> +
>>>>    	ret = i915_driver_mmio_probe(i915);
> chicken-egg problem here?!
>
> I don't believe this could ever work. You need the MMIO space to be able
> to communicate with PCODE mailbox and check the lmem init, no?!
>
> I believe the bug is that PCODE check should come before the LMEM_INIT
> check.
>
> LMEM won't be ready before pcode state that everything was ready for
> the lmem access. And on your code pcode ready check is still after
> the lmem.
>
> Cc: Aravind Iddamsetty <aravind.iddamsetty at intel.com>
>
> who was recently raising that we had an order problem there.

Yes looks like there is definitely an ordering issue.

Will look into this more.

>
>>>>    	if (ret < 0)
>>>>    		goto out_tiles_cleanup;
>>>> diff --git a/drivers/gpu/drm/i915/intel_uncore.c
>>>> b/drivers/gpu/drm/i915/intel_uncore.c
>>>> index dfefad5a5fec..4a353d4adf86 100644
>>>> --- a/drivers/gpu/drm/i915/intel_uncore.c
>>>> +++ b/drivers/gpu/drm/i915/intel_uncore.c
>>>> @@ -2658,18 +2658,6 @@ int intel_uncore_init_mmio(struct intel_uncore
>>>> *uncore)
>>>>    	if (ret)
>>>>    		return ret;
>>>>
>>>> -	/*
>>>> -	 * The boot firmware initializes local memory and assesses its health.
>>>> -	 * If memory training fails, the punit will have been instructed to
>>>> -	 * keep the GT powered down; we won't be able to communicate
>>>> with it
>>>> -	 * and we should not continue with driver initialization.
>>>> -	 */
>>>> -	if (IS_DGFX(i915) &&
>>>> -	    !(__raw_uncore_read32(uncore, GU_CNTL) & LMEM_INIT)) {
>>>> -		drm_err(&i915->drm, "LMEM not initialized by firmware\n");
>>>> -		return -ENODEV;
>>>> -	}
>>>> -
>>>>    	if (GRAPHICS_VER(i915) > 5 && !intel_vgpu_active(i915))
>>>>    		uncore->flags |= UNCORE_HAS_FORCEWAKE;
>>>>
>>>> --
>>>> 2.41.0


More information about the Intel-gfx mailing list