[Intel-gfx] [PATCH] drm/i915/DG{1, 2}: FIXME Temporary hammer to disable rpm

Wed Sep 14 14:50:38 UTC 2022


> -----Original Message-----
> From: Andi Shyti <andi.shyti at linux.intel.com>
> Sent: Wednesday, September 14, 2022 8:13 PM
> To: Vivi, Rodrigo <rodrigo.vivi at intel.com>
> Cc: Gupta, Anshuman <anshuman.gupta at intel.com>; intel-
> gfx at lists.freedesktop.org; joonas.lahtinen at linux.intel.com; Ewins, Jon
> <jon.ewins at intel.com>; andi.shyti at linux.intel.com; Auld, Matthew
> <matthew.auld at intel.com>
> Subject: Re: [PATCH] drm/i915/DG{1,2}: FIXME Temporary hammer to disable
> rpm
> 
> Hi Anshuman,
> 
> On Wed, Sep 14, 2022 at 10:33:15AM -0400, Rodrigo Vivi wrote:
> > On Wed, Sep 14, 2022 at 07:45:53PM +0530, Anshuman Gupta wrote:
> > > DG1 and DG2 have lmem, and the CPU can access lmem objects via mmap
> > > and via i915's internal i915_gem_object_pin_map() for
> > > i915's own usage. Both of these methods require the GFX PCI endpoint
> > > to stay in D0 for a supported iomem transaction
> > > over the PCI link. (Refer to PCIe specs 5.3.1.4.1)
> > >
> > > TODO:
> > > With respect to i915_gem_object_pin_map(), every caller has to grab
> > > a wakeref if the gem object lies in lmem.
> > >
> > > Until we fix all issues related to runtime PM, we need to keep
> > > runtime PM disabled on both DG1 and DG2.
> > >
> > > Cc: Matthew Auld <matthew.auld at intel.com>
> > > Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > > Signed-off-by: Anshuman Gupta <anshuman.gupta at intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_pci.c | 21 +++++++++++++++++++++
> > >  1 file changed, 21 insertions(+)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> > > index 77e7df21f539..f31d7f5399cc 100644
> > > --- a/drivers/gpu/drm/i915/i915_pci.c
> > > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > > @@ -931,6 +931,26 @@ static const struct intel_device_info dg1_info = {
> > >  		BIT(VCS0) | BIT(VCS2),
> > >  	/* Wa_16011227922 */
> > >  	.__runtime.ppgtt_size = 47,
> > > +
> > > +	/*
> > > +	 *  FIXME: Temporary hammer to disable rpm.
> > > +	 *  As per PCIe specs 5.3.1.4.1, all iomem read write request over a PCIe
> > > +	 *  function will be unsupported in case PCIe endpoint function is in D3.
> > > +	 *  But both DG1/DG2 has a hardware bug that violates the PCIe specs
> 
> /has/have/
> 
> > > +	 *  and supports the iomem read write transaction over PCIe bus despite
> 
> /supports/support/
> 
> > > +	 *  endpoint is D3 state.
> > > +	 *  Due to above H/W bug, we had never observed any issue with i915 runtime
> > > +	 *  PM versus lmem access.
> > > +	 *  But this issue gets discover when PCIe gfx endpoint's upstream
> 
> /gets discover/becomes visible/
> 
> > > +	 *  bridge enters to D3, at this point any lmem read/write access will be
> > > +	 *  returned as unsupported request. But again this issue is not observed
> > > +	 *  on every platform because it has been observed on few host machines
> > > +	 *  DG1/DG2 endpoint's upstream bridge does not binds with pcieport driver.
> 
> /binds/bind/
> 
> > > +	 *  which really disables the PCIe power savings and leaves the bridge to D0
> > > +	 *  state.
> > > +	 *  Let's disable i915 rpm till we fix all known issue with lmem access in D3.
> > > +	 */
> > > +	.has_runtime_pm = 0,
> > >  };
> > >
> > >  static const struct intel_device_info adl_s_info = {
> > > @@ -1076,6 +1096,7 @@ static const struct intel_device_info dg2_info = {
> > >  	XE_LPD_FEATURES,
> > >  	.__runtime.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) |
> > >  			       BIT(TRANSCODER_C) | BIT(TRANSCODER_D),
> > > +	.has_runtime_pm = 0,
> >
> > The FIXME msg can be smaller, but it also needs to be here.
> 
> I actually like the comment; it is very clear and helps in understanding the issue :)
Shall I move the comment to the commit log and keep a smaller comment for both DG1 and DG2?
That way I can address both your comment and Rodrigo's.
Keeping such a big comment in two places would not make sense.
Thanks,
Anshuman Gupta.
> 
> Thanks again for addressing the issue and with the hope to see the proper fix
> soon:
> 
> Reviewed-by: Andi Shyti <andi.shyti at linux.intel.com>
> 
> Thanks,
> Andi
> 
> > With this in place feel free to use:
> >
> > Reviewed-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> >
> > Since the proper solution might take a while, let's protect against this
> > case, regardless of any other ongoing discussion about the force_probe
> > protection.
> >
> >
> > >  	.require_force_probe = 1,
> > >  };
> > >
> > > --
> > > 2.26.2
> > >
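
For reference, the TODO in the commit message (every caller must hold a
wakeref while touching an lmem object) can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: every name
below (mock_dev, mock_rpm_get, mock_rpm_put, mock_lmem_read,
mock_lmem_read_safe) is hypothetical and is not an i915 or PCI core symbol,
and the D0/D3 handling is reduced to a simple reference count.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of these are i915 symbols. */
enum mock_pci_state { MOCK_D0, MOCK_D3 };

struct mock_dev {
	enum mock_pci_state state; /* modelled power state of the endpoint */
	int wakerefs;              /* outstanding wake references */
	uint32_t lmem[4];          /* modelled device-local memory */
};

/* Take a wake reference; the first reference resumes the device to D0. */
static void mock_rpm_get(struct mock_dev *dev)
{
	if (dev->wakerefs++ == 0)
		dev->state = MOCK_D0;
}

/* Drop a wake reference; the last reference lets the device go to D3. */
static void mock_rpm_put(struct mock_dev *dev)
{
	if (--dev->wakerefs == 0)
		dev->state = MOCK_D3;
}

/*
 * Modelled lmem read over the PCI link. It fails when the endpoint is
 * not in D0, standing in for the "unsupported request" case.
 */
static bool mock_lmem_read(const struct mock_dev *dev, size_t idx,
			   uint32_t *out)
{
	if (dev->state != MOCK_D0 || idx >= 4)
		return false;
	*out = dev->lmem[idx];
	return true;
}

/* The pattern from the TODO: hold a wakeref across the whole access. */
static bool mock_lmem_read_safe(struct mock_dev *dev, size_t idx,
				uint32_t *out)
{
	bool ok;

	mock_rpm_get(dev);
	ok = mock_lmem_read(dev, idx, out);
	mock_rpm_put(dev);
	return ok;
}
```

In this model a bare mock_lmem_read() on a suspended device fails, while
mock_lmem_read_safe() succeeds and leaves the device free to suspend again
afterwards. The real driver would need the equivalent bracket around every
lmem mapping user, which is why the patch disables runtime PM in the meantime.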