[PATCH v2] drm: Ensure Proper Unload/Reload Order of MEI Modules for i915/Xe Driver

Bommu, Krishnaiah krishnaiah.bommu at intel.com
Thu Sep 12 11:58:37 UTC 2024



> -----Original Message-----
> From: De Marchi, Lucas <lucas.demarchi at intel.com>
> Sent: Wednesday, September 11, 2024 9:49 PM
> To: Bommu, Krishnaiah <krishnaiah.bommu at intel.com>
> Cc: Vivi, Rodrigo <rodrigo.vivi at intel.com>; intel-xe at lists.freedesktop.org; intel-
> gfx at lists.freedesktop.org; Kamil Konieczny <kamil.konieczny at linux.intel.com>;
> Ceraolo Spurio, Daniele <daniele.ceraolospurio at intel.com>; Upadhyay, Tejas
> <tejas.upadhyay at intel.com>; Tvrtko Ursulin <tursulin at ursulin.net>; Joonas
> Lahtinen <joonas.lahtinen at linux.intel.com>; Nikula, Jani
> <jani.nikula at intel.com>; Thomas Hellström
> <thomas.hellstrom at linux.intel.com>; Teres Alexis, Alan Previn
> <alan.previn.teres.alexis at intel.com>; Winkler, Tomas
> <tomas.winkler at intel.com>; Usyskin, Alexander
> <alexander.usyskin at intel.com>; linux-modules at vger.kernel.org; Luis
> Chamberlain <mcgrof at kernel.org>
> Subject: Re: [PATCH v2] drm: Ensure Proper Unload/Reload Order of MEI
> Modules for i915/Xe Driver
> 
> + linux-modules
> + Luis
> 
> On Wed, Sep 11, 2024 at 01:00:47AM GMT, Bommu, Krishnaiah wrote:
> >
> >
> >> -----Original Message-----
> >> From: De Marchi, Lucas <lucas.demarchi at intel.com>
> >> Sent: Tuesday, September 10, 2024 9:13 PM
> >> To: Vivi, Rodrigo <rodrigo.vivi at intel.com>
> >> Cc: Bommu, Krishnaiah <krishnaiah.bommu at intel.com>; intel-
> >> xe at lists.freedesktop.org; intel-gfx at lists.freedesktop.org; Kamil
> >> Konieczny <kamil.konieczny at linux.intel.com>; Ceraolo Spurio, Daniele
> >> <daniele.ceraolospurio at intel.com>; Upadhyay, Tejas
> >> <tejas.upadhyay at intel.com>; Tvrtko Ursulin <tursulin at ursulin.net>;
> >> Joonas Lahtinen <joonas.lahtinen at linux.intel.com>; Nikula, Jani
> >> <jani.nikula at intel.com>; Thomas Hellström
> >> <thomas.hellstrom at linux.intel.com>; Teres Alexis, Alan Previn
> >> <alan.previn.teres.alexis at intel.com>; Winkler, Tomas
> >> <tomas.winkler at intel.com>; Usyskin, Alexander
> >> <alexander.usyskin at intel.com>
> >> Subject: Re: [PATCH v2] drm: Ensure Proper Unload/Reload Order of MEI
> >> Modules for i915/Xe Driver
> >>
> >> On Tue, Sep 10, 2024 at 11:03:30AM GMT, Rodrigo Vivi wrote:
> >> >On Mon, Sep 09, 2024 at 09:33:17AM +0530, Bommu Krishnaiah wrote:
> >> >> This update addresses the unload/reload sequence of MEI modules in
> >> >> relation to the i915/Xe graphics driver. On platforms where the
> >> >> MEI hardware is integrated with the graphics device (e.g.,
> >> >> DG2/BMG), the i915/xe driver is depend on the MEI modules.
> >> >> Conversely, on newer platforms like MTL and LNL, where the MEI
> >> >> hardware is separate, this
> >> dependency does not exist.
> >> >>
> >> >> The changes introduced ensure that MEI modules are unloaded and
> >> >> reloaded in the correct order based on platform-specific
> >> >> dependencies. This is achieved by adding a MODULE_SOFTDEP
> >> >> directive to
> >> the i915 and Xe module code.
> >>
> >>
> >> can you explain what causes the modules to be loaded today? Also, is
> >> this to fix anything related to *loading* order or just unload?
> >>
> >> >>
> >> >> These changes enhance the robustness of MEI module handling across
> >> >> different hardware platforms, ensuring that the i915/Xe driver can
> >> >> be cleanly unloaded and reloaded without issues.
> >> >>
> >> >> v2: updated commit message
> >> >>
> >> >> Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu at intel.com>
> >> >> Cc: Kamil Konieczny <kamil.konieczny at linux.intel.com>
> >> >> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
> >> >> Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> >> >> Cc: Tejas Upadhyay <tejas.upadhyay at intel.com>
> >> >> ---
> >> >>  drivers/gpu/drm/i915/i915_module.c | 2 ++
> >> >>  drivers/gpu/drm/xe/xe_module.c     | 2 ++
> >> >>  2 files changed, 4 insertions(+)
> >> >>
> >> >> diff --git a/drivers/gpu/drm/i915/i915_module.c
> >> >> b/drivers/gpu/drm/i915/i915_module.c
> >> >> index 65acd7bf75d0..2ad079ad35db 100644
> >> >> --- a/drivers/gpu/drm/i915/i915_module.c
> >> >> +++ b/drivers/gpu/drm/i915/i915_module.c
> >> >> @@ -75,6 +75,8 @@ static const struct {  };  static int
> >> >> init_progress;
> >> >>
> >> >> +MODULE_SOFTDEP("pre: mei_gsc_proxy mei_gsc");
> >> >> +
> >> >>  static int __init i915_init(void)  {
> >> >>  	int err, i;
> >> >> diff --git a/drivers/gpu/drm/xe/xe_module.c
> >> >> b/drivers/gpu/drm/xe/xe_module.c index bfc3deebdaa2..5633ea1841b7
> >> >> 100644
> >> >> --- a/drivers/gpu/drm/xe/xe_module.c
> >> >> +++ b/drivers/gpu/drm/xe/xe_module.c
> >> >> @@ -127,6 +127,8 @@ static void xe_call_exit_func(unsigned int i)
> >> >>  	init_funcs[i].exit();
> >> >>  }
> >> >>
> >> >> +MODULE_SOFTDEP("pre: mei_gsc_proxy mei_gsc");
> >> >
> >> >I'm honestly not very comfortable with this.
> >> >
> >> >1. This is not true for every device supported by these modules.
> >> >2. This is not true for every (and the most basic) functionality of these
> drivers.
> >> >
> >> >Shouldn't this be done in the the mei side?
> >>
> >> I don't think it's possible to do from the mei side. Would mei depend
> >> on both xe and i915 (and thus cause both to be loaded regardless of
> >> the platform?). For a runtime dependency like this that depends on
> >> the platform, I think the best way would be a weakdep + either a
> >> request_module() or something else that causes the module to load (is
> >> that what comp_* is doing today?)
> >>
> >> >
> >> >Couldn't at probe we identify the need of them and if needed we
> >> >return -EPROBE to attempt a retry after the mei drivers were probed?
> >>
> >> I'm not sure this is fixing anything for probe. I think we already
> >> wait on the other component to be ready without blocking the rest of the
> driver functionality.
> >>
> >> A weakdep wouldn't cause the module to be loaded where it's not
> >> needed, but need some clarification if this is trying to fix anything load-
> related or just unload.
> >
> >This change is fixing unload.
> >During xe load I am seeing mei_gsc modules was loaded, but not unloaded
> >during the unload xe
> 
> so, first thing: if things are correct in the kernel, we shouldn't need to
> **unload** the module after unbinding the device. Why are we unloading xe
> and the other modules for tests?

While running gta at xe_module_load@reload-no-display I see failure, to address this failure I have this changes, previously I am trying to fix from IGT, but as per igt review suggestion I am trying to fix issue in kernel, 
IGT patch: https://patchwork.freedesktop.org/series/137343/

> >root at DUT6127BMGFRD:/home/gta# lsmod | grep xe ------>>>just after
> >system reboot root at DUT6127BMGFRD:/home/gta#
> >root at DUT6127BMGFRD:/home/gta# lsmod | grep mei
> >mei_hdcp               28672  0
> >mei_pxp                16384  0
> >mei_me                 49152  2
> >mei                   167936  5 mei_hdcp,mei_pxp,mei_me
> >root at DUT6127BMGFRD:/home/gta# lsmod | grep xe
> >root at DUT6127BMGFRD:/home/gta# root at DUT6127BMGFRD:/home/gta#
> modprobe xe
> >root at DUT6127BMGFRD:/home/gta# root at DUT6127BMGFRD:/home/gta#
> lsmod |
> >grep mei
> >mei_gsc_proxy          16384  0
> >mei_gsc                12288  1
> 
> 			       ^ which means there's one user, which
> 			         should be xe
> 
> >mei_hdcp               28672  0
> >mei_pxp                16384  0
> >mei_me                 49152  3 mei_gsc
> >mei                   167936  8 mei_gsc_proxy,mei_gsc,mei_hdcp,mei_pxp,mei_me
> >root at DUT6127BMGFRD:/home/gta#
> >root at DUT6127BMGFRD:/home/gta#
> >root at DUT6127BMGFRD:/home/gta#
> >root at DUT6127BMGFRD:/home/gta# init 3
> >root at DUT6127BMGFRD:/home/gta# echo -n auto >
> >/sys/bus/pci/devices/0000\:03\:00.0/power/control
> >root at DUT6127BMGFRD:/home/gta# echo -n "0000:03:00.0" >
> >/sys/bus/pci/drivers/xe/unbind root at DUT6127BMGFRD:/home/gta#
> modprobe
> >-r xe root at DUT6127BMGFRD:/home/gta#
> root at DUT6127BMGFRD:/home/gta# lsmod
> >| grep xe root at DUT6127BMGFRD:/home/gta# lsmod | grep mei
> >mei_gsc_proxy          16384  0
> >mei_gsc                12288  0
> 
> 			       ^ great, so the refcount went to 0,
> 			         confirming it was xe. It should go to 0
> 				 even before you unload the module,
> 				 when unbind.
> 
> A couple of points:
> 
> 1) why do we care about unloading mei_gsc. Just loading xe
>     again (or even not even unloading it, just unbind/rebind),
>     should still work if the xe <-> mei_gsc integration is done
>     correctly.
> 
> 2) If for some reason we do want to remove the module, then we will
>     need some work in kernel/module/  to start tracking runtime module
>     dependencies, i.e. when one module does a module_get(foo->owner), it
>     would add to a list and output on sysfs together with the holders list.
>     This way you would be able to track the runtime deps and remove them
>     if their refcount went to 0 after removing xe.
> 
> (2) is doable, but previous attempts were not successful [1]. Is  there something
> else to make the simpler solution (1) to work?
> 

Reference why I am doing this changes, please see review comments of this patch https://patchwork.freedesktop.org/series/137343/

Regards,
Krishna.

> thanks
> Lucas De Marchi
> 
> [1] https://lore.kernel.org/linux-
> modules/cover.1652113087.git.mchehab at kernel.org/
> 
> >mei_hdcp               28672  0
> >mei_pxp                16384  0
> >mei_me                 49152  3 mei_gsc
> >mei                   167936  7 mei_gsc_proxy,mei_gsc,mei_hdcp,mei_pxp,mei_me
> >root at DUT6127BMGFRD:/home/gta#
> >
> >Regards,
> >Krishna.
> >
> >>
> >> Lucas De Marchi
> >>
> >> >
> >> >Cc: Alexander Usyskin <alexander.usyskin at intel.com>
> >> >Cc: Tomas Winkler <tomas.winkler at intel.com>
> >> >Cc: Alan Previn <alan.previn.teres.alexis at intel.com>
> >> >Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
> >> >Cc: Lucas De Marchi <lucas.demarchi at intel.com>
> >> >Cc: Thomas Hellström <thomas.hellstrom at linux.intel.com>
> >> >Cc: Jani Nikula <jani.nikula at intel.com>
> >> >Cc: Joonas Lahtinen <joonas.lahtinen at linux.intel.com>
> >> >Cc: Tvrtko Ursulin <tursulin at ursulin.net>
> >> >
> >> >> +
> >> >>  static int __init xe_init(void)
> >> >>  {
> >> >>  	int err, i;
> >> >> --
> >> >> 2.25.1
> >> >>


More information about the Intel-gfx mailing list