[PATCH 0/2] Add configfs support for survivability mode
Rodrigo Vivi
rodrigo.vivi at intel.com
Mon Mar 31 20:19:28 UTC 2025
On Thu, Mar 27, 2025 at 09:40:39AM -0500, Lucas De Marchi wrote:
> On Thu, Mar 27, 2025 at 12:12:00PM +0530, Riana Tauro wrote:
> > This series proposes to expose attributes via xe configfs
> > subsystem. Xe registers a configfs subsystem named 'xe'.
> > Userspace can then create directories for the devices they
> > want to configure and set appropriate attributes
> >
> > This is done by
> >
> > mount -t configfs none /config
> > mkdir /config/xe/0000:03:00.0
> >
>
> If we need a new version or to document anywhere in our docs, I'd add a
> comment here:
>
> # If driver is already bound, unbind it as this configuration
> # applies only when probing it
>
> > echo 0000:03:00.0 > /sys/bus/pci/drivers/xe/unbind
> > echo 1 > /sys/kernel/config/xe/0000:03:00.0/survivability_mode
> > echo 0000:03:00.0 > /sys/bus/pci/drivers/xe/bind
> >
> > This is an alternative to introducing module param that causes
> > all the connected and supported GPU cards to enter survivability mode.
> > Manually entering survivability mode is useful when pcode does not
> > report failure, in field repairs and validation
> >
> > Rev2: use config_groups (Lucas)
>
> Awesome. I have some other work pending that will make use of
> it. I will play with these patches soon.
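For reference, the flow quoted above can be sketched as a single shell helper.
This is only a sketch: the function name xe_enter_survivability is
hypothetical, and the configfs mount point and driver directory are passed as
parameters because the cover letter uses both /config and
/sys/kernel/config. The defaults below are assumptions, not something the
series mandates.

```shell
# Sketch only: xe_enter_survivability is a hypothetical helper name.
# $1: PCI address of the device (e.g. 0000:03:00.0)
# $2: configfs mount point (assumed default: /sys/kernel/config)
# $3: xe PCI driver directory (assumed default: /sys/bus/pci/drivers/xe)
xe_enter_survivability() {
    bdf="$1"
    cfg="${2:-/sys/kernel/config}"
    drv="${3:-/sys/bus/pci/drivers/xe}"

    # Create the per-device directory under the 'xe' configfs subsystem.
    mkdir -p "$cfg/xe/$bdf" || return 1

    # The setting only applies at probe time, so unbind first if bound.
    # A bound device shows up as an entry named after its PCI address.
    if [ -e "$drv/$bdf" ]; then
        echo "$bdf" > "$drv/unbind" || return 1
    fi

    # Request survivability mode, then re-probe the device.
    echo 1 > "$cfg/xe/$bdf/survivability_mode" || return 1
    echo "$bdf" > "$drv/bind"
}
```

Run as root, for example: xe_enter_survivability 0000:03:00.0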
I really liked this new flow and was giving it a try here just now.
However, it didn't work. It didn't take me into survivability mode,
and I also cannot unload xe after creating these configfs entries:
sudo remove /sys/kernel/config/xe/0000\:0*
rm: cannot remove '0000:00:02.0/survivability_mode': Operation not permitted
rm: cannot remove '0000:03:00.0/survivability_mode': Operation not permitted
I tried to unbind and hit the same failure.
Then, with the configfs entries still there, we cannot remove the module:
$ sudo rmmod xe
rmmod: ERROR: Module xe is in use
So, it looks like we have some things to adjust here before we can move further,
but so far things are looking promising indeed.
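
One possible explanation for the failures above, not confirmed in this thread:
configfs does not let userspace unlink attribute files. An item created with
mkdir is destroyed with rmdir on its directory, and the kernel holds a
reference on the owning module while such items exist. Under that assumption,
a cleanup could look like the sketch below. The helper name
xe_configfs_cleanup is hypothetical, and the default mount point is assumed.

```shell
# Sketch only: xe_configfs_cleanup is a hypothetical helper name.
# $1: configfs mount point (assumed default: /sys/kernel/config)
xe_configfs_cleanup() {
    root="${1:-/sys/kernel/config}"
    for dev in "$root"/xe/*/; do
        # Skip the unexpanded glob when no device directories exist.
        [ -d "$dev" ] || continue
        # Use rmdir on the device directory rather than rm on its files:
        # attributes such as survivability_mode cannot be unlinked.
        rmdir "$dev" || return 1
    done
}
```

If the leftover directories are what keeps the module in use, running
xe_configfs_cleanup as root and then rmmod xe should succeed.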
>
> thanks
> Lucas De Marchi