[Intel-xe] [PATCH V2] drm/xe: make GT sysfs init return void

Dixit, Ashutosh ashutosh.dixit at intel.com
Tue Jul 11 15:23:32 UTC 2023


On Tue, 11 Jul 2023 03:42:11 -0700, Andi Shyti wrote:
>

Hi Andi,

> On Wed, Jul 05, 2023 at 06:37:46PM +0200, Nirmoy Das wrote:
> > Hi Ashutosh,
> >
> > On 7/5/2023 5:47 PM, Dixit, Ashutosh wrote:
> > > On Wed, 05 Jul 2023 08:39:20 -0700, Nirmoy Das wrote:
> > > > Hi Ashutosh,
> > > >
> > > > On 7/5/2023 4:06 PM, Dixit, Ashutosh wrote:
> > > > > On Wed, 05 Jul 2023 01:44:03 -0700, Tejas Upadhyay wrote:
> > > > > > Currently return from xe_gt_sysfs_init() is ignored
> > > > > > and also a failure in xe_gt_sysfs_init() isn't fatal
> > > > > > so make it return void.
> > > > > But why is the failure not fatal? I really don't understand the concept of
> > > > > these non-fatal failures. Do we really want to say the device is up if
> > > > > sysfs initialization has failed for some reason and people are unable to
> > > > > see card freq's e.g.? This was done in i915 but do we really want to repeat
> > > > > this for xe? IMO the simplest thing to do would be to fail the probe unless
> > > > > ALL required/intended functionality is clearly up.
> > > >
> > > > I agree with the concern but the situation is different with a graphics
> > > > driver.
> > > >
> > > > If we return error on probe, (if I am not wrong) a user will have no way to
> > > > interact with the system other than ssh. We should ignore non-fatal errors
> > > > and let the driver load so a user can have something to work with (maybe
> > > > report a bug :) )
> > > Hmm, good point. Agreed :)
> > >
> > > This way though only display is critical and everything else non-critical?
> >
> > Yes, that would be wrong, I am not saying that. We do return error during
> > the probe at multiple locations,
> >
> > I believe we can prioritize system usability by considering this specific
> > error as non-critical. Although those sysfs files are important,
>
> I have been one of the supporters of the non fatal failures in
> sysfs for a usability reason. A big warning printed should be
> more than enough while the driver can still be up and running.

Yes ok, that was the reason I acked the patch.

>
> Besides, I believe that if sysfs fails, then most probably the
> system has something wrong.

Afaiu sysfs creation failure can only happen in extreme low memory
conditions. But my conclusion from that is exactly the opposite. If that is
the case, why make exceptions in the code and split failures into fatal and
non-fatal? Why not consider all failures as fatal and implement identical
error handling in all cases? If sysfs is failing, something else would fail
too.

Not having display is fatal for someone who cannot ssh into the system. Not
having sysfs is fatal for someone who wants to do performance analysis and
wants to see GPU freq's.
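
To make the two policies concrete, here is a rough userspace sketch. It is
not the actual xe code: struct fake_dev, fake_sysfs_init(), probe_lenient()
and probe_strict() are all hypothetical names invented for illustration, and
the -ENOMEM failure is simulated with a flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a device; not a real xe structure. */
struct fake_dev {
	bool sysfs_fails;	/* simulate sysfs creation failing with -ENOMEM */
	bool sysfs_ready;	/* set once the (fake) sysfs files exist */
};

/* Hypothetical init step that can fail, standing in for sysfs setup. */
static int fake_sysfs_init(struct fake_dev *dev)
{
	if (dev->sysfs_fails)
		return -ENOMEM;
	dev->sysfs_ready = true;
	return 0;
}

/* Policy A: sysfs failure is non-fatal, so warn and let probe succeed. */
static int probe_lenient(struct fake_dev *dev)
{
	int err = fake_sysfs_init(dev);

	if (err)
		fprintf(stderr, "warning: sysfs init failed (%d), continuing\n", err);
	return 0;
}

/* Policy B: every failure is fatal, so the error is simply propagated. */
static int probe_strict(struct fake_dev *dev)
{
	int err = fake_sysfs_init(dev);

	if (err)
		return err;
	return 0;
}
```

With policy A the probe reports success while sysfs_ready stays false, so
the device is "up" without its sysfs files. With policy B the caller sees
-ENOMEM and unwinds the same way it would for any other failed step.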

Thanks.
--
Ashutosh

