[PATCH] drm/i915/gt: Continue creating engine sysfs files even after a failure
Jani Nikula
jani.nikula at linux.intel.com
Wed Sep 4 14:34:42 UTC 2024
On Wed, 04 Sep 2024, Andi Shyti <andi.shyti at linux.intel.com> wrote:
> Hi Sima,
>
> On Tue, Aug 27, 2024 at 07:05:05PM +0200, Daniel Vetter wrote:
>> On Mon, Aug 19, 2024 at 01:31:40PM +0200, Andi Shyti wrote:
>> > The i915 driver generates sysfs entries for each engine of the
>> > GPU in /sys/class/drm/cardX/engines/.
>> >
>> > The process is straightforward: we loop over the UABI engines and
>> > for each one, we:
>> >
>> > - Create the object.
>> > - Create basic files.
>> > - If the engine supports timeslicing, create timeslice duration files.
>> > - If the engine supports preemption, create preemption-related files.
>> > - Create default value files.
>> >
>> > Currently, if any of these steps fail, the process stops, and no
>> > further sysfs files are created.
>> >
>> > However, it's not necessary to stop the process on failure.
>> > Instead, we can continue creating the remaining sysfs files for
>> > the other engines. Even if some files fail to be created, the
>> > list of engines can still be retrieved by querying i915.
>> >
>> > Signed-off-by: Andi Shyti <andi.shyti at linux.intel.com>
>>
>> Uh, sysfs is uapi. Either we need it, and it _must_ be there, or it's not
>> needed, and we should delete those files probably.
>>
>> This is different from debugfs, where failures are consistently ignored
>> because that's the conscious design choice Greg made and wants supported.
>> Because debugfs is optional.
>>
>> So please make sure we correctly fail driver load if these don't register.
>> Even better would be if sysfs files are registered atomically as attribute
>> blocks, but that's an entire different can of worms. But that would really
>> clean up this code and essentially put any failure handling onto core
>> driver model and sysfs code.
>
> This comment came after I merged the patch. So far, we have been
> keeping the driver going even if sysfs fails to create, with the
> idea of "if there is something wrong let it go as far as it can
> and fail on its own".
>
> This change is just setting the behavior to what the rest of the
> interfaces are doing, so that either we change them all to fail
> the driver's probe or we have them behaving consistently as they
> are.
>
> Tvrtko, Chris, Rodrigo any opinion from your side? Shall we bail
> out as Sima is suggesting?
Are there any causes for sysfs creation errors that would be acceptable
to ignore? I didn't see any examples. Or is this just speculative?
IMO fail fast and loud. We get enough bug reports where there's some big
backtrace splash copy-pasted on the bug, but the root cause happened
much earlier and was ignored.
BR,
Jani.
--
Jani Nikula, Intel
More information about the Intel-gfx
mailing list