[PATCH 6/9] drm/xe: Move survivability entirely to xe_pci
Lucas De Marchi
lucas.demarchi at intel.com
Fri Feb 21 23:12:17 UTC 2025
On Thu, Feb 20, 2025 at 11:11:30AM +0530, Riana Tauro wrote:
>Hi Lucas
>
>On 2/17/2025 10:58 PM, Lucas De Marchi wrote:
>>On Mon, Feb 17, 2025 at 10:56:22AM +0530, Riana Tauro wrote:
>>>
>>>
>>>On 2/15/2025 2:53 AM, Lucas De Marchi wrote:
>>>>There's an odd split between xe_pci.c and xe_device.c wrt
>>>>xe_survivability: it's initialized by xe_device, but then finalized by
>>>>xe_pci. Move it entirely to the outer layer, xe_pci, so it controls
>>>>the flow entirely.
>>>Hi Lucas
>>>
>>>device_probe_early has other init calls that return error. And
>>>since this occurs only when pcode probe fails, added it there.
>>
>>right, but it's very confusing to have this flow with both xe_pci and
>>xe_device playing a different role on init and fini.
>>
>>>
>>>I hadn't added the fini in the devm_action because of the
>>>pci_set_drvdata.
>>
>>which is now fixed as a prep patch in this series.
>>
>>>
>>>As the remove function is moved to devm_action, IMO it would be better
>>>if survivability_init stays in the err condition of pcode probe,
>>>because if someone decides to move pcode_probe to some other
>>>function, it would be intuitive to move this too.
>>
>>but from entering survivability mode, it would still be after
>>xe_device_probe_**early**().
>>
>>An **early** error in xe_device probe, i.e. a failure in
>>xe_device_probe_early(), means a very fundamental issue with firmware and
>>we'd better enter a mode that allows us to recover from that. If the
>>call to pcode was moved somewhere else after that, it's a mistake that we
>>should fail in CI (btw we need a way to do that in CI).
>
>Currently there is no CI for this as it needs a firmware failure. Even
>if a module param were added in the future, it would still not fail if
>pcode probe was moved.
>
>Maybe we can add a comment on top of the survivability_mode_enable in the pci layer

ok. I'm rewording the comment that is already there to capture that
this change is about failing on early probe.

thanks
Lucas De Marchi
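
For readers following the thread, the flow being argued for can be sketched in plain userspace C. This is only an illustration, not the actual xe code: every `mock_*` name and the `struct mock_device` fields are hypothetical stand-ins, and returning 0 from the outer probe after entering survivability mode is an assumption made for the sketch. The point it shows is that the inner layer only reports the early failure, while the outer (pci) layer alone decides to enter survivability mode.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state; not the real xe structures. */
struct mock_device {
	bool pcode_ok;           /* simulated firmware (pcode) health */
	bool survivability_mode; /* only ever set by the outer (pci) layer */
	bool fully_probed;       /* set when the normal probe path completes */
};

/* Inner layer: reports an early failure but never touches survivability. */
static int mock_device_probe_early(struct mock_device *dev)
{
	if (!dev->pcode_ok)
		return -EIO; /* simulated pcode probe failure */
	return 0;
}

/* Outer layer: the single place that reacts to an early probe failure. */
static int mock_pci_probe(struct mock_device *dev)
{
	int err = mock_device_probe_early(dev);

	if (err) {
		/*
		 * Any failure in early probe is treated as a fundamental
		 * firmware issue, so enter the recovery mode here.
		 * Assumption for this sketch: probe then returns 0.
		 */
		dev->survivability_mode = true;
		return 0;
	}

	dev->fully_probed = true;
	return 0;
}
```

With this split, moving the pcode call out of the early-probe function would silently stop triggering the recovery path, which is the kind of mistake the thread says should be caught in CI.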