[PATCH] drm/amdgpu: Register bad page handler for Aldebaran
Borislav Petkov
bp at alien8.de
Fri May 14 14:38:14 UTC 2021
On Fri, May 14, 2021 at 01:06:33PM +0000, Joshi, Mukul wrote:
> We have RAS functionality in other ASICs that is not dependent on
> CONFIG_X86_MCE_AMD. So, I don't think we would want to do that just
> for one ASIC.
Lemme try again: you said that those errors do get reported through a
deferred interrupt. Which is likely amd_deferred_error_interrupt().
If it is that interrupt and you don't have CONFIG_X86_MCE_AMD enabled,
then you won't get any errors reported and your RAS functionality will
simply sit there inactive.
So if the above is true - something to which I'm still not getting
an answer but maybe one fine day... - so if the above is true, your
RAS functionality *needs* CONFIG_X86_MCE_AMD to be enabled in order to
*actually* function.
So you *must* make your RAS functionality depend on CONFIG_X86_MCE_AMD
- otherwise no deferred interrupts and no errors reported. It is that
simple.
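
To make the dependency concrete, here is a rough userspace sketch, not the
actual amdgpu code. It assumes the errors really do arrive only through the
deferred interrupt. CONFIG_X86_MCE_AMD is the real Kconfig symbol; every
ras_sketch_* name is made up for illustration, and -DCONFIG_X86_MCE_AMD on
the compiler command line stands in for the option being set in .config.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Userspace illustration only. In a kernel build CONFIG_X86_MCE_AMD comes
 * from .config; here, pass -DCONFIG_X86_MCE_AMD to simulate it being set.
 */
#ifdef CONFIG_X86_MCE_AMD
#define RAS_SKETCH_HAVE_DEFERRED_IRQ 1
#else
#define RAS_SKETCH_HAVE_DEFERRED_IRQ 0
#endif

/* Hypothetical state: true once the bad page handler has been hooked up. */
static bool ras_sketch_handler_registered;

/*
 * Hypothetical registration helper. Without the MCE support there is no
 * deferred error interrupt, so a handler would never be called; refuse
 * with -ENODEV instead of pretending that RAS is active.
 */
static int ras_sketch_register_bad_page_handler(void)
{
	if (!RAS_SKETCH_HAVE_DEFERRED_IRQ)
		return -ENODEV;

	ras_sketch_handler_registered = true;
	return 0;
}
```

Built without the define, the helper fails loudly instead of leaving a
handler that never fires. Expressing the same thing as a Kconfig dependency
would catch it at configuration time.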
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette