`AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y` causes AMDGPU to fail on Ryzen: amdgpu: SME is not compatible with RAVEN

Tom Lendacky thomas.lendacky at amd.com
Wed Oct 6 14:01:56 UTC 2021


On 10/6/21 8:23 AM, Alex Deucher wrote:
> On Wed, Oct 6, 2021 at 5:42 AM Borislav Petkov <bp at alien8.de> wrote:
>>
>> On Tue, Oct 05, 2021 at 10:48:15AM -0400, Alex Deucher wrote:
>>> It's not incompatible per se, but SME requires the IOMMU be enabled
>>> because the C bit used for encryption is beyond the dma_mask of most
>>> devices.  If the C bit is not set, the en/decryption for DMA doesn't
>>> occur.  So you need IOMMU to be enabled in remapping mode to use SME
>>> with most devices.  Raven has further requirements in that it requires
>>> IOMMUv2 functionality to support some features which currently use a
>>> direct mapping in the IOMMU and hence the C bit is not properly
>>> handled.
>>
>> So lemme ask you this: do Raven-containing systems exist out there which
>> don't have IOMMUv2 functionality and which can cause boot failures when
>> SME is enabled in the kernel .config?
> 
> There could be some OEM systems that disable the IOMMU on the platform
> and don't provide a switch in the BIOS to enable it.  The GPU driver
> will still work in that case, it will just not be able to enable KFD
> support for ROCm compute.  SME won't work for most devices in that
> case however since most devices have a DMA mask too small to handle
> the C bit for encryption.  SME should be dependent on IOMMU being
> enabled.

That's not completely true. If the IOMMU is not enabled (off or in
passthrough mode), then the DMA API will check the DMA mask and use
SWIOTLB to bounce the DMA if the device doesn't support DMA at the
position where the C-bit is located (see force_dma_unencrypted() in
arch/x86/mm/mem_encrypt.c).

To avoid bounce buffering, though, commit 2cc13bb4f59f was introduced to 
disable passthrough mode when SME is active (unless iommu=pt was 
explicitly specified).
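
As a rough illustration of the two points above (a simplified model, not
kernel code): the sketch below decides whether DMA has to be bounced
through SWIOTLB and which IOMMU default is picked when SME is active. All
function names are hypothetical, and the C-bit position of 47 is an
assumption for illustration only; real hardware reports it via CPUID.

```python
# Simplified model of the behaviour described above; not kernel code.
# All names are hypothetical, and C_BIT = 47 is an assumed value.

C_BIT = 47


def dma_bit_mask(bits: int) -> int:
    """Mask with the low `bits` bits set, like the kernel's DMA_BIT_MASK()."""
    return (1 << bits) - 1


def needs_swiotlb_bounce(dev_dma_mask: int, iommu_remapping: bool,
                         c_bit: int = C_BIT) -> bool:
    """True if DMA must be bounced through unencrypted SWIOTLB buffers."""
    if iommu_remapping:
        # The IOMMU translates device addresses, so the device never has
        # to emit the C-bit itself.
        return False
    # IOMMU off or in passthrough: the device must be able to address
    # the C-bit, otherwise the DMA is bounced.
    return dev_dma_mask <= dma_bit_mask(c_bit)


def default_passthrough(sme_active: bool, config_passthrough: bool,
                        cmdline_iommu_pt: bool) -> bool:
    """Model of the default chosen after commit 2cc13bb4f59f."""
    if cmdline_iommu_pt:
        return True   # an explicit iommu=pt is honoured
    if sme_active:
        return False  # SME active: default to remapping instead
    return config_passthrough
```

Under these assumptions, a device with a 32-bit DMA mask bounces whenever
the IOMMU is off or in passthrough mode, while a device with a 64-bit
mask does not.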

Thanks,
Tom

> 
>>
>> IOW, can we handle this at boot time properly, i.e., disable SME if we
>> detect Raven or IOMMUv2 support is missing?
>>
>> If not, then we really will have to change the default.
> 
> I'm not an SME expert, but I thought that that was already the case.
> We just added the error condition in the GPU driver to prevent the
> driver from loading when the user forced SME on.  IIRC, there were
> users that cared more about SME than graphics support.
> 
> Alex
> 
>>
>> Thx.
>>
>> --
>> Regards/Gruss,
>>      Boris.
>>
>> https://people.kernel.org/tglx/notes-about-netiquette

