Decrypting tt maps in ttm

Christian König christian.koenig at amd.com
Wed Sep 20 10:48:24 UTC 2023



Am 20.09.23 um 09:36 schrieb Thomas Hellström:
> Hi, Zack,
>
> On 9/20/23 05:43, Zack Rusin wrote:
>> On Tue, 2023-09-19 at 09:47 +0200, Christian König wrote:
>>> !! External Email
>>>
>>> Am 19.09.23 um 08:56 schrieb Thomas Hellström:
>>>> On 9/19/23 07:39, Christian König wrote:
>>>>> Am 19.09.23 um 03:26 schrieb Zack Rusin:
>>>>>> On Mon, 2023-09-18 at 16:21 -0400, Alex Deucher wrote:
>>>>>>> !! External Email
>>>>>>>
>>>>>>> On Mon, Sep 18, 2023 at 3:06 PM Thomas Hellström
>>>>>>> <thomas.hellstrom at linux.intel.com> wrote:
>>>>>>>> On 9/18/23 17:52, Zack Rusin wrote:
>>>>>>>>> On Mon, 2023-09-18 at 17:13 +0200, Thomas Hellström wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> On 9/18/23 16:56, Thomas Hellström wrote:
>>>>>>>>>>> Hi Zack, Christian
>>>>>>>>>>>
>>>>>>>>>>> On 9/18/23 13:36, Christian König wrote:
>>>>>>>>>>>> Hi Zack,
>>>>>>>>>>>>
>>>>>>>>>>>> adding Thomas and Daniel.
>>>>>>>>>>>>
>>>>>>>>>>>> I briefly remember that I talked with Thomas and some other
>>>>>>>>>>>> people
>>>>>>>>>>>> about that quite a while ago as well, but I don't fully
>>>>>>>>>>>> remember the
>>>>>>>>>>>> outcome.
>>>>>>>>>>> Found one old thread, but didn't read it:
>>>>>>>>>>>
>>>>>>>>>>> https://lists.freedesktop.org/archives/dri-devel/2019-September/234100.html 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> /Thomas
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> Ugh. Now starting to read that thread I have a vague
>>>>>>>>>> recollection it all
>>>>>>>>>> ended with not supporting mapping any device pages whatsoever
>>>>>>>>>> when SEV
>>>>>>>>>> was enabled, but rather resorting to llvmpipe and VM-local bos.
>>>>>>>>> Hi, Thomas.
>>>>>>>>>
>>>>>>>>> Thanks for finding this! I'd (of course) like to solve it
>>>>>>>>> properly and get
>>>>>>>>> vmwgfx
>>>>>>>>> running with 3d support with SEV-ES active instead of essentially
>>>>>>>>> disabling
>>>>>>>>> the
>>>>>>>>> driver when SEV-ES is active.
>>>>>>>>>
>>>>>>>>> I think there are two separate discussions there, the
>>>>>>>>> non-controversial one
>>>>>>>>> and the
>>>>>>>>> controversial one:
>>>>>>>>> 1) The non-controversial: is there a case where drivers would
>>>>>>>>> want encrypted
>>>>>>>>> memory
>>>>>>>>> for TT pages but not for io mem mappings? Because if not then as
>>>>>>>>> Christian
>>>>>>>>> pointed
>>>>>>>>> out we could just add pgprot_decrypted to ttm_io_prot and be
>>>>>>>>> essentially done.
>>>>>>>>> The
>>>>>>>>> current method of decrypting io mem but leaving sys mem mappings
>>>>>>>>> encrypted is
>>>>>>>>> a bit
>>>>>>>>> weird anyway.
>>>>>>>>>
>>>>>>>>> If the answer to that question is "yes, some driver does want the
>>>>>>>>> TT mappings
>>>>>>>>> to be
>>>>>>>>> encrypted" then your "[PATCH v2 3/4] drm/ttm, drm/vmwgfx:
>>>>>>>>> Correctly support
>>>>>>>>> support
>>>>>>>>> AMD memory encryption" solves that. I think getting one of those
>>>>>>>>> two in makes
>>>>>>>>> sense
>>>>>>>>> regardless of everything else, agreed?
>>>>>>>> Well, there is more to it I think.
>>>>>>>>
>>>>>>>> IIRC, the AMD SME encryption mode has a way for a device to 
>>>>>>>> have the
>>>>>>>> memory controller (?) encrypt / decrypt device traffic by using an
>>>>>>>> address range alias, so in theory it supports encrypted TT 
>>>>>>>> pages, and
>>>>>>>> the dma-layer may indeed hand encrypted DMA pages to TTM on such
>>>>>>>> systems
>>>>>>>> depending on the device's DMA mask. That's why I think that
>>>>>>>> force_dma_unencrypted() export was needed, and If the amdgpu 
>>>>>>>> driver
>>>>>>>> accesses TT memory in SME mode *without* pgprot_decrypted() and it
>>>>>>>> still
>>>>>>>> works, then I think that mode is actually used. How could it
>>>>>>>> otherwise work?
>>>>>>> For SME, as long as the encrypted bit is set in the physical 
>>>>>>> address
>>>>>>> used for DMA, the memory controller will handle the encrypt/decrypt
>>>>>>> for the device.  For devices with a limited dma mask, you need 
>>>>>>> to use
>>>>>>> the IOMMU so that the encrypted bit is retained when the address 
>>>>>>> hits
>>>>>>> the memory controller.
>>>>>> How does that work on systems with swiotlb, e.g. swiotlb=force, or
>>>>>> i.e. what would
>>>>>> decrypt the ttm tt mappings when copying between system and vram
>>>>>> when iommu is
>>>>>> disabled/absent?
>>>>> SME makes it mandatory that all devices can handle the physical
>>>>> address used for DMA, either native or with the help of IOMMU.
>>>>>
>>>>> Hacks like SWIOTLB are not directly supported as far as I know. Maybe
>>>>> somehow SWIOTLB manually decrypts the data while copying it or
>>>>> something like this, but I'm not 100% sure if that is actually
>>>>> implemented.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>> A bold guess after looking at various code and patches:
>>>>
>>>> 1) Devices under SME that don't support the encryption bit and SEV:
>>>> a) Coherent memory is unencrypted.
>>>> b) Streaming DMA under IOMMU: The IOMMU sets the encrypted bit.
>>>> c) Streaming DMA with SWIOTLB: The bounce buffer is unencrypted.
>>>> Copying to/from bounce-buffer decrypts/encrypts.
>>>>
>>>> 2) Devices under SME that do support the encryption bit (which I
>>>> believe is most graphics devices in general on SME systems, not just
>>>> amdgpu; it "just works")
>>>> *) Coherent memory is encrypted. The DMA layer sets dma addresses and
>>>> pgprot accordingly.
>>>> *) Streaming DMA is encrypted.
>>>>
>>>> So the bug in TTM would then be it's not handling 1a) and 1b) 
>>>> correctly.
>>>>
>>>> Remedy:
>>>> 1b) Shouldn't be used with encryption.
>>>> 1a) This is what we should try to fix. Exporting
>>>> dma_force_unencrypted() didn't seem to be a way forward. Properly
>>>> fixing this would, I guess, mean implement the missing functionality
>>>> in the dma layer: For vmap / kmap we could simply reuse the virtual
>>>> addresses we get back from dma_alloc_coherent(), but for faulting one
>>>> would want something like dma_coherent_insert_pfn() (if it doesn't
>>>> exist already) after a proper disussion with Christoph Hellwig.
>>> Christoph once pointed me to dma_mmap_attrs() for this, but I never
>>> found the time to fully look into it.
>> Hmm, yea, that would make sense
>> https://elixir.bootlin.com/linux/latest/source/kernel/dma/direct.c#L564
>> Replacing the vmap's with dma_mmap_attrs would probably fix this, but 
>> it would
>> require a bit of extra setup.
>>
>> So we're saying that yes, we don't want unconditional pgprot_decrypt 
>> in ttm_io_prot.
>> We'd like to leave those tt mappings as encrypted when possible and 
>> instead maybe
>> add a vaddr to ttm_tt (or extract it from the pages->private via the  
>> ttm_pool_dma,
>> but that seems rather ugly),
>
> It could probably be extracted from pages->private from a helper in 
> the ttm pool code, (Christian has a final saying here). However, that 
> requires that all ttm_tts are built from a single dma_alloc chunk. Not 
> sure that's the case? In that case we're back to square zero for vmaps.

Nope they aren't and yes we are back to square one with that.

i think when we want to make this really clean we would need to forward 
the page fault request to the ttm_pool so that it can call 
dma_mmap_attrs() or similar.

But I'm not sure if dma_mmap_attrs() is even guaranteed to work with 
reverse mapping.

>
>
>>   plus add get_vm_area() to be able to use dma_mmap_attrs
>> instead of vmap when use_dma_alloc's is true in ttm_bo_vmap/ttm_bo_kmap?
>
> vmap is for kernel mappings, and dma_mmap_attrs is for user-space 
> mappings, But dma_mmap_attrs()wouldn't be sufficient. If, for example. 
> a bo is moved from VRAM to dma-coherent memory, or vice-versa we Zap 
> the user-space page-table entries and replace them using faulting to 
> the new address. That's why we'd need a dma_coherent_insert_pfn().

Yes, exactly that.

>
> I think at the time Christoph was under the impression that you could 
> replace xxx_insert_pfn() with xxx_mmap() in the fault handler, but 
> that doesn't work since the latter requires the mmap_lock in write 
> mode, whereas the fault handler only holds it in read mode. (That 
> caused some bugs in the i915 driver when that change was attempted).

Good to know.

Christian.

>
> /Thomas
>
>
>>
>> z
>>
>>



More information about the dri-devel mailing list