[RFC PATCH 00/12] Private MMIO support for private assigned dev

Xu Yilun yilun.xu at linux.intel.com
Sat May 24 03:33:25 UTC 2025


On Tue, May 20, 2025 at 08:57:42PM +1000, Alexey Kardashevskiy wrote:
> 
> 
> On 16/5/25 04:02, Xu Yilun wrote:
> > > IMHO, I think it might be helpful if you could picture out what the
> > > minimum requirements (function/life cycle) are for the current IOMMUFD TSM
> > > bind architecture:
> > > 
> > > 1. host tsm_bind (preparation) is in IOMMUFD, triggered by QEMU handling
> > > the TVM-HOST call.
> > > 2. TDI acceptance is handled in guest_request() to accept the TDI after
> > > the validation in the TVM.
> > 
> > I'll try my best to brainstorm and make a flow in ASCII.
> > 
> > (*) means new feature
> > 
> > 
> >        Guest          Guest TSM       QEMU           VFIO            IOMMUFD       host TSM          KVM
> >        -----          ---------       ----           ----            -------       --------          ---
> > 1.                                                                               *Connect(IDE)
> > 2.                                 Init vdev
> > 3.                                *create dmabuf
> > 4.                                               *export dmabuf
> > 5.                                create memslot
> > 6.                                                                                              *import dmabuf
> > 7.                                setup shared DMA
> > 8.                                                                 create hwpt
> > 9.                                               attach hwpt
> > 10.                                  kvm run
> > 11.enum shared dev
> > 12.*Connect(Bind)
> > 13.                  *GHCI Bind
> > 14.                                  *Bind
> > 15.                                                                CC viommu alloc
> > 16.                                                                vdevice alloc
> > 16.                                              *attach vdev
> 
> 
> This "attach vdev" - we are still deciding if it goes to IOMMUFD or VFIO, right?

This should be "tsm bind". Seems Jason's suggestion is place the IOCTL
against VFIO, then VFIO reach into IOMMUFD to do the real
pci_tsm_bind().

https://lore.kernel.org/all/20250515175658.GR382960@nvidia.com/
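
Roughly like below, just a sketch of the layering -- the ioctl name, the
struct layout and the iommufd_vdevice_tsm_bind() helper are made up here for
illustration, only pci_tsm_bind() is from the PCI/TSM series:

#include <linux/types.h>
#include <linux/uaccess.h>
#include <linux/vfio_pci_core.h>

/* Hypothetical uAPI: VFIO owns the ioctl, IOMMUFD owns the vdevice. */
struct vfio_device_tsm_bind {
	__u32	argsz;
	__u32	flags;
	__s32	iommufd;	/* iommufd the device is attached to */
	__u32	vdevice_id;	/* IOMMUFD vdevice object for this device */
};

/* Hypothetical IOMMUFD entry point, declared only for this sketch. */
long iommufd_vdevice_tsm_bind(int iommufd_fd, u32 vdevice_id,
			      struct pci_dev *pdev);

static long vfio_pci_ioctl_tsm_bind(struct vfio_pci_core_device *vdev,
				    struct vfio_device_tsm_bind __user *arg)
{
	struct vfio_device_tsm_bind bind;

	if (copy_from_user(&bind, arg, sizeof(bind)))
		return -EFAULT;
	if (bind.flags)
		return -EINVAL;

	/*
	 * VFIO reaches into IOMMUFD, which looks up the vdevice and calls
	 * pci_tsm_bind() on the physical device.
	 */
	return iommufd_vdevice_tsm_bind(bind.iommufd, bind.vdevice_id,
					vdev->pdev);
}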

> 
> 
> > 17.                                                               *setup CC viommu
> > 18.                                                                *tsm_bind
> > 19.                                                                                  *bind
> > 20.*Attest
> > 21.               *GHCI get CC info
> > 22.                                 *get CC info
> > 23.                                                                *vdev guest req
> > 24.                                                                                 *guest req
> > 25.*Accept
> > 26.             *GHCI accept MMIO/DMA
> > 27.                                *accept MMIO/DMA
> > 28.                                                               *vdev guest req
> > 29.                                                                                 *guest req
> > 30.                                                                                              *map private MMIO
> > 31.             *GHCI start tdi
> > 32.                                *start tdi
> > 33.                                                               *vdev guest req
> > 34.                                                                                 *guest req
> 
> 
> I am not sure I follow the layout here. "start tdi" and "accept MMIO/DMA" are under "QEMU" but QEMU cannot do anything by itself and has to call VFIO or some other driver...
> 

Yes. Call IOCTL(iommufd, IOMMUFD_VDEVICE_GUEST_REQUEST, vdevice_id)
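
On the QEMU side that would look roughly like below. Only a sketch: the
struct layout and field names are made up for illustration, and
IOMMUFD_VDEVICE_GUEST_REQUEST would come from the proposed (not yet merged)
iommufd uAPI header:

#include <stdint.h>
#include <sys/ioctl.h>

/* Hypothetical uAPI struct for the proposed IOMMUFD_VDEVICE_GUEST_REQUEST. */
struct iommu_vdevice_guest_request {
	uint32_t size;
	uint32_t vdevice_id;	/* IOMMUFD vdevice object ID */
	uint64_t req_uaddr;	/* guest TDI request blob (e.g. GHCI payload) */
	uint32_t req_len;
	uint64_t resp_uaddr;	/* buffer for the TSM response */
	uint32_t resp_len;
};

/*
 * QEMU forwards the guest's TDISP requests (attest / accept MMIO & DMA /
 * start tdi) to the host TSM through IOMMUFD instead of talking to the
 * TSM driver directly.
 */
static int vdevice_guest_request(int iommufd, uint32_t vdevice_id,
				 void *req, uint32_t req_len,
				 void *resp, uint32_t resp_len)
{
	struct iommu_vdevice_guest_request cmd = {
		.size		= sizeof(cmd),
		.vdevice_id	= vdevice_id,
		.req_uaddr	= (uintptr_t)req,
		.req_len	= req_len,
		.resp_uaddr	= (uintptr_t)resp,
		.resp_len	= resp_len,
	};

	return ioctl(iommufd, IOMMUFD_VDEVICE_GUEST_REQUEST, &cmd);
}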

> > 35.Workload...
> > 36.*disconnect(Unbind)
> 
> Is this a case of PCI hotunplug? Or just killing QEMU/shutting down the VM? Or no longer trusting the device and switching it to untrusted mode, to work with SWIOTLB or DiscardManager?
> 

Switching to untrusted mode. But I think hotunplug would eventually trigger
the same host-side behavior, only without needing the guest to "echo 0 > connect".

> > 37.              *GHCI unbind
> > 38.                                *Unbind
> > 39.                                            *detach vdev
> > 40.                                                               *tsm_unbind
> > 41.                                                                                 *TDX stop tdi
> > 42.                                                                                 *TDX disable mmio cb
> > 43.                                            *cb dmabuf revoke
> 
> 
> ... like VFIO and host TSM - "TDX stop tdi" and "cb dmabuf revoke" are not under QEMU.

Correct. These are TDX Module specific requirements; we don't want them
to make the general APIs unnecessarily verbose.
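
To illustrate the point (hypothetical ops struct and helper names, not the
actual pci_tsm interface): the generic side only ever sees one unbind hook,
the verbose TDX Connect teardown stays inside the TDX TSM driver, and SEV
implements the same hook with whatever it needs:

struct pci_tdi;					/* per-TDI bind state */

struct pci_tsm_ops_sketch {
	int	(*bind)(struct pci_tdi *tdi);
	void	(*unbind)(struct pci_tdi *tdi);	/* single generic hook */
};

/* Hypothetical TDX Connect helpers, declared only for this sketch. */
void tdx_tdi_stop(struct pci_tdi *tdi);
void tdx_tdi_disable_mmio(struct pci_tdi *tdi);	/* triggers dmabuf revoke */
void tdx_tdi_disable_dma(struct pci_tdi *tdi);
void tdx_tdi_free(struct pci_tdi *tdi);
void tdx_tdi_enable_mmio(struct pci_tdi *tdi);	/* usable as shared again */
void sev_tio_tdi_unbind(struct pci_tdi *tdi);	/* e.g. clears RMP entries */

/* TDX Connect: the multi-step teardown lives entirely in the vendor driver. */
static void tdx_tsm_unbind(struct pci_tdi *tdi)
{
	tdx_tdi_stop(tdi);
	tdx_tdi_disable_mmio(tdi);
	tdx_tdi_disable_dma(tdi);
	tdx_tdi_free(tdi);
	tdx_tdi_enable_mmio(tdi);
}

/* SEV-TIO: no equivalent intermediate steps, a single call is enough. */
static void sev_tsm_unbind(struct pci_tdi *tdi)
{
	sev_tio_tdi_unbind(tdi);
}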

> 
> 
> > 44.                                                                                               *unmap private MMIO
> > 45.                                                                                 *TDX disable dma cb
> > 46.                                                              *cb disable CC viommu
> > 47.                                                                                 *TDX tdi free
> > 48.                                                                                 *enable mmio
> > 49.                                            *cb dmabuf recover
> 
> 
> What is the difference between "cb dmabuf revoke" and "cb dmabuf recover"?

Revoke revokes the private S-EPT mapping; recover means KVM could then do
shared MMIO mapping in the EPT again.
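
In pseudo-KVM terms (the helper names below are made up; the real callbacks
would arrive through the dma-buf revoke path added by the VFIO dma-buf
patches):

#include <linux/kvm_host.h>

/* Hypothetical KVM helpers, declared only for this sketch. */
void kvm_zap_private_mmio_range(struct kvm *kvm, gfn_t start, gfn_t end);
void kvm_allow_shared_mmio_range(struct kvm *kvm, gfn_t start, gfn_t end);

/*
 * "cb dmabuf revoke": the MMIO dma-buf is being pulled back, so KVM must
 * zap the private S-EPT mapping before the TDI is torn down.
 */
static void mmio_dmabuf_revoke(struct kvm *kvm, gfn_t gfn, u64 npages)
{
	kvm_zap_private_mmio_range(kvm, gfn, gfn + npages);
}

/*
 * "cb dmabuf recover": the device is back to shared operation, so faults
 * on this range may be mapped into the shared EPT again from now on.
 */
static void mmio_dmabuf_recover(struct kvm *kvm, gfn_t gfn, u64 npages)
{
	kvm_allow_shared_mmio_range(kvm, gfn, gfn + npages);
}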

Thanks,
Yilun

> 
> 
> > 50.workable shared dev
> > 
> > TSM unbind is a little verbose & specific to TDX Connect, but SEV TSM could
> > ignore these callbacks and just implement an "unbind" tsm op.
> 
> 
> Well, something needs to clear RMP entries, which can be done in the TDI unbind or whenever you do it.
> 
> And the chart applies for AMD too, more or less. Thanks,
> 
> 
> > Thanks,
> > Yilun
> > 
> > > 
> > > and which parts/where need to be modified in the current architecture to
> > > get there. Try to fold vendor-specific knowledge away as much as possible,
> > > but still keep it modular in the TSM driver, and let's see how it looks.
> > > Maybe some example TSM driver code to demonstrate, together with the
> > > VFIO dma-buf patch.
> > > 
> > > If something is extremely hacky in the TSM driver, let's see how it
> > > can be lifted to the upper level, or have the upper call pass more
> > > parameters to it.
> 
> 
> 
> -- 
> Alexey
> 

