[RFC PATCH 08/12] vfio/pci: Create host unaccessible dma-buf for private device
Xu Yilun
yilun.xu at linux.intel.com
Thu Jan 9 16:40:28 UTC 2025
On Thu, Jan 09, 2025 at 10:40:51AM -0400, Jason Gunthorpe wrote:
> On Thu, Jan 09, 2025 at 12:57:58AM +0800, Xu Yilun wrote:
> > On Wed, Jan 08, 2025 at 09:30:26AM -0400, Jason Gunthorpe wrote:
> > > On Tue, Jan 07, 2025 at 10:27:15PM +0800, Xu Yilun wrote:
> > > > Add a flag for ioctl(VFIO_DEVICE_BIND_IOMMUFD) to mark a device as
> > > > for private assignment. For these private assigned devices, disallow
> > > > host accessing their MMIO resources.
> > >
> > > Why? Shouldn't the VMM simply not call mmap? Why does the kernel have
> > > to enforce this?
> >
> > MM.. maybe I should not say 'host', instead 'userspace'.
> >
> > I think the kernel part VMM (KVM) has the responsibility to enforce the
> > correct behavior of the userspace part VMM (QEMU). QEMU has no way to
> > touch private memory/MMIO intentionally or accidently. IIUC that's one
> > of the initiative guest_memfd is introduced for private memory. Private
> > MMIO follows.
>
> Okay, but then why is it a flag like that? I'm expecting a much
This flag is a prerequisite for setting up TDI, or part of the
requirement to make a "TDI capable" assigned device. It prevents
userspace mapping in the first place, even while the device is shared.

We want the device to first appear as a shared device in the CoCo VM, then
do TDI setup (via a tsm verb "bind"). This late bind approach avoids
changing the CoCo VM startup routine. In contrast, early bind would
easily be broken, especially if the BIOS is not aware of the TDI rule.

So then we are faced with shared <-> private device conversion in the CoCo
VM, and in turn shared <-> private MMIO conversion. An MMIO region has only
one physical backend, so this is a bit like in-place conversion, which is
complicated. I want to simplify the MMIO conversion routine based on the
fact that the VMM never needs to access assigned MMIO for feature emulation,
so always disallow userspace MMIO mapping during the whole lifecycle. That's
why the flag is introduced.

Patch 6 has a similar description.
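To make the intended lifecycle concrete, here is a rough userspace toy
model of the rule, not code from the series. Every name in it
(toy_vfio_dev, TOY_BIND_FLAG_PRIVATE, toy_bind, toy_mmap_mmio,
toy_tdi_bind, toy_tdi_unbind) and every errno choice is a made-up
assumption for illustration only. It just shows that once the flag is
given at bind time, userspace MMIO mapping is refused in both the shared
and the private (TDI bound) state.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
#define TOY_BIND_FLAG_PRIVATE (1u << 0)

enum toy_tdi_state { TOY_TDI_SHARED, TOY_TDI_PRIVATE };

struct toy_vfio_dev {
	bool bound;              /* device has been bound */
	bool private_capable;    /* latched at bind time, never cleared */
	enum toy_tdi_state state;
};

/* Bind the device; the private flag is recorded once, here. */
int toy_bind(struct toy_vfio_dev *dev, unsigned int flags)
{
	assert(dev != NULL);
	if (dev->bound)
		return -EBUSY;
	dev->bound = true;
	dev->private_capable = (flags & TOY_BIND_FLAG_PRIVATE) != 0;
	dev->state = TOY_TDI_SHARED;   /* always starts as a shared device */
	return 0;
}

/* Userspace MMIO mapping: refused for the whole lifecycle if flagged. */
int toy_mmap_mmio(const struct toy_vfio_dev *dev)
{
	assert(dev != NULL);
	if (!dev->bound)
		return -ENODEV;
	if (dev->private_capable)
		return -EPERM;         /* errno choice is an assumption */
	return 0;
}

/* Late bind: shared -> private, only allowed for flagged devices. */
int toy_tdi_bind(struct toy_vfio_dev *dev)
{
	assert(dev != NULL);
	if (!dev->bound)
		return -ENODEV;
	if (!dev->private_capable)
		return -EINVAL;
	dev->state = TOY_TDI_PRIVATE;
	return 0;
}

/* Private -> shared; the mmap restriction stays in place afterwards. */
int toy_tdi_unbind(struct toy_vfio_dev *dev)
{
	assert(dev != NULL);
	if (!dev->bound || dev->state != TOY_TDI_PRIVATE)
		return -EINVAL;
	dev->state = TOY_TDI_SHARED;
	return 0;
}
```

In this model, a device bound with the flag gets -EPERM from
toy_mmap_mmio() before toy_tdi_bind(), after it, and after
toy_tdi_unbind(), so no userspace mapping ever has to be torn down on a
shared <-> private conversion. A device bound without the flag can be
mapped as usual but can never go through toy_tdi_bind().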
> broader system here to make the VFIO device into a confidential device
> (like setup the TDI) where we'd have to enforce the private things,
I plan to introduce a new VFIO ioctl to set up the TDI.
> communicate with some secure world to assign it, and so on.
Yes, the new VFIO ioctl will communicate with PCI TSM.
>
> I want to see a fuller solution to the CC problem in VFIO before we
Hmm, I have something, but it needs more preparation. I'll discuss
internally whether to send it out or to publish it in a public repo.
> can be sure what is the correct UAPI. In other words, make the
> VFIO device into a CC device should also prevent mmaping it and so on.
My idea is to prevent mmapping first, then allow the VFIO device to become
a CC device (TDI).
>
> So, I would take this out and defer VFIO enforcment to a series which
> does fuller CC enablement of VFIO.
>
> The precursor work should just be avoiding requiring a VMA when
> installing VFIO MMIO into the KVM and IOMMU stage 2 mappings. Ie by
> using a FD to get the CPU pfns into iommufd and kvm as you are
> showing.
>
> This works just fine for non-CC devices anyhow and is the necessary
Yes. It carries out the idea of "KVM maps MMIO resources without first
mapping them into the host", even for a normal VM. That's why I think it
could be an independent patchset.
Thanks,
Yilun
> building block for making a TDI interface in VFIO.
>
> Jason