[Linaro-mm-sig] [PATCH 1/3] dma-buf: Add ioctl to query mmap info

Christian König christian.koenig at amd.com
Mon Aug 8 14:56:27 UTC 2022


On 08.08.22 at 15:26, Rob Clark wrote:
> On Mon, Aug 8, 2022 at 4:22 AM Christian König <christian.koenig at amd.com> wrote:
>
> [SNIP]
>>>> If the virtio/virtgpu UAPI was built around the idea that this is
>>>> possible, then it is most likely fundamentally broken.
>>> How else can you envision mmap'ing to guest userspace working?
>> Well long story short: You can't.
>>
>> See, userspace mappings are not persistent, but rather faulted in on
>> demand. The exporter is responsible for setting them up, so that it can
>> add reverse tracking and invalidate those mappings when the backing
>> store changes.
> I think that is not actually a problem.  At least that is how it works on
> arm64, and I'm almost positive x86 is similar.. I'm not sure how else
> you could virtualize the mmu/iommu/etc in a way that didn't have horrible
> performance.
>
> There are two levels of pagetable translation, the first controlled by
> the host kernel, the second by the guest.  From the PoV of the host
> kernel, it is just memory mapped to userspace, getting faulted in on
> demand, just as normal.  First the guest-controlled translation
> triggers a fault in the guest, which sets up the guest mapping.  And then
> the second level of translation, from what the guest sees as PA (but the
> host sees as VA) to the actual PA, triggers a fault in the host.

Ok, that's reassuring.

At least that's not the approach that was talked about the last time this
came up, and it doesn't rip open a massive security hole somewhere.

The question then is why the guest isn't using the caching attributes set 
up by the host page tables when the translation is forwarded anyway?
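
For reference, the translation chain Rob describes above looks roughly like
this; the helper names below are purely illustrative and exist nowhere:

        /* guest VA -> guest PA, via the guest's stage-1 tables; the fault
         * is handled by the guest kernel */
        u64 gpa = guest_stage1_translate(gva);

        /* the guest PA is what the host sees as the user VA of the mmap'ed
         * buffer; translating it to the actual host PA faults into the host
         * kernel, which fills in its page tables as for any other mmap */
        u64 hpa = host_stage2_translate(gpa);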

> [SNIP]
> This is basically what happens, although via the two levels of pgtable
> translation.  This patch provides the missing piece, the caching
> attributes.

Yeah, but that won't work like this. See, the backing store migrates all 
the time, and when it is backed by PCIe/VRAM/local memory you need to use 
write-combine, while system memory is usually cached.
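
To make that concrete: an exporter typically picks the CPU caching
attributes per mapping based on where the buffer currently lives, and zaps
the existing CPU mappings again when the buffer migrates. A very rough
sketch (struct example_bo and the example_* helpers are made up; only
pgprot_writecombine() and unmap_mapping_range() are real kernel interfaces):

static int example_dmabuf_mmap(struct dma_buf *dmabuf,
                               struct vm_area_struct *vma)
{
        struct example_bo *bo = dmabuf->priv;

        if (bo->in_vram)
                /* PCIe BAR / VRAM / local memory: map write-combined */
                vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
        /* else: system memory placement, usually left cached */

        return example_bo_insert_pages(bo, vma);
}

static void example_bo_move_notify(struct example_bo *bo)
{
        /* Zap all existing userspace PTEs of this object; the next CPU
         * fault then re-creates them against the new placement, with the
         * new caching attributes. */
        unmap_mapping_range(bo->mapping, bo->mmap_offset, bo->size, 1);
}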

>>       Because otherwise you can't accommodate the exporter changing
>> those caching attributes.
> Changing the attributes dynamically isn't going to work.. or at least
> not easily.  If you had some sort of synchronous notification to host
> userspace, it could trigger an irq to the guest, I suppose.  But it
> would mean the host kernel has to block waiting for host userspace to
> interrupt the guest, then wait for the guest vgpu process to be scheduled
> and handle the irq.

We basically change that on every page flip on APUs, and that doesn't 
sound like something fast.

Thanks for the explanation of how this works,
Christian.

>
> At least in the case of msm, the cache attributes are static for the
> life of the buffer, so this scenario isn't a problem.  AFAICT this
> should work fine for at least all UMA hw.. I'm a bit less sure when it
> comes to TTM, but shouldn't you at least be able to use worst-case
> cache attributes for buffers that are allowed to be mapped to guest?
>
> BR,
> -R
>
>>> But more seriously, let's take a step back here.. what scenarios are
>>> you seeing this being problematic for?  Then we can see how to come up
>>> with solutions.  The current situation of host userspace VMM just
>>> guessing isn't great.
>> Well "isn't great" is a complete understatement. When KVM/virtio/virtgpu
>> are doing what I guess they are doing here, then that is a really major
>> security hole.
>>
>>>     And sticking our heads in the sand and
>>> pretending VMs don't exist isn't great.  So what can we do?  I can
>>> instead add an msm ioctl to return this info and solve the problem even
>>> more narrowly for a single platform.  But then the problem still
>>> remains on other platforms.
>> Well once more: This is *not* MSM specific, you just absolutely *can't
>> do that* for any driver!
>>
>> I'm just really wondering what the heck is going on here, because all of
>> this was discussed at length before on the mailing list and very
>> bluntly rejected.
>>
>> Either I'm missing something (that's certainly possible) or we have a
>> strong case of somebody implementing something without thinking about
>> all the consequences.
>>
>> Regards,
>> Christian.
>>
>>
>>> Slightly implicit in this is that mapping dma-bufs to the guest won't
>>> work for anything that requires DMA_BUF_IOCTL_SYNC for coherency.. we
>>> could add a possible return value for DMA_BUF_INFO_VM_PROT indicating
>>> that the buffer does not support mapping to guest or CPU access
>>> without DMA_BUF_IOCTL_SYNC.  Then at least the VMM can fail gracefully
>>> instead of subtly.
>>>
>>> BR,
>>> -R
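
As a side note on the DMA_BUF_INFO_VM_PROT query quoted above: the VMM side
could look roughly like the sketch below. The ioctl name, the argument type
and the "no mapping allowed" value are purely assumptions about the proposed
UAPI, not taken from the actual patch.

#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* DMA_BUF_IOCTL_INFO_VM_PROT and DMA_BUF_VM_PROT_NONE are placeholders for
 * whatever the proposed UAPI actually calls these; they are not real. */
static bool vmm_can_map_to_guest(int dmabuf_fd, uint32_t *prot_out)
{
        uint32_t prot = 0;

        if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_INFO_VM_PROT, &prot) < 0)
                return false;   /* old kernel: refuse, or keep guessing */

        if (prot == DMA_BUF_VM_PROT_NONE)
                return false;   /* exporter forbids mapping into the guest */

        *prot_out = prot;       /* e.g. cached vs. write-combined */
        return true;
}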


