[PATCH 1/2] drm/vgem: Do not allocate backing shmemfs file for an import dmabuf object

Christian König christian.koenig at amd.com
Thu Jul 9 08:48:08 UTC 2020


Am 08.07.20 um 18:19 schrieb Daniel Vetter:
> On Wed, Jul 8, 2020 at 6:11 PM Daniel Vetter <daniel at ffwll.ch> wrote:
>> On Wed, Jul 8, 2020 at 5:05 PM Christian König <christian.koenig at amd.com> wrote:
>>> Am 08.07.20 um 17:01 schrieb Daniel Vetter:
>>>> On Wed, Jul 8, 2020 at 4:37 PM Christian König <christian.koenig at amd.com> wrote:
>>>>> Am 08.07.20 um 11:54 schrieb Daniel Vetter:
>>>>>> On Wed, Jul 08, 2020 at 11:22:00AM +0200, Christian König wrote:
>>>>>>> Am 07.07.20 um 20:35 schrieb Chris Wilson:
>>>>>>>> Quoting lepton (2020-07-07 19:17:51)
>>>>>>>>> On Tue, Jul 7, 2020 at 10:20 AM Chris Wilson <chris at chris-wilson.co.uk> wrote:
>>>>>>>>>> Quoting lepton (2020-07-07 18:05:21)
>>>>>>>>>>> On Tue, Jul 7, 2020 at 9:00 AM Chris Wilson <chris at chris-wilson.co.uk> wrote:
>>>>>>>>>>>> If we assign obj->filp, we believe that the created vgem bo is native and
>>>>>>>>>>>> allow direct operations like mmap(), assuming it behaves as if backed by a
>>>>>>>>>>>> shmemfs inode. When imported from a dma-buf, the obj->pages are
>>>>>>>>>>>> not always meaningful and the shmemfs backing store is misleading.
>>>>>>>>>>>>
>>>>>>>>>>>> Note, that regular mmap access to a vgem bo is via the dumb buffer API,
>>>>>>>>>>>> and that rejects attempts to mmap an imported dmabuf,
>>>>>>>>>>> What do you mean by "regular mmap access" here?  It looks like vgem is
>>>>>>>>>>> using vgem_gem_dumb_map as its .dumb_map_offset callback, so it doesn't
>>>>>>>>>>> call drm_gem_dumb_map_offset.
>>>>>>>>>> As I too found out, and so had to correct my story telling.
>>>>>>>>>>
>>>>>>>>>> By regular mmap() access I mean mmap on the vgem bo [via the dumb buffer
>>>>>>>>>> API] as opposed to mmap() via an exported dma-buf fd. I had to look at
>>>>>>>>>> igt to see how it was being used.
>>>>>>>>> Now it seems your fix is to disable "regular mmap" on an imported dma-buf
>>>>>>>>> for vgem. I am not really a graphics guy, but then the API looks like:
>>>>>>>>> for a gem handle, user space has to guess to find out the way to mmap
>>>>>>>>> it. If user space guesses wrong, then it will fail to mmap. Is this the
>>>>>>>>> expected way for people to handle GPU buffers?
>>>>>>>> You either have a dumb buffer handle, or a dma-buf fd. If you have the
>>>>>>>> handle, you have to use the dumb buffer API, there's no other way to
>>>>>>>> mmap it. If you have the dma-buf fd, you should mmap it directly. Those
>>>>>>>> two are clear.
>>>>>>>>
>>>>>>>> It's when you import the dma-buf into vgem and create a handle out of
>>>>>>>> it, that's when the handle is no longer first class and certain uAPI
>>>>>>>> [the dumb buffer API in particular] fail.
>>>>>>>>
>>>>>>>> It's not brilliant, as you say, it requires the user to remember the
>>>>>>>> difference between the handles, but at the same time it does prevent
>>>>>>>> them falling into coherency traps by forcing them to use the right
>>>>>>>> driver to handle the object, and have to consider the additional ioctls
>>>>>>>> that go along with that access.
>>>>>>> Yes, Chris is right. Mapping DMA-buf through the mmap() APIs of an importer
>>>>>>> is illegal.
>>>>>>>
>>>>>>> What we could maybe try to do is to redirect this mmap() API call on the
>>>>>>> importer to the exporter, but I'm pretty sure that the fs layer wouldn't
>>>>>>> like that without changes.
>>>>>> We already do that, there's a full helper-ified path from I think shmem
>>>>>> helpers through prime helpers to forward this all. Including handling
>>>>>> buffer offsets and all the other lolz back&forth.
>>>>> Oh, that most likely won't work correctly with unpinned DMA-bufs and
>>>>> needs to be avoided.
>>>>>
>>>>> Each file descriptor is associated with an struct address_space. And
>>>>> when you mmap() through the importer by redirecting the system call to
>>>>> the exporter you end up with the wrong struct address_space in your VMA.
>>>>>
>>>>> That in turn can easily go up in flames when the exporter tries to
>>>>> invalidate the CPU mappings for its DMA-buf while moving it.
>>>>>
>>>>> Where are we doing this? My last status was that this is forbidden.
>>>> Hm I thought we're doing all that already, but looking at the code
>>>> again we're only doing this when opening a new drm fd or dma-buf fd.
>>>> So the right file->f_mapping is always set at file creation time.
>>>>
>>>> And we indeed don't frob this more when going another indirection ...
>>>> Maybe we screwed up something somewhere :-/
>>>>
>>>> Also I thought the mapping is only taken after the vma is instantiated,
>>>> otherwise the tricks we're playing with dma-buf already wouldn't work:
>>>> dma-buf has the buffer always at offset 0, whereas gem drm_fd mmap has
>>>> it somewhere else. We already adjust vma->vm_pgoff, so I'm wondering
>>>> whether we could adjust vm_file too. Or is that the thing that's
>>>> forbidden?
>>> Yes, exactly. Modifying vm_pgoff is harmless, tons of code does that.
>>>
>>> But changing vma->vm_file, that's something I haven't seen before and
>>> most likely could blow up badly.
>> Ok, I read the shmem helpers again, I think those are the only ones
>> which do the importer mmap -> dma_buf_mmap() forwarding, and hence
>> break stuff all around here.
>>
>> They also remove the vma->vm_pgoff offset, which means
>> unmap_mapping_range won't work anyway. I guess for drivers which use the
>> shmem helpers the hard assumptions are that they a) can't use p2p dma and
>> b) pin everything into system memory.
>>
>> So not a problem. But something to keep in mind. I'll try to do a
>> kerneldoc patch to note this somewhere. btw on that, did the
>> timeline/syncobj documentation patch land by now? Just trying to make
>> sure that doesn't get lost for another few months or so :-/
> Ok, so maybe it is a problem. Because there is a drm_gem_shmem_purge()
> which uses unmap_mapping_range underneath, and that's used by
> panfrost. And panfrost also uses the mmap helper. Kinda wondering
> whether we broke some stuff here, or whether the reverse map is
> installed before we touch vma->vm_pgoff.

I think the key problem here is that unmap_mapping_range() doesn't blow 
up immediately when this is wrong.

E.g. we just have a stale CPU page table entry which allows userspace to 
write to freed-up memory, but we don't really notice that immediately.

Maybe we should stop allowing mmap() of a DMA-buf through the importer's 
file descriptor altogether and only allow mapping it through its own fd 
or through the exporter.

Christian.

> panfrost folks, does panfrost purged buffer handling of mmap still
> work correctly? Do you have some kind of igt or similar for this?
>
> Cheers, Daniel
>
>> Cheers, Daniel
>>
>>> Christian.
>>>
>>>> -Daniel
>>>>
>>>>> Christian.
>>>>>
>>>>>> Of course there's still the problem that many drivers don't forward the
>>>>>> cache coherency calls for begin/end cpu access, so in a bunch of cases
>>>>>> you'll get cacheline dirt soup. But that's kinda standard procedure for
>>>>>> dma-buf :-P
>>>>>>
>>>>>> But yeah trying to handle the mmap as an importer, bypassing the export:
>>>>>> nope. The one exception is if you have some kind of fancy gart with
>>>>>> cpu-visible pci bar (like at least integrated intel gpus have). But in
>>>>>> that case the mmap very much looks&acts like device access in every way.
>>>>>>
>>>>>> Cheers, Daniel
>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>
>>>>>>>> -Chris
>>
>> --
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> http://blog.ffwll.ch
>
>


