GEM allocation for para-virtualized DRM driver

Mon Mar 20 18:52:52 UTC 2017

On Mon, Mar 20, 2017 at 2:25 PM, Oleksandr Andrushchenko
<andr2000 at gmail.com> wrote:
> On 03/20/2017 08:17 PM, Rob Clark wrote:
>>
>> On Mon, Mar 20, 2017 at 2:01 PM, Oleksandr Andrushchenko
>> <andr2000 at gmail.com> wrote:
>>>
>>> On 03/20/2017 07:38 PM, Rob Clark wrote:
>>>>
>>>> On Mon, Mar 20, 2017 at 1:18 PM, Oleksandr Andrushchenko
>>>> <andr2000 at gmail.com> wrote:
>>>>>
>>>>>
>>>>> On 03/18/2017 02:22 PM, Rob Clark wrote:
>>>>>>
>>>>>> On Fri, Mar 17, 2017 at 1:39 PM, Oleksandr Andrushchenko
>>>>>> <andr2000 at gmail.com> wrote:
>>>>>>>
>>>>>>> Hello,
>>>>>>> I am writing a para-virtualized DRM driver for the Xen hypervisor
>>>>>>> and it now works with DRM CMA helpers, but I would also like
>>>>>>> to make it work with non-contiguous memory: the virtual machine
>>>>>>> that the driver runs in can't guarantee that CMA is actually
>>>>>>> physically contiguous (that is not a problem because of IPMMU
>>>>>>> and other means, the only constraint I have is that I cannot mmap
>>>>>>> with pgprot == noncached). So, I am planning to use
>>>>>>> *drm_gem_get_pages* + *shmem_read_mapping_page_gfp* to allocate
>>>>>>> memory for GEM objects (scanout buffers + dma-bufs shared with
>>>>>>> the virtual GPU).
>>>>>>>
>>>>>>> Do you think this is the right approach to take?
>>>>>>
>>>>>> I guess if you had some case where you needed to "migrate" buffers
>>>>>> between host and guest memory, then TTM might be useful.  Otherwise
>>>>>> this sounds like the right approach.
>>>>>
>>>>> Tried that today (drm_gem_get_pages), the result is interesting:
>>>>>
>>>>> 1. modetest
>>>>> 1.1. Runs, I can see page flips
>>>>> 1.2. vm_operations_struct.fault is called, I can vm_insert_page
>>>>>
>>>>> 2. kmscube (Rob, thanks for that :) + PowerVR SGX 6250
>>>>> 2.1. Cannot initialize EGL
>>>>> 2.2. vm_operations_struct.fault is NOT called
>>>>
>>>> jfwiw, pages will only get faulted in when CPU accesses them..
>>>
>>> indeed, good catch
>>>>
>>>> modetest "renders" the frame on the CPU but kmscube does it on gpu.
>>>
>>> yes, I have already learned that modetest only renders once and
>>> then just flips
>>>>
>>>> So not seeing vm_operations_struct.fault is normal.  The EGL fail is
>>>> not..
>>>>
>>>>> In both cases 2 dumb buffers are created and successfully mmapped;
>>>>> in case of kmscube there are also handle_to_fd IOCTLs issued
>>>>> and no DRM errors observed. No DMA-BUF mmap attempt seen.
>>>>>
>>>>> I re-checked 2) with alloc_pages + remap_pfn_range and it works
>>>>> (it cannot unmap cleanly, but it could be because I didn't call
>>>>> split_page after alloc_pages), thus the setup is still good
>>>>>
>>>>> Can it be that the buffer allocated with drm_gem_get_pages
>>>>> doesn't suit PowerVR for some reason?
>>>>
>>>> I've no idea what the state of things is w/ pvr as far as gbm support
>>>> (not required/used by modetest, but anything that uses the gpu on
>>>> "bare metal" needs it).  Or what the state of dmabuf-import is with
>>>> pvr.
>>>
>>> Do you think there could be DMA-related problems with mapping
>>> and using the buffer allocated with drm_gem_get_pages, so that
>>> the GPU is not able to handle it?
>>>
>>> The only source of knowledge I have at the moment is the
>>> publicly available pvrsrvkm kernel module. But there are
>>> other unknowns, e.g. user-space libraries and firmware, which
>>> are in binary form: thus the kernel driver is mostly a bridge
>>> between FW and libs. That being said, do you think I have to get
>>> deeper into the GPU use-case or should I switch back to alloc_pages +
>>> remap_pfn_range? ;)
>>
>> so, I suppose with pvr there is a whole host of potential pain... *but*..
>>
>> if alloc_pages path actually works, then perhaps the issue is the
>> deferred allocation.  Ie. most drivers don't drm_gem_get_pages() until
>> the buffer is passed to hw or until it is faulted in.  You should make
>> sure it ends up getting called (if it hasn't been called already)
>> somewhere in gem_prime_pin.
>
> I call drm_gem_get_pages as part of dumb creation, because I
> need to pass the pages to the host OS. So, probably, this is not
> because of the late allocation, but something else

hmm, well all the pvr GPUs that I've had to deal with in the past
have MMUs, so there shouldn't be any specific issue with where the
pages come from.  But I guess you have to poke around the kernel
module to see where things go wrong with dmabuf import (or if it even
gets that far)

BR,
-R

>>
>> BR,
>> -R
>
> Thank you!
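
Side note on the point made earlier in the thread that pages only get
faulted in when the CPU accesses them: the same behaviour can be seen
from user space. The sketch below is not DRM code and is not taken from
any driver discussed here; it assumes Linux, an anonymous private
mapping, and mincore(2) as the way to observe residency. The helper
names count_resident() and demand_fault_demo() are made up for this
illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count the pages of [addr, addr + len) that are resident; -1 on error.
 * Handles at most 64 pages, which is enough for this illustration. */
static int count_resident(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t n = (len + page - 1) / page;
    unsigned char vec[64];
    int resident = 0;

    if (n > sizeof(vec) || mincore(addr, len, vec) != 0)
        return -1;
    for (size_t i = 0; i < n; i++)
        resident += vec[i] & 1;   /* bit 0 set: page is resident */
    return resident;
}

/* Map npages anonymous pages, write to the first `touch` of them, and
 * report the resident page count before and after the writes.
 * Returns 0 on success, -1 if the mapping could not be created. */
static int demand_fault_demo(size_t npages, size_t touch,
                             int *before, int *after)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * page;
    unsigned char *buf;

    assert(before && after);
    buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    *before = count_resident(buf, len);   /* nothing touched yet */
    for (size_t i = 0; i < touch && i < npages; i++)
        buf[i * page] = 0xff;             /* CPU write triggers a fault */
    *after = count_resident(buf, len);

    munmap(buf, len);
    return 0;
}
```

Calling demand_fault_demo(8, 4, &before, &after) should report no
resident pages before the writes and at least the touched ones after.
A buffer that is only ever read or written by a device never goes
through this path, which matches not seeing the fault handler run in
the kmscube case.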