[virglrenderer-devel] coherent memory access for virgl
Tomeu Vizoso
tomeu.vizoso at collabora.com
Thu Oct 4 07:04:42 UTC 2018
On 10/3/18 4:46 PM, Tomeu Vizoso wrote:
> On 9/28/18 5:48 AM, Gurchetan Singh wrote:
>> On Thu, Sep 27, 2018 at 12:04 AM Gerd Hoffmann <kraxel at redhat.com> wrote:
>>>
>>>>>> Right now, for most
>>>>>> textures, virglrenderer copies iovecs into a temporary buffer (see
>>>>>> read_transfer_data), and then calls glTexSubImage2D*.
>>>>>
>>>>> Is virglrenderer clever enough to skip the temporary buffer copy in
>>>>> case it finds niov == 1?
>>>>
>>>> There is a fast-path in read_transfer_data / write_transfer_data
>>>> depending on the send_size and various other parameters, but in my gdb
>>>> experience it's not used most of the time.
>>>
>>> Looking at the code vrend_renderer_transfer_write_iov() seems to not
>>> call read_transfer_data() in the first place in case num_iovs == 1.
>>>
>>> So, qemu could just pass in an iov with one element, and things would
>>> improve with current virglrenderer versions.
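
A minimal sketch of the single-iovec shortcut described above, assuming a hypothetical helper named transfer_source(). It is not the virglrenderer implementation and only illustrates why one iovec avoids the temporary buffer: a single entry that covers the whole transfer can be used in place, while several entries have to be gathered into a bounce buffer first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical helper (assumption, not virglrenderer code): return a
 * pointer to `size` contiguous bytes described by `iov`.
 *
 * If a single iovec covers the whole transfer, its memory is returned
 * directly and *bounce stays NULL. Otherwise the pieces are copied into
 * a malloc'd buffer, which is returned and also stored in *bounce so the
 * caller can free() it. Returns NULL on allocation failure or if the
 * iovecs hold fewer than `size` bytes.
 */
static const void *transfer_source(const struct iovec *iov, int num_iovs,
                                   size_t size, void **bounce)
{
    assert(iov != NULL && num_iovs > 0 && bounce != NULL);
    *bounce = NULL;

    /* Fast path: one element, no copy needed. */
    if (num_iovs == 1 && iov[0].iov_len >= size)
        return iov[0].iov_base;

    /* Slow path: gather all pieces into a temporary buffer. */
    char *buf = malloc(size);
    if (buf == NULL)
        return NULL;

    size_t done = 0;
    for (int i = 0; i < num_iovs && done < size; i++) {
        size_t n = iov[i].iov_len;
        if (n > size - done)
            n = size - done;
        memcpy(buf + done, iov[i].iov_base, n);
        done += n;
    }

    if (done < size) {
        free(buf);
        return NULL;
    }

    *bounce = buf;
    return buf;
}
```

A caller would hand the returned pointer to the upload call and free(*bounce) afterwards; in the fast path there is nothing to free.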
>>
>> If that's possible, that'd be great.
>>
>>> Newer virglrenderer
>>> versions could consume dmabuf handle and mapping pointer instead (and
>>> import the dmabuf if possible).
>>>
>>>> However, such cases are prominent in the Android / ChromeOS display
>>>> stacks (and often mapped in the guest), so that's why I'm interested
>>>> in making them backed by host memory and display/GPU optimized. We'll
>>>> need a way of expressing modifiers to the guest, so this delves into
>>>> the earlier discussion of wayland host proxying. Who should allocate
>>>> -- the host compositor, the VMM, virglrenderer? The host compositor
>>>> seems like the most natural choice.
>>>
>>> Why the host compositor? Normal wayland clients don't ask the host
>>> compositor for buffers either, right? They do egl rendering using
>>> render nodes, export the front buffer as dmabuf and pass them to the
>>> compositor for rendering ...
>>>
>>> I think virglrenderer should allocate the buffers.
>>
>> virglrenderer allocating the buffers should work for now. There was
>> some discussion on using modifiers for v4l2, but that didn't go very
>> far:
>>
>> https://lists.freedesktop.org/archives/dri-devel/2017-August/150850.html
>>
>> Modifiers are designed to have multiple consumer apis, but I've only
>> seen EGL + KMS implementations.
>>
>> If virglrenderer will do the allocation, what about
>> virtio_gpu_resource_create_2d -- who calls that in guest userspace?
>> Should it ever be host-optimized (since we're essentially talking
>> about single-level 2D textures/render targets/scan-out buffers)?
>>
>>>
>>>> Are there any plans of the guest using host-optimized buffers
>>>> (communicated via modifiers) in a purely Linux guest?
>>>
>>> I think that would imply the virgl mesa driver must be able to handle
>>> pretty much any vendor's compressed/tiled buffer format. Hmm, no idea
>>> how difficult that would be.
>>
>> It could actually be pretty easy, due to the Gallium abstraction. We
>> need to give Gallium a linear view into the texture -- which we can
>> always fall back to GL to do. Other items include:
>>
>> 1) Fix the resource info ioctl to actually return the stride + format
>> modifiers
>> 2) Expose the modifiers the host supports (via
>> eglQueryDmaBufModifiersEXT) in the guest. The applicability of this
>> depends on the userspace (Android won't use this).
>>
>>>>>> But making host memory guest visible will bring the worst-case buffer
>>>>>> copies from 3 to 1. For textures, if we start counting when the GPU
>>>>>> buffer gets detiled, there will be 5 copies currently, 3 with udmabuf,
>>>>>> and 1 with host exposed memory.
>>>>>
>>>>> 5 copies? verbose please. I can see three:
>>>>
>>>> It depends on when you start counting. I started counting from
>>>> vrend_renderer_transfer_send_iov, which includes fetching the data
>>>> from the host and packing that data into the iovecs. Probably a worst
>>>> case scenario.
>>>
>>> Ah, you talk about the host -> guest path, not guest -> host (or both?).
>>>
>>> Is host -> guest transfer used that much? I'd expect the guest just
>>> asks the host to display the rendered result instead of reading it back.
>>
>> Both. Not sure how common it is in Linux, but Android maps/unmaps YUV
>> buffers quite a bit, which are later used by GL.
>>
>>>
>>>>>> ii) virtio_gpu_resource_create_3d -- may or may not be host backed
>>>>>> (depends on the PCI bar size, platform-specific information -- guest
>>>>>> doesn't need to know)
>>>>>
>>>>> Hmm? How can this work in a way which is transparent for the guest?
>>>>
>>>> We already need to extend the DRM_VIRTGPU_RESOURCE_INFO ioctl, since
>>>> it doesn't return the stride and doesn't work for YUV buffers (see
>>>> crrev.com/c/1208591). Maybe we can also add a bitmask, which we can
>>>> populate with memory info (i.e., HOST_BIT | COHERENT_BIT)?
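
A rough sketch of how such a bitmask could look from userspace. Everything here is an assumption for illustration: the thread only names HOST_BIT and COHERENT_BIT, so the bit values, their exact meaning, the struct layout and the helper are hypothetical and do not reflect the real DRM_VIRTGPU_RESOURCE_INFO uapi.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments; the thread does not define the values. */
#define HOST_BIT     (1u << 0) /* assumed: resource is backed by host memory */
#define COHERENT_BIT (1u << 1) /* assumed: CPU mappings are coherent */

/* Hypothetical extended reply, not the actual kernel struct. */
struct resource_info_sketch {
    uint32_t stride;    /* bytes per row, which the thread says is missing today */
    uint32_t mem_flags; /* combination of the bits above */
};

/*
 * Under the assumed semantics, a mapping can be used without explicit
 * transfer commands only if the resource is both host backed and
 * coherent; any other combination still needs them.
 */
static bool needs_explicit_transfer(const struct resource_info_sketch *info)
{
    const uint32_t want = HOST_BIT | COHERENT_BIT;
    return (info->mem_flags & want) != want;
}
```

Keeping the information in a bitmask leaves room for further memory properties without changing the layout again.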
>>>
>>> Well, for userspace it can be transparent. Userspace will just call
>>> mmap() and the kernel will sort things transparently depending on the
>>> buffer allocation (userspace knowing how buffers are allocated is
>>> probably useful nevertheless).
>>>
>>> I was thinking about the kernel / vmm interface. The virtio-gpu kms
>>> driver certainly needs to know about the buffer allocation ...
>>
>> The KMS part will be more difficult than the EGL part.
>>
>> For example, on some ARM devices, AFBC can only be used on the (host)
>> primary KMS plane. If a video running in QEMU is full screen, it's
>> advantageous to allocate an AFBC buffer and then scan it out. But if
>> the QEMU window becomes smaller, the best option is to use a linear
>> strided buffer and schedule that as an overlay. But the guest always
>> thinks it's fullscreen ...
>>
>> How is the guest currently notified about size changes of its drawing
>> target? Do buffers get re-allocated?
>>
>> Previously (see slide 25 of
>> https://www.x.org/wiki/Events/XDC2017/widawsky_fb_modifiers.pdf),
>> there was discussion about the compositor sending supported modifiers
>> to the client (QEMU) through some sort of protocol. Does the
>> compositor notify the client of modifier changes if window size
>> changes? Perhaps wayland experts (Tomeu?) know.
>
> My understanding is that with wl_dmabuf, the buffers are allocated by the
> client. So it can decide whether to use a modifier or not, and on surface
> size changes it can allocate a different one. But TBH, I haven't checked.

Daniel pointed me to a plan for the compositor to give additional
information to clients that they can use to better decide the exact
format and modifiers of their buffers:

https://gitlab.freedesktop.org/wayland/wayland/issues/59

Cheers,

Tomeu