[virglrenderer-devel] Proposals for virtio-gpu and virglrenderer

Wed May 9 22:32:25 UTC 2018

On Thu 03 May 2018, Dave Airlie wrote:
> On 3 May 2018 at 09:21, Stéphane Marchesin <marcheu at chromium.org> wrote:
> > On Wed, May 2, 2018 at 1:18 AM Gerd Hoffmann <kraxel at redhat.com> wrote:

I'm trying to catch up with the conversation. Apologies if I'm missing
context.

...

> >> I still think we should try to do it the other way around:  Not map the
> >> host resources into the guest, but make the guest resources usable by
> >> the host.  Have a little helper driver which can turn the virtio-gpu
> >> resource iov into a dmabuf (standalone or pimped up vgem), then run
> >> with that, to avoid the memcpy.  Possibly the gpu driver still has to
> >> copy stuff then, or the gpu copy itself via dma, but the gpu driver
> >> should be able to do whatever is best for the given host hardware ...
> >
> > The GPU driver doesn't necessarily have to do a copy; so why not work
> > towards a design which can avoid these copies? I have measured that about
> > half the time of virgl texture uploads is spent on iovec copies, so we
> > could double the texture upload rate easily with such a mechanism. In
> > particular, newer APIs like Vulkan on the host side can provide coherent
> > memory which is usually just a direct view into the actual memory (and
> > therefore there is no copy). This could easily be used as a host-side
> > mechanism to lay this functionality on top of.
> >
> 
> I think for most things it does, tiling really matters, and we are never
> going to get tiling inside the guest that is useful.
> 
> Even for vulkan we have to do linear buffer->image uploads for most
> apps.

I can think of two cases where an app may not upload a linear image to
a tiled image, but would still need fast, zero-copy usage of the linear
image.

    1. The app accumulates handwriting, rendered by the CPU, into
       a linear mmapped VkImage. The app wants low-latency presentation
       of the handwriting, and the app is continually rendering the
       handwriting into the mmapped image, so it never copies the linear
       image to tiled. It samples directly from the mmapped image
       elsewhere in the pipeline. A copy here could ruin the low-latency
       expectations.

    2. Similar to 1, but the app software-decodes a media stream into
       the mmapped VkImage. The app samples directly from the mmapped
       image, applying effects or annotations to the video.
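
As a rough sketch of the CPU side of case 1: the code below is not taken
from any real app, and the struct and function names are made up. The two
layout fields only mirror the offset and rowPitch values that
vkGetImageSubresourceLayout reports for a linear image, and RGBA8 texels
are assumed. The point is that the app writes texels in place, so any
extra copy between this memory and what the GPU samples adds latency.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the VkSubresourceLayout fields used here. */
struct linear_layout {
    size_t offset;    /* byte offset of the subresource in the mapping */
    size_t row_pitch; /* bytes between the starts of consecutive rows */
};

/* Write one RGBA8 texel at (x, y) directly into the mapped image memory. */
static void put_texel_rgba8(uint8_t *mapped, const struct linear_layout *layout,
                            uint32_t x, uint32_t y, uint32_t rgba)
{
    uint8_t *texel = mapped + layout->offset
                   + (size_t)y * layout->row_pitch
                   + (size_t)x * 4; /* 4 bytes per RGBA8 texel (assumed) */
    memcpy(texel, &rgba, sizeof rgba);
}

/* Draw the horizontal span [x0, x1) of a stroke on row y, in place. */
static void draw_stroke_row(uint8_t *mapped, const struct linear_layout *layout,
                            uint32_t y, uint32_t x0, uint32_t x1, uint32_t rgba)
{
    assert(x0 <= x1);
    for (uint32_t x = x0; x < x1; x++)
        put_texel_rgba8(mapped, layout, x, y, rgba);
}
```

In a real app the mapped pointer would come from vkMapMemory, and the
sampling side would read the same memory with no intermediate staging
buffer.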

I believe similar cases exist that use tiled VkImages with external
VkDeviceMemory, but I don't know whether those cases affect the current
discussion.