[virglrenderer-devel] multiprocess model and GL

Chia-I Wu olvaffe at gmail.com
Tue Jan 21 19:06:36 UTC 2020


On Fri, Jan 17, 2020 at 5:20 PM Gurchetan Singh
<gurchetansingh at chromium.org> wrote:
>
> On Fri, Jan 17, 2020 at 3:41 PM Chia-I Wu <olvaffe at gmail.com> wrote:
> >
> > On Thu, Jan 16, 2020 at 11:29 PM Gerd Hoffmann <kraxel at redhat.com> wrote:
> > >
> > > On Thu, Jan 16, 2020 at 12:33:25PM -0800, Chia-I Wu wrote:
> > > > On Thu, Jan 16, 2020 at 4:58 AM Gerd Hoffmann <kraxel at redhat.com> wrote:
> > > > >
> > > > > On Mon, Jan 13, 2020 at 01:03:22PM -0800, Chia-I Wu wrote:
> > > > > > Sorry I missed this email.
> > > > > >
> > > > > > On Thu, Jan 9, 2020 at 12:54 PM Dave Airlie <airlied at gmail.com> wrote:
> > > > > > >
> > > > > > > This is just an aside to the issue discussion, but I was wondering
> > > > > > > before we heavily consider a vulkan multi-process model, could
> > > > > > > we/should we prove a GL multi-process model first?
> > > > > >
> > > > > > I think there will just be the qemu process at first, with many GL
> > > > > > contexts and one VkInstance for each guest VkInstance.  Then there
> > > > > > will be a switch to run different VkInstance in different processes
> > > > > > (unlikely, but I wonder if this can be a vk layer).  I did not plan
> > > > > > for a multi-process GL model.  Do you have anything you want to prove
> > > > > > from it?
> > > > >
> > > > > Right now we have two models:
> > > > >   - Everything in qemu.
> > > > >   - Separate virgl process (see contrib/vhost-user-gpu/ in qemu),
> > > > >     but still all gl contexts in a single process.
> > > > >
> > > > > We could try to switch vhost-user-gpu to a one-process-per-context
> > > > > model.  I think it makes sense to at least think about the resource
> > > > > management implications this would have (it would make virgl work
> > > > >    similarly to vulkan):
> > > > >
> > > > >  - We would need a master process.  It runs the virtqueues and manages
> > > > >    the resources.
> > > >
> > > > In the distant future when we are Vulkan-only, we will not want
> > > > GL-specific paths.  If we are to do multi-process GL now,
> > >
> > > [ Note; I don't think it buys us much to actually do that now, we have
> > >         enough to do even without that.  But we should keep that in mind
> > >         when designing things ... ]
> > >
> > > > I think we should follow the multi-process Vulkan model, in the sense
> > > > that GL resources should also be created in the per-context processes
> > >
> > > Yep, that would be good, we would not need a dma-buf for each and every
> > > resource then.  Problem here is backward compatibility.  We simply can't
> > > do that without changing the virtio protocol.
> > >
> > > So, I guess the options we have are:
> > >  (1) keep virgl mostly as-is and live with the downsides (which should
> > >      not be that much of a problem as long as one process manages all
> > >      GL contexts), or
> > >  (2) create virgl_v2, where resource management works very similarly to
> > >      the vulkan way of doing things, require the guest using that to
> > >      run gl+vk side-by-side.  Old guests without vk support could
> > >      continue to use virgl_v1
> >
> > (1) still requires defining interop with vk.  (2) seems like a
> > reasonable requirement given that both drivers will be built from
> > mesa.  But there are also APIs who like a simple interface like
> > virgl_v1 to allocate resources yet requires interop with vk.  I guess
> > both sound fine to me.
> >
> > The three resource models currently on the table are
> >
> > (A) A resource in the guest is a global driver object in the host.
> > The global driver object is usable by all contexts and qemu.
> > (B) A resource in the guest is a local driver object in the main
> > renderer process in the host.  VIRTIO_GPU_CMD_CTX_ATTACH_RESOURCE
> > creates attachments and each attachment is a local object in a
> > per-context process.  VIRTIO_GPU_CMD_SET_SCANOUT creates a local
> > object in qemu process.
> > (C) A resource in the guest is an fd in the main renderer process in
> > the host.  The fd may be created locally by the main renderer process
> > (e.g., udmabuf) or received from a per-context process.
> > VIRTIO_GPU_CMD_CTX_ATTACH_RESOURCE sends the fd to another per-context
> > process.  VIRTIO_GPU_CMD_SET_SCANOUT works similarly to (B).
> >
> > (A) is the current model and does not support VK/GL interop.  (B) is
> > designed to be compatible with (A) and the current virtio protocol.
> > It allows multi-process GL as well as VK/GL interop, but it requires a
> > dma-buf for each resource even when not really shared.
> >
> > (C) is the Vulkan model, but it is unclear how
> > VIRTIO_GPU_CMD_RESOURCE_CREATE_3D works.  I think we can think of the
> > main process as a simple allocator as well.
> > VIRTIO_GPU_CMD_RESOURCE_CREATE_3D makes the main process allocate
> > (from GBM or GL) and create an fd, just as the main process can
> > allocate a udmabuf.  This way this model can work with option (1).
>
> I think we should go to some variation of (C).
>
> GL (and even VK without certain extensions) resources are not
> exportable -- the main process should only keep exportable resources
> in this fd table.  That's why I'm thinking of adding the
> VIRTGPU_RESOURCE_EXPORTABLE_BIT and modifying guest Mesa accordingly
> to set it properly (essentially looking for
> PIPE_BIND_SHARED/PIPE_BIND_SCANOUT is sufficient for OpenGL).
The bit would become pointless if the host moved to multi-process GL,
because every v1 resource must then be exported.
> If we get a VIRTIO_GPU_CMD_RESOURCE_CREATE_3D and QEMU advertises
> RESOURCE_V2, we'll assume it's not exportable.
This is option (2) that Gerd described.

I think we can live with option (1) and disallow multi-process GL and
GL->VK sharing for the moment (i.e., simply assume v1 resources are
not exportable).  Or we can go directly to option (2) now (still
assume v1 resources are not exportable).  Both are fine to me.  I
believe you want option (2) and I can support that.

Even with v2, resources that only have guest storage for CPU-only
access do not have an fd.  The fact that Vulkan will create an empty
resource first and attach an fd to it later means there is a short
window when a resource does not have an fd.  What the main renderer
process maintains should be a resid-to-resource lookup table, where
the resource is something like

  struct vrend_resource_v2 {
    // -1 unless an fd is attached (e.g., a driver object in a
    // per-context process is exported and attached, or the iov is
    // wrapped in a udmabuf)
    int fd;

    // NULL unless guest storage is attached
    struct iovec *iov;

    // NULL unless this is a v1 resource
    struct vrend_resource *v1;
  };


> > > > >  - We would need a per-context process.
> > > > >  - VIRTIO_GPU_CMD_CTX_ATTACH_RESOURCE would make master dma-buf export a
> > > > >    resource and the per-context process import it.  Sharing resources
> > > > >    works by calling VIRTIO_GPU_CMD_CTX_ATTACH_RESOURCE multiple times for
> > > > >    different contexts, and therefore master passing the dma-buf to
> > > > >    multiple per-context processes.
> > > >
> > > > I would like to see the export and import parts to be separated out
> > > > and executed by other commands, but all three commands can be sent
> > > > together by the guest kernel when it makes sense.  For the import
> > > > part especially, guest vulkan wants to pass some metadata and specify
> > > > an object id for the imported driver object.
> > >
> > > I don't want to call this import/export; that term is overloaded too much
> > > already.  Also the "export" is needed for more than just export.  It is
> > > needed for everything the guest needs a gem bo for (mmap, scanout, ...).
> > >
> > > I guess something along the lines of OBJECT_TO_RESOURCE (add virtio
> > > resource for vulkan object, aka "export") and RESOURCE_TO_OBJECT
> > > ("import") would be better.
> > >
> > > Does GL have object IDs too?
> > No.  A resource in the guest is already a global GL object in the
> > host.  VIRTIO_GPU_CMD_SUBMIT_3D can use the resource ids directly.
> >
> > Vulkan wants object ids because there might be no resource ids when
> > resources are not needed.  When there are resource ids, they might
> > point to fds, and importing them as objects is not trivial.  Also the
> > same resource id might be imported multiple times to create multiple
> > objects.
> >
> > >
> > > cheers,
> > >   Gerd
> > >
