[virglrenderer-devel] multiprocess model and GL

Chia-I Wu olvaffe at gmail.com
Wed Feb 5 04:14:11 UTC 2020


On Tue, Feb 4, 2020 at 7:17 PM Gurchetan Singh
<gurchetansingh at chromium.org> wrote:
>
> On Tue, Feb 4, 2020 at 1:04 PM Chia-I Wu <olvaffe at gmail.com> wrote:
> >
> > On Mon, Feb 3, 2020 at 7:20 PM Gurchetan Singh
> > <gurchetansingh at chromium.org> wrote:
> > >
> > > On Fri, Jan 31, 2020 at 7:04 PM Chia-I Wu <olvaffe at gmail.com> wrote:
> > > > > So I wouldn't call the resource create v2 protocol as execbuffer.
> > > > I would say virgl_hw.h is part of virgl_protocol.h.  Together,
> > > > they define the virgl protocol.  If you look at the various _FORMAT
> > > > fields in virgl_protocol.h, they refer to the formats defined in
> > > > virgl_hw.h.
> > > >
> > >
> > > Disagree here -- virgl_hw.h is the source of truth.  It's possible to
> > > create a resource without including virgl_protocol.h, and we do it all
> > > the time.
> > >
> > > For example, without VIRGL_BIND_STAGING or
> > > virgl_supported_format_mask, the guest can't know which type of
> > > resources can be created.
> > >
> > > Without VIRGL_CAP_COMPUTE_SHADER, the guest can't know something like
> > > VIRGL_SET_SHADER_BUFFER_* is supported.  All defined in virgl_hw.h
> >
> > How are VIRGL_BIND_STAGING or VIRGL_CAP_COMPUTE_SHADER applied to virv
> > resources or execbuffer?  Will virgl_hw.h be extended in a way that it
> > provides unified caps and resource model that both virgl and virv
> > share?  Or will virgl_hw.h be extended in a way that virgl and virv
> > have very little in common?  If two worlds were going to coexist in
> > the extended virgl_hw.h, I wouldn't call it the source of truth (two
> > sources of two truths, maybe?)
>
> Multiple sources are perfectly valid.  The virglrenderer protocol is a
> superset of all capset-defined protocols.  capset 1 and capset 2
> (virgl_hw.h) only apply to virgl (context type 0).
>
> capset 3 version 1 can apply to context type 0
> capset 3 version 2 can apply to context type 1
> capset 4 version 1 can apply to context type 1
>
> Each capability defines its own protocol.  We can drop support for
> capset 1 and capset 2 once OpenGL is deprecated.
It sounds like there is 1 resource creation protocol (virgl_hw.h) and
1 execbuffer protocol (virgl_protocol.h) for virgl support, and you
want 2 resource creation protocols and 2 execbuffer protocols in total
after we add virv support.  That is not what I had in mind when you
said extending....

>
> >
> >
> > > > I think a fairer way to put this is that some of the virgl protocol
> > > > details are leaked by drm_virtgpu_resource_create.  IMO, a reasonable
> > > > way to plug the leak is to make sure the virgl protocol details are
> > > > contained in EXECBUFFER in resource create v2.
> > >
> > > I would put it this way: both resource creation and the execbuffer
> > > protocol are defined by the capabilities, and the virglrenderer
> > > protocol is the superset of all capability-defined protocols.
> > >
> > > Honestly, the existing interface isn't too bad.  We just need to extend it ...
.... My view of the existing interface is that the virgl execbuffer
protocol (virgl_protocol.h) is good, but the resource creation
protocol (virgl_hw.h) is virgl-specific.  To extend them for virv, I
introduce a new command to the resource creation protocol

  RESOURCE_CREATE_v2(ctx_id, obj_id)

This will be the core resource creation command going forward.  It
creates resources from context-local objects.  Once virgl_protocol.h
is extended to support local allocations, RESOURCE_CREATE_v2 can be
used together with it to replace the v1 resource creation command.

The new command is independent of the execbuffer protocols.  When the
virv execbuffer protocol is added, the core resource creation command
can be used with it automatically.  There will be 1 resource creation
protocol and 2 execbuffer protocols.
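
To make the shape concrete, here is a rough sketch of what the wire
layout of RESOURCE_CREATE_v2 could look like (the field names, and the
res_id field in particular, are my placeholders rather than a final
definition):

  /* hypothetical layout, for illustration only */
  struct resource_create_v2 {
      uint32_t ctx_id;  /* context that owns the backing object */
      uint32_t obj_id;  /* context-local object allocated earlier,
                         * e.g. via the context's execbuffer protocol */
      uint32_t res_id;  /* placeholder: global resource id to attach */
  };

The allocation itself happens in a separate, context-specific step;
this command only associates an existing context-local object with a
virtio-gpu resource.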


> > >
> > > [snip]
> > > > I think that gives a very pure resource model and resource API, but
> > > > for no practical benefit other than pureness.  Keeping
> > > > VIRTGPU_RESOURCE_FLAG_ALLOC_MASK and adding
> > > > VIRTGPU_RESOURCE_FLAG_ALLOC_CLASSIC (and adding
> > > > drm_virtgpu_resource_create to the union) seem like a good compromise
> > > > to me.  The guest userspace can use the classic alloc type to allocate
> > > > a "classic" host storage.  The host can choose to allocate using gbm
> > > > or other apis.
> > >
> > > A pure resource model and API is a good thing!  Honestly, I'd prefer
> > > we not bikeshed over style too much.  We have an implementation that
> > > doesn't require something like ALLOC_CLASSIC but solves the same
> > > problem, so why not go with that?
> >
> > It has something like VIRTGPU_RESOURCE_FLAG_ALLOC_ARGS and leaves
> > everything to the userspace (to fill out args[]).  In fact, the alloc
> > type can be encoded in the first byte of args[] and there will be no
> > need for VIRTGPU_RESOURCE_FLAG_ALLOC_MASK.  I won't miss any
> > functionality either because I can, for example, encode
> > VIRTGPU_RESOURCE_FLAG_ALLOC_EXECBUFFER in args[] first, encode object
> > id next, and then encode the command stream.
>
> Yeah, if you want to make the one protocol (caps 3, v2) reference
> another protocol (caps 4), that's possible (if they're both
> supported).
>
> >
> > Like I said, I am not entirely against args[] as long as it supports
> > my use cases (and it does).  What I am not sure about is how wild we
> > allow args[] to be.  The kernel does not care, so the userspace can
> > do whatever it wants?
>
> Complex rendering commands while creating a virtio-gpu resource are
> not desired.  It's reasonable to limit the maximum size of the
> resource create protocol to something like 64 bytes.
>
> > That is one extreme.  The other extreme is to
> > say that the kernel only accepts execbuffer so args[] is actually
> > cmds[]+object_id.  That may seem limiting, but that could work too
> > because you could put whatever would go into args[] into cmds[].
> >
> > Why not go with args[]?  There is already cmds[], with which the
> > userspace can do whatever it wants, including allocations.  Why add
> > another mechanism that is allocation-only?
>
> Because the new ioctl is related to resource creation.  It makes sense
> to limit it to that purpose.  We have the capabilities -- let's use
> them.

I see driver allocation and resource creation (RESOURCE_CREATE_v2
above) as independent steps.  The fact that it is possible to do both
in one ioctl is a kernel trick to reduce context switches.  I also
have use cases where I will do execbuffer and resource create v2
separately.
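
As a sketch of that separate path on the guest side (fd, alloc_cmds,
alloc_cmds_size, and obj_id are assumed to be set up already, and the
blob ioctl in step 2 is the proposal under discussion, not existing
uapi):

  /* step 1: opaque, context-specific allocation through the existing
   * execbuffer path; the command stream names a context-local obj_id
   * chosen by the guest userspace */
  struct drm_virtgpu_execbuffer eb = {
      .command = (uintptr_t)alloc_cmds,
      .size = alloc_cmds_size,
  };
  ioctl(fd, DRM_IOCTL_VIRTGPU_EXECBUFFER, &eb);

  /* step 2, optional and possibly much later: wrap the context-local
   * object in a virtio-gpu resource, e.g. via the proposed
   * DRM_VIRTGPU_RESOURCE_CREATE_BLOB carrying
   * VIRTGPU_STRUCTURE_TYPE_BIND_OBJECT_ID with obj_id; purely local
   * allocations never take this step */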

Caps can be used to decide whether virgl or virv is supported, whether
virgl supports local allocations, or whether the resource creation
protocol supports RESOURCE_CREATE_v2 above.  What do you mean by not
using them?
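
For example, the guest can probe with the existing caps ioctl and
branch on which capset ids and versions the host exposes (capset
numbers above 2 and their meanings are the ones proposed in this
thread, not committed values; fd and caps_buf are assumed):

  struct drm_virtgpu_get_caps args = {
      .cap_set_id = 3,              /* proposed: resource creation protocol */
      .cap_set_ver = 2,             /* proposed: version for context type 1 */
      .addr = (uintptr_t)caps_buf,
      .size = sizeof(caps_buf),
  };
  if (ioctl(fd, DRM_IOCTL_VIRTGPU_GET_CAPS, &args) == 0)
      /* the host advertises this capset; RESOURCE_CREATE_v2 is usable */;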

>
> > Separating args[] out from cmds[]
> > gives some more flexibility, but I am not sure if the flexibility is
> > justified.
>
> There are many workflows and use cases.  Flexibility and automatic
> versioning is good to have.  Using EXECBUFFER for virgl is not ideal,
> and since Mesa only calls it on glFlush() / glFinish(), it really
> models fenced command buffer submission today:
>
> https://github.com/torvalds/linux/blob/master/include/uapi/drm/i915_drm.h#L939
> https://github.com/torvalds/linux/blob/master/include/uapi/drm/i915_drm.h#L1049
> https://github.com/torvalds/linux/blob/master/include/uapi/drm/msm_drm.h#L180
> https://github.com/torvalds/linux/blob/master/include/uapi/drm/msm_drm.h#L231
> https://github.com/torvalds/linux/blob/master/include/uapi/drm/panfrost_drm.h#L49
> https://github.com/torvalds/linux/blob/master/include/uapi/drm/vc4_drm.h#L91
> ....
>
> We're trying to emulate GEM_CREATE...
GEM_CREATE is emulated with RESOURCE_CREATE_v2 above, which is pretty
simple and straightforward.  EXECBUFFER is an extra step inserted
before GEM_CREATE because the host requires vastly different
allocation parameters depending on the driver API.

For comparison, GEM_CREATE is also emulated with RESOURCE_CREATE v1.
That does not need the extra step because v1 is virgl-specific and its
allocation parameters are known and fixed.
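
For reference, the v1 path is a single ioctl precisely because those
parameters are the fixed virgl_hw.h-style description (the values
below are just illustrative):

  struct drm_virtgpu_resource_create v1 = {
      .target = 2,                             /* PIPE_TEXTURE_2D */
      .format = VIRGL_FORMAT_B8G8R8A8_UNORM,   /* from virgl_hw.h */
      .bind = VIRGL_BIND_RENDER_TARGET,        /* from virgl_hw.h */
      .width = 1024,
      .height = 768,
      .depth = 1,
      .array_size = 1,
  };
  ioctl(fd, DRM_IOCTL_VIRTGPU_RESOURCE_CREATE, &v1);

A v2-style allocation cannot be described this way up front, hence the
extra execbuffer step before the resource is created.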



>
>
> >
> > > [snip]
> > > > The main thing that model misses is that VK renderers want to make
> > > > local allocations when they are not to be mapped or shared.  If that
> > > > model did allow local allocations, would VK renderers use objid to
> > > > identify local allocations and use resid to identify resources?
> > >
> > > Local allocations are fine.  Using DRM_IOCTL_VIRTGPU_EXECBUFFER to
> > > allocate and then doing DRM_VIRTGPU_RESOURCE_CREATE_BLOB with
> > > something like VIRTGPU_STRUCTURE_TYPE_BIND_OBJECT_ID with the variadic
> > > arguments when needed should work, no?
> > Yes.  I was responding to Dave about the classic model missing the
> > need for local allocations.  I believe args[] gets it and all other
> > needs covered.
> >
> >
> > > [snip]
> > > > That model also encourages implicit exports/imports.  It does not want the
> > > > guest to know that the centralized allocator might export and the
> > > > renderers might import.  But when the renderers do not have enough
> > > > info to import, or when the renderers want to associate an objid with
> > > > a resid, they rely on the guest to explicitly import again.
> > >
> > > There's many ways to import/export.  I pointed a way using
> > >
> > > VIRTIO_GPU_CMD_CTX_ATTACH_RESOURCE
> > > DMA_BUF_SET_NAME/ DMA_BUF_GET_NAME
> > > MODIFY_INCOMING_IMPORT/MODIFY_INCOMING_EXPORT
> > >
> > > earlier in this thread.  I think we should hold off that discussion
> > > because the proposed model doesn't necessarily require implicit
> > > export/import, though IMO it's not a huge issue.
> > >
> > > [snip]
> > > > I think Gurchetan started from that model and fixed all the issues.
> > > > As a result, the allocation parameters are in a new opaque args[]
> > > > rather than in the existing opaque cmd[].  I like cmd[] more because
> > > >
> > > >  - I use execbuffer in VK renderers to make local allocations and
> > > > import/export resources already, and
> > > >  - I did not start from that model; instead I view execbuffer as the
> > > > way to pass opaque parameters
> > >
> > > It makes sense to continue building on the current model, where the
> > > execbuffer protocol is defined by DRM_IOCTL_VIRTGPU_EXECBUFFER but is
> > > not necessarily the only way to submit opaque arguments.  For
> > > virglrenderer, the code paths for resource creation and command buffer
> > > submission are a bit different, so it's nice to keep it that way going
> > > forward.
> > >
> > > Note: allocating memory via DRM_IOCTL_VIRTGPU_EXECBUFFER and creating
> > > a virtio-gpu resource out of it later is fine.
> > >
> > > [snip]
> > > > It is also possible to treat the non-centralized allocator specially
> > > > and treat it as a global entity.  This allows the userspace to open
> > > > the DRI device once, but talk to its corresponding renderer as well as
> > > > the allocator at the same time.  I am not sure if this is useful to
> > > > the guest.  How to pass the opaque allocation parameters is also to be
> > > > resolved...
> > >
> > > Are you concerned about protocol collision?  That is, one protocol for
> > > EXECBUFFER and one protocol for resource creation?
> > >
> > > I think it's very solvable, using the capabilities.  Let's say capset
> > > 3 (due to the CAPSET_QUERY_FIX) defines the proposed resource creation
> > > protocol.
> > >
> > > capset 3, version 1 defines it for context type 0
> > > capset 3, version 2 defines it for context type 1
> > > capset 3, version 3 defines it for context type 2
> > >
> > > Similarly, the EXECBUFFER for a specific context can be defined by capset
> > > 4 + i.  There are many ways to do it ... the execbuffer method to
> > > allocate can indeed be the resource create method to allocate,
> > > depending on the context type.
> > >
> > > Again, I want to stress the host capabilities define any protocol, not
> > > necessarily the context type.  The context type defines where the
> > > capabilities are fetched.

