[virglrenderer-devel] multiprocess model and GL
Gurchetan Singh
gurchetansingh at chromium.org
Sat Feb 8 01:13:37 UTC 2020
On Thu, Feb 6, 2020 at 11:04 PM Chia-I Wu <olvaffe at gmail.com> wrote:
>
> On Thu, Feb 6, 2020 at 7:55 PM Gurchetan Singh
> <gurchetansingh at chromium.org> wrote:
> >
> > On Thu, Feb 6, 2020 at 3:01 PM Chia-I Wu <olvaffe at gmail.com> wrote:
> > >
> > > On Wed, Feb 5, 2020 at 7:52 PM Gurchetan Singh
> > > <gurchetansingh at chromium.org> wrote:
> > > >
> > > > On Tue, Feb 4, 2020 at 8:14 PM Chia-I Wu <olvaffe at gmail.com> wrote:
> > > > >
> > > > > On Tue, Feb 4, 2020 at 7:17 PM Gurchetan Singh
> > > > > <gurchetansingh at chromium.org> wrote:
> > > > > >
> > > > > > On Tue, Feb 4, 2020 at 1:04 PM Chia-I Wu <olvaffe at gmail.com> wrote:
> > > > > > >
> > > > > > > On Mon, Feb 3, 2020 at 7:20 PM Gurchetan Singh
> > > > > > > <gurchetansingh at chromium.org> wrote:
> > > > > > > >
> > > > > > > > On Fri, Jan 31, 2020 at 7:04 PM Chia-I Wu <olvaffe at gmail.com> wrote:
> > > > > > > > > > So I wouldn't call the resource create v2 protocol execbuffer.
> > > > > > > > > I will say virgl_hw.h is a part of virgl_protocol.h. Together,
> > > > > > > > > they define the virgl protocol. If you look at the various _FORMAT
> > > > > > > > > fields in virgl_protocol.h, they refer to the formats defined in
> > > > > > > > > virgl_hw.h.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Disagree here -- virgl_hw.h is the source of truth. It's possible to
> > > > > > > > create a resource without including virgl_protocol.h, and we do it all
> > > > > > > > the time.
> > > > > > > >
> > > > > > > > For example, without VIRGL_BIND_STAGING or
> > > > > > > > virgl_supported_format_mask, the guest can't know which type of
> > > > > > > > resources can be created.
> > > > > > > >
> > > > > > > > Without VIRGL_CAP_COMPUTE_SHADER, the guest can't know that something
> > > > > > > > like VIRGL_SET_SHADER_BUFFER_* is supported. All of these are defined
> > > > > > > > in virgl_hw.h.
> > > > > > >
> > > > > > > How are VIRGL_BIND_STAGING or VIRGL_CAP_COMPUTE_SHADER applied to virv
> > > > > > > resources or execbuffer? Will virgl_hw.h be extended in a way that it
> > > > > > > provides unified caps and resource model that both virgl and virv
> > > > > > > share? Or will virgl_hw.h be extended in a way that virgl and virv
> > > > > > > have very little in common? If two worlds were going to coexist
> > > > > > > in the extended virgl_hw.h, I wouldn't call it the source of
> > > > > > > truth (two sources of two truths maybe?)
> > > > > >
> > > > > > Multiple sources are perfectly valid. The virglrenderer protocol is a
> > > > > > superset of all capset-defined protocols. capset 1 and capset 2
> > > > > > (virgl_hw.h) only apply to virgl (context type 0).
> > > > > >
> > > > > > capset 3 version 1 can apply to context type 0
> > > > > > capset 3 version 2 can apply to context type 1
> > > > > > capset 4 version 1 can apply to context type 1
> > > > > >
> > > > > > Each capability defines its own protocol. We can drop support for
> > > > > > capset 1 and capset 2 once OpenGL is deprecated.
> > > > > It sounds like there are 1 resource creation protocol (virgl_hw.h) and
> > > > > 1 execbuffer protocol (virgl_protocol.h) for virgl support. And you
> > > > > want 2 resource creation protocols and 2 execbuffer protocols in total
> > > > > after we add virv support, which is not what I had in mind when you
> > > > > said extending....
> > > >
> > > > We can structure it in various ways; if you don't want capset 3,
> > > > virvrenderer can define its resource creation protocol in capset 4
> > > > along with the execbuffer protocol (like virgl defines both resource
> > > > creation and execbuffer in capset {1, 2}).
> > > >
> > > > The number of protocols doesn't bother me (since they are cached) --
> > > > is that your main concern?
> > > No, no real concern, just confusion. You brought up that there are a
> > > resource creation protocol and an execbuffer protocol, and you extended
> > > them. I did not expect the end result to be two resource creation
> > > protocols and two execbuffer protocols.
> > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > > I think a more fair way to put this is that some of the virgl protocol
> > > > > > > > > details are leaked by drm_virtgpu_resource_create. IMO, a reasonable
> > > > > > > > > way to plug the leak is to make sure the virgl protocol details are
> > > > > > > > > contained in EXECBUFFER in resource create v2.
> > > > > > > >
> > > > > > > > I would put it this way: both resource creation and the execbuffer
> > > > > > > > protocol are defined by the capabilities, and the virglrenderer
> > > > > > > > protocol is the superset of all capability-defined protocols.
> > > > > > > >
> > > > > > > > Honestly, the existing interface isn't too bad. We just need to extend it ...
> > > > > .... My view of the existing interface is that the virgl execbuffer
> > > > > protocol (virgl_protocol.h) is good, but the resource creation
> > > > > protocol (virgl_hw.h) is virgl-specific. To extend them for virv, I
> > > > > introduce a new command to the resource creation protocol
> > > > >
> > > > > RESOURCE_CREATE_v2(ctx_id, obj_id)
> > > >
> > > > I think we are converging on something like memory-v4 (one ioctl,
> > > > one kick, opaque arguments) and it supports our use cases, so I think
> > > > we should continue iterating on that model rather than going somewhere
> > > > different.
> > > It is the RESOURCE_CREATE_BLOB command in memory-v4, just to be clear.
> > >
> > > >
> > > > The questions in my mind with memory-v4 were:
> > > >
> > > > 1) VIRTGPU_RESOURCE_FLAG_ALLOC_EXECBUFFER vs.
> > > > VIRTGPU_RESOURCE_FLAG_ALLOC_{3D, ARGS} -- since in virgl, execbuffer
> > > > is used for (pipe context) rendering commands and not (pipe
> > > > screen) allocations, I'd tend to support the latter. Do you
> > > > agree/disagree?
> > > I think virgl execbuffer can be extended to support (pipe screen)
> > > allocations. virgl execbuffer is also used for (pipe screen)
> > > context creation, so I don't see it as a layering violation.
> >
> > With alloc_param.virgl, there's no need to extend virgl execbuffer?
>
> Allocation-wise, yes (and it has always been on my branches...). It
> is modeled after v1, which is modeled after gallium. I don't see any
> problem.
>
> The problem is hostmem pixel layout (stride, etc.). Unless the
> userspace can query it from elsewhere (*cough* execbuffer), it can
> only be used for blobs or non-mappable images. But there are no
> mappable images in GL anyway.
There are ARB_buffer_storage and GL_EXT_memory_object_fd.
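For illustration only, a rough sketch of what a guest GL client could
do with those two extensions (assumes the host allocation is exposed
to the guest as an opaque fd of a known size; error handling omitted):

    GLuint mem, buf;

    /* import the external allocation as a GL memory object */
    glCreateMemoryObjectsEXT(1, &mem);
    glImportMemoryFdEXT(mem, size, GL_HANDLE_TYPE_OPAQUE_FD_EXT, fd);

    /* back an immutable buffer (ARB_buffer_storage semantics) with it */
    glCreateBuffers(1, &buf);
    glNamedBufferStorageMemEXT(buf, size, mem, 0 /* offset */);

    /* map it for SW access */
    void *ptr = glMapNamedBufferRange(buf, 0, size, GL_MAP_WRITE_BIT);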
>
> On the other hand, GBM has gbm_bo_map. GBM can stay zero copy until
> an app tries to map a 2D GBM bo... Hmm, maybe it is not that useful
> after all?
Remember, we virtualize both Android and Linux using virglrenderer.
In gralloc, there is GRALLOC_USAGE_SW_WRITE_OFTEN, which means the
buffer doesn't need to be tiled, so zero copy is useful.
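Roughly, a gralloc implementation could key the layout off the usage
flags like this (VIRGL_BIND_LINEAR is a hypothetical stand-in for
whatever "host-mappable, untiled" bind flag we end up with):

    /* SW access implies a linear layout the guest can map zero-copy */
    uint32_t bind = VIRGL_BIND_RENDER_TARGET;
    if (usage & (GRALLOC_USAGE_SW_WRITE_OFTEN | GRALLOC_USAGE_SW_READ_OFTEN))
            bind |= VIRGL_BIND_LINEAR;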
>
> >
> > >
> > > > 2) SHARED_ALLOW vs SHARED_REQUIRE -- SHARED_REQUIRE is the way to go.
> > > > Maybe just HOSTVMM_3D since VMM creates the dma-buf?
> > > > 3) {GET_HOST_BACKING, RELEASE_HOST_BACKING} vs. {ATTACH_HOST_BACKING,
> > > > DETACH_HOST_BACKING} vs. {RESOURCE_MAP, RESOURCE_UNMAP} -- something
> > > > like the first two seems nice, since they mirror what we have for guest
> > > > memory and we want to batch hypercalls. Thoughts?
> > > >
> > > > The questions you had were:
> > > >
> > > > 1) Limiting the 3D create protocol -- I agree, I think it's reasonable
> > > > to limit the number of bytes (of course versioning that limit will be
> > > > nice)
> > > I think RESOURCE_CREATE_BLOB covers all needs I can see. If I were
> > > wrong, VIRTGPU_RESOURCE_FLAG_ALLOC_MASK could be extended to fix my
> > > mistake. I am not too eager to give up all control to userspace by
> > > providing args[] and only args[].
> >
> > Are you concerned about security? The host is supposed to
> > validate all incoming commands, and the DRM_VIRTGPU_EXECBUFFER (which
> > will also allocate) command stream will be much larger and harder to
> > validate.
> >
> > Userspace already has much control, and for v1 resource creation it's
> > already defined by the capabilities.
> Not security. If the kernel believes a minimal RESOURCE_CREATE_BLOB
> combined with the execbuffer ioctl can provide all necessary
In memory-v4, there is no separate execbuffer ioctl (there is a
VIRTIO_GPU_CMD_SUBMIT_3D as a batched hypercall); the command stream
is embedded in the drm_virtgpu_resource_create_blob struct. Do you
agree with the one-ioctl, one-kick model? Or do you favor suppressing
VQ kicks from userspace in order to batch?
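For reference, this is roughly the shape I have in mind for the
one-ioctl, one-kick model (field names are illustrative, not final):

    struct drm_virtgpu_resource_create_blob {
            __u32 flags;      /* VIRTGPU_RESOURCE_FLAG_ALLOC_*, SHARED_*, ... */
            __u32 bo_handle;  /* out */
            __u64 size;
            /* optional embedded command stream, flushed with the same
               VQ kick that creates the resource */
            __u64 cmd;
            __u32 cmd_size;
            __u32 pad;
    };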
> functionality, why give the userspace even more freedom and risk
> painting itself into a corner?
Which additional freedom are you afraid of? Allocation can be done
both via DRM_IOCTL_VIRTGPU_RESOURCE_CREATE_BLOB and
DRM_IOCTL_VIRTGPU_EXECBUFFER, and both interfaces are defined by
the host capabilities.
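To illustrate (the cap bit below is hypothetical), guest userspace
would pick a path based on the same host-advertised caps either way:

    /* both paths are validated by the host against the same
       capset-defined protocol */
    if (caps.blob_alloc_supported)  /* hypothetical cap bit */
            ret = drmIoctl(fd, DRM_IOCTL_VIRTGPU_RESOURCE_CREATE_BLOB, &blob);
    else
            ret = drmIoctl(fd, DRM_IOCTL_VIRTGPU_EXECBUFFER, &eb);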
>
> Speaking for myself, I am uncertain if RESOURCE_CREATE_BLOB+execbuffer
> is enough, and wonder if the kernel should give up and go with args[].
> That's why there is VIRTGPU_RESOURCE_FLAG_ALLOC_MASK in case
> execbuffer is not enough. If you have good use cases that are
> impossible with execbuffer, you should bring them up.
There've been many proposals, so it's hard to determine exactly where
everyone is at a given moment.
Now that we're converging on the memory-v4 model (one ioctl, one kick,
opaque arguments) [I think??], the question is how to name the opaque
arguments and route them to virglrenderer.
To do a host-side allocation, will it be:
(1) DRM_IOCTL_VIRTGPU_RESOURCE_CREATE_BLOB
    --> VIRTIO_GPU_CMD_SUBMIT_3D
    --> virgl_renderer_submit_cmd(void *buffer, int ctx_id, int ndw)
        to do the allocation
    --> VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB
    --> virgl_renderer_resource_create_blob(resid, size, flags,
        object_id, ...) to consume the object id and set up the
        res_id -> struct mapping
or
(2) DRM_IOCTL_VIRTGPU_RESOURCE_CREATE_BLOB
    --> (VIRTIO_GPU_CMD_SUBMIT_3D / VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB
        somehow batched by the VMM)
    --> virgl_renderer_resource_create_blob(resid, size, flags,
        void *resource_create_3d, int ndw, ...) to allocate and set up
        the res_id -> struct mapping
I think you favor (1) because you view driver allocation and resource
creation as separate steps. I favor (2) because object_ids can become an
implementation detail too and we won't have to migrate Mesa to that
flow. Is that an accurate depiction of the current difference?
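For concreteness, here is what (2) could look like from the guest
driver, reusing the illustrative struct from earlier (the ALLOC_3D
flag is hypothetical; the args are the existing virgl_hw.h create
parameters):

    /* the existing gallium allocation path maps onto one ioctl; the
       host consumes the args directly, so no guest-visible object_id */
    struct drm_virtgpu_resource_create_blob blob = {
            .flags    = VIRTGPU_RESOURCE_FLAG_ALLOC_3D,
            .size     = res->size,
            .cmd      = (uintptr_t)create_3d_args,
            .cmd_size = ndw * 4,
    };
    drmIoctl(fd, DRM_IOCTL_VIRTGPU_RESOURCE_CREATE_BLOB, &blob);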
>
>
> > >
> > > On the other hand, I can see that it is tempting to provide args[] and
> > > leave this all to userspace. It is possible that some years later, we
> > > finally see why args[] is the way to go (and we extend
> > > VIRTGPU_RESOURCE_FLAG_ALLOC_MASK). But, IMHO, we are not there yet.
> > >
> > > > 2) How to define protocols -- I think the answer here is the capability sets.
> > > >
> > > > Are there other outstanding questions you have??
> > > >
> > > > > This will be the core resource creation command going forward. It can
> > > > > create resources from context-local objects. Once virgl_protocol.h is
> > > > > extended to support local allocations, it can be used with
> > > > > virgl_protocol.h to replace the v1 resource creation command.
> > > >
> > > > As someone (the only one?) who's looked into the virglrenderer
> > > > implementation, I think it's easier to extend the current model rather
> > > > than requiring object IDs for virgl. It's just the way the code is
> > > > structured, and it's structured that way based on the Gallium model.
> > > > That model is completely valid, with Zink not going away anytime
> > > > soon, and I think virv could require object IDs too.
> > > >
> > > > >
> > > > > The new command is independent of execbuffer protocols. When the virv
> > > > > execbuffer protocol is added, the core resource command can be used
> > > > > with it automatically. There will be 1 resource
> > > > > creation protocol and 2 execbuffer protocols.
> > > > >
> > > > >
> > > > > > > >
> > > > > > > > [snip]
> > > > > > > > > I think that gives a very pure resource model and resource API, but
> > > > > > > > > for no practical benefit other than pureness. Keeping
> > > > > > > > > VIRTGPU_RESOURCE_FLAG_ALLOC_MASK and adding
> > > > > > > > > VIRTGPU_RESOURCE_FLAG_ALLOC_CLASSIC (and adding
> > > > > > > > > drm_virtgpu_resource_create to the union) seems like a good compromise
> > > > > > > > > to me. The guest userspace can use the classic alloc type to allocate
> > > > > > > > > a "classic" host storage. The host can choose to allocate using gbm
> > > > > > > > > or other apis.
> > > > > > > >
> > > > > > > > A pure resource model and API is a good thing! Honestly, I'd prefer
> > > > > > > > we not bikeshed over style too much. We have an implementation that
> > > > > > > > doesn't require something like ALLOC_CLASSIC but solves the same
> > > > > > > > problem, so why not go with that?
> > > > > > >
> > > > > > > It has something like VIRTGPU_RESOURCE_FLAG_ALLOC_ARGS and leaves
> > > > > > > everything to the userspace (to fill out args[]). In fact, the alloc
> > > > > > > type can be encoded in the first byte of args[] and there will be no
> > > > > > > need for VIRTGPU_RESOURCE_FLAG_ALLOC_MASK. I won't miss any
> > > > > > > functionality either because I can, for example, encode
> > > > > > > VIRTGPU_RESOURCE_FLAG_ALLOC_EXECBUFFER in args[] first, encode object
> > > > > > > id next, and then encode the command stream.
> > > > > >
> > > > > > Yeah, if you want to make one protocol (caps 3, v2) reference
> > > > > > another protocol (caps 4), that's possible (if they're both
> > > > > > supported).
> > > > > >
> > > > > > >
> > > > > > > Like I said, I am not entirely against args[] as long as it supports
> > > > > > > my use cases (and it does). What I am not sure about is how wild
> > > > > > > we allow args[] to be. The kernel does not care, so the userspace can
> > > > > > > do whatever it wants?
> > > > > >
> > > > > > Complex rendering commands while creating a virtio-gpu resource are
> > > > > > not desired. It's reasonable to limit the maximum size of the
> > > > > > resource create protocol to something like 64 bytes.
> > > > > >
> > > > > > > That is one extreme. The other extreme is to
> > > > > > > say that the kernel only accepts execbuffer so args[] is actually
> > > > > > > cmds[]+object_id. That may seem limiting, but that could work too
> > > > > > > because you could put whatever would go into args[] into cmds[].
> > > > > > >
> > > > > > > Why not go with args[]? There is already cmds[], with which the
> > > > > > > userspace can do whatever it wants, including allocations. Why add another
> > > > > > > mechanism that is allocation-only?
> > > > > >
> > > > > > Because the new ioctl is related to resource creation. It makes sense
> > > > > > to limit it to that purpose. We have the capabilities -- let's use
> > > > > > them.
> > > > >
> > > > > I see driver allocation and resource creation (RESOURCE_CREATE_v2
> > > > > above) as independent steps.
> > > >
> > > > Reasonable minds differ here, and so does userspace. I think the
> > > > right answer is to support allocation in RESOURCE_CREATE_{BLOB, V2}
> > > > and also via EXECBUFFER.
> > > >
> > > > > The fact that it is possible to do both
> > > > > in one ioctl is a kernel trick to reduce context switches. I also
> > > > > have use cases where I will do execbuffer and resource create v2
> > > > > separately.
> > > > >
> > > > > Caps can be used to decide whether virgl or virv are supported, or
> > > > > whether virgl supports local allocations, or whether the resource
> > > > > creation protocol supports RESOURCE_CREATE_v2 above. What do you mean by not
> > > > > using them?
> > > >
> > > > I want the capsets to version the resource create_{blob, v2} function
> > > > at the host/guest userspace levels.
> > > >
> > > >
> > > > > >
> > > > > > > Separating args[] out from cmds[]
> > > > > > > gives some more flexibility, but I am not sure if the flexibility is
> > > > > > > justified.
> > > > > >
> > > > > > There are many workflows and use cases. Flexibility and automatic
> > > > > > versioning are good to have. Using EXECBUFFER for virgl is not ideal,
> > > > > > and since Mesa only calls it on glFlush() / glFinish(), it
> > > > > > really currently models fenced command buffer submission:
> > > > > >
> > > > > > https://github.com/torvalds/linux/blob/master/include/uapi/drm/i915_drm.h#L939
> > > > > > https://github.com/torvalds/linux/blob/master/include/uapi/drm/i915_drm.h#L1049
> > > > > > https://github.com/torvalds/linux/blob/master/include/uapi/drm/msm_drm.h#L180
> > > > > > https://github.com/torvalds/linux/blob/master/include/uapi/drm/msm_drm.h#L231
> > > > > > https://github.com/torvalds/linux/blob/master/include/uapi/drm/panfrost_drm.h#L49
> > > > > > https://github.com/torvalds/linux/blob/master/include/uapi/drm/vc4_drm.h#L91
> > > > > > ....
> > > > > >
> > > > > > We're trying to emulate GEM_CREATE...
> > > > > GEM_CREATE is emulated with RESOURCE_CREATE_v2 above, which is pretty
> > > > > simple and straightforward. EXECBUFFER is an extra step inserted
> > > > > before GEM_CREATE due to the fact that the host requires vastly
> > > > > different parameters to allocate depending on the driver APIs.
> > > > >
> > > > > For comparison, GEM_CREATE is also emulated with RESOURCE_CREATE v1.
> > > > > It does not need the extra step because it is kind of virgl-specific,
> > > > > with allocation parameters that are known and fixed.
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > > [snip]
> > > > > > > > > The main thing that model misses is that VK renderers want to make
> > > > > > > > > local allocations when they are not to be mapped or shared. If that
> > > > > > > > > model did allow local allocations, would VK renderers use objid to
> > > > > > > > > identify local allocations and use resid to identify resources?
> > > > > > > >
> > > > > > > > Local allocations are fine. Using DRM_IOCTL_VIRTGPU_EXECBUFFER to
> > > > > > > > allocate and then doing DRM_VIRTGPU_RESOURCE_CREATE_BLOB with
> > > > > > > > something like VIRTGPU_STRUCTURE_TYPE_BIND_OBJECT_ID in the variadic
> > > > > > > > arguments when needed should work, no?
> > > > > > > Yes. I was responding to Dave about the classic model missing the
> > > > > > > need for local allocations. I believe args[] covers that and all
> > > > > > > other needs.
> > > > > > >
> > > > > > >
> > > > > > > > [snip]
> > > > > > > > > That model also encourages implicit exports/imports. It does not want the
> > > > > > > > > guest to know that the centralized allocator might export and the
> > > > > > > > > renderers might import. But when the renderers do not have enough
> > > > > > > > > info to import, or when the renderers want to associate an objid with
> > > > > > > > > a resid, they rely on the guest to explicitly import again.
> > > > > > > >
> > > > > > > > There are many ways to import/export. I pointed out a way using
> > > > > > > >
> > > > > > > > VIRTIO_GPU_CMD_CTX_ATTACH_RESOURCE
> > > > > > > > DMA_BUF_SET_NAME/ DMA_BUF_GET_NAME
> > > > > > > > MODIFY_INCOMING_IMPORT/MODIFY_INCOMING_EXPORT
> > > > > > > >
> > > > > > > > earlier in this thread. I think we should hold off that discussion
> > > > > > > > because the proposed model doesn't necessarily require implicit
> > > > > > > > export/import, though IMO it's not a huge issue.
> > > > > > > >
> > > > > > > > [snip]
> > > > > > > > > I think Gurchetan started from that model and fixed all the issues.
> > > > > > > > > As a result, the allocation parameters are in a new opaque args[]
> > > > > > > > > rather than in the existing opaque cmd[]. I like cmd[] more because
> > > > > > > > >
> > > > > > > > > - I use execbuffer in VK renderers to make local allocations and
> > > > > > > > > import/export resources already, and
> > > > > > > > > - I did not start from that model; instead I view execbuffer as the
> > > > > > > > > way to pass opaque parameters
> > > > > > > >
> > > > > > > > It makes sense to continue building on the current model, where the
> > > > > > > > execbuffer protocol is defined by DRM_IOCTL_VIRTGPU_EXECBUFFER and is
> > > > > > > > not necessarily the only way to submit opaque arguments. For
> > > > > > > > virglrenderer, the code paths for resource creation and command buffer
> > > > > > > > submission are a bit different, so it's nice to keep it that way going
> > > > > > > > forward.
> > > > > > > >
> > > > > > > > Note: allocating memory via DRM_IOCTL_VIRTGPU_EXECBUFFER and creating
> > > > > > > > a virtio-gpu resource out of it later is fine.
> > > > > > > >
> > > > > > > > [snip]
> > > > > > > > > It is also possible to treat the non-centralized allocator specially,
> > > > > > > > > as a global entity. This allows the userspace to open
> > > > > > > > > the DRI device once, but talk to its corresponding renderer as well as
> > > > > > > > > the allocator at the same time. I am not sure if this is useful to
> > > > > > > > > the guest. How to pass the opaque allocation parameters is also to be
> > > > > > > > > resolved.
> > > > > > > >
> > > > > > > > Are you concerned about protocol collision? That is, one protocol for
> > > > > > > > EXECBUFFER and one protocol for resource creation?
> > > > > > > >
> > > > > > > > I think it's very solvable, using the capabilities. Let's say capset
> > > > > > > > 3 (due to the CAPSET_QUERY_FIX) defines the proposed resource creation
> > > > > > > > protocol.
> > > > > > > >
> > > > > > > > capset 3, version 1 defines it for context type 0
> > > > > > > > capset 3, version 2 defines it for context type 1
> > > > > > > > capset 3, version 3 defines it for context type 2
> > > > > > > >
> > > > > > > > > Similarly, the EXECBUFFER for a specific context can be defined by capset
> > > > > > > > 4 + i. There are many ways to do it ... the execbuffer method to
> > > > > > > > allocate can indeed be the resource create method to allocate,
> > > > > > > > depending on the context type.
> > > > > > > >
> > > > > > > > > Again, I want to stress that the host capabilities define any protocol, not
> > > > > > > > necessarily the context type. The context type defines where the
> > > > > > > > capabilities are fetched.
> > _______________________________________________
> > virglrenderer-devel mailing list
> > virglrenderer-devel at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/virglrenderer-devel