[Intel-gfx] [RFC v3 3/3] drm/doc/rfc: VM_BIND uapi definition

Niranjana Vishwanathapura niranjana.vishwanathapura at intel.com
Fri Jun 3 06:53:32 UTC 2022


On Wed, Jun 01, 2022 at 10:08:35PM -0700, Niranjana Vishwanathapura wrote:
>On Wed, Jun 01, 2022 at 11:27:17AM +0200, Daniel Vetter wrote:
>>On Wed, 1 Jun 2022 at 11:03, Dave Airlie <airlied at gmail.com> wrote:
>>>
>>>On Tue, 24 May 2022 at 05:20, Niranjana Vishwanathapura
>>><niranjana.vishwanathapura at intel.com> wrote:
>>>>
>>>> On Thu, May 19, 2022 at 04:07:30PM -0700, Zanoni, Paulo R wrote:
>>>> >On Tue, 2022-05-17 at 11:32 -0700, Niranjana Vishwanathapura wrote:
>>>> >> VM_BIND and related uapi definitions
>>>> >>
>>>> >> v2: Ensure proper kernel-doc formatting with cross references.
>>>> >>     Also add new uapi and documentation as per review comments
>>>> >>     from Daniel.
>>>> >>
>>>> >> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura at intel.com>
>>>> >> ---
>>>> >>  Documentation/gpu/rfc/i915_vm_bind.h | 399 +++++++++++++++++++++++++++
>>>> >>  1 file changed, 399 insertions(+)
>>>> >>  create mode 100644 Documentation/gpu/rfc/i915_vm_bind.h
>>>> >>
>>>> >> diff --git a/Documentation/gpu/rfc/i915_vm_bind.h b/Documentation/gpu/rfc/i915_vm_bind.h
>>>> >> new file mode 100644
>>>> >> index 000000000000..589c0a009107
>>>> >> --- /dev/null
>>>> >> +++ b/Documentation/gpu/rfc/i915_vm_bind.h
>>>> >> @@ -0,0 +1,399 @@
>>>> >> +/* SPDX-License-Identifier: MIT */
>>>> >> +/*
>>>> >> + * Copyright © 2022 Intel Corporation
>>>> >> + */
>>>> >> +
>>>> >> +/**
>>>> >> + * DOC: I915_PARAM_HAS_VM_BIND
>>>> >> + *
>>>> >> + * VM_BIND feature availability.
>>>> >> + * See typedef drm_i915_getparam_t param.
>>>> >> + */
>>>> >> +#define I915_PARAM_HAS_VM_BIND               57
>>>> >> +
>>>> >> +/**
>>>> >> + * DOC: I915_VM_CREATE_FLAGS_USE_VM_BIND
>>>> >> + *
>>>> >> + * Flag to opt-in for VM_BIND mode of binding during VM creation.
>>>> >> + * See struct drm_i915_gem_vm_control flags.
>>>> >> + *
>>>> >> + * A VM in VM_BIND mode will not support the older execbuff mode of binding.
>>>> >> + * In VM_BIND mode, execbuff ioctl will not accept any execlist (ie., the
>>>> >> + * &drm_i915_gem_execbuffer2.buffer_count must be 0).
>>>> >> + * Also, &drm_i915_gem_execbuffer2.batch_start_offset and
>>>> >> + * &drm_i915_gem_execbuffer2.batch_len must be 0.
>>>> >> + * DRM_I915_GEM_EXECBUFFER_EXT_BATCH_ADDRESSES extension must be provided
>>>> >> + * to pass in the batch buffer addresses.
>>>> >> + *
>>>> >> + * Additionally, I915_EXEC_NO_RELOC, I915_EXEC_HANDLE_LUT and
>>>> >> + * I915_EXEC_BATCH_FIRST of &drm_i915_gem_execbuffer2.flags must be 0
>>>> >> + * (not used) in VM_BIND mode. I915_EXEC_USE_EXTENSIONS flag must always be
>>>> >> + * set (See struct drm_i915_gem_execbuffer_ext_batch_addresses).
>>>> >> + * The buffers_ptr, buffer_count, batch_start_offset and batch_len fields
>>>> >> + * of struct drm_i915_gem_execbuffer2 are also not used and must be 0.
>>>> >> + */
>>>> >
>>>> >From that description, it seems we have:
>>>> >
>>>> >struct drm_i915_gem_execbuffer2 {
>>>> >        __u64 buffers_ptr;              -> must be 0 (new)
>>>> >        __u32 buffer_count;             -> must be 0 (new)
>>>> >        __u32 batch_start_offset;       -> must be 0 (new)
>>>> >        __u32 batch_len;                -> must be 0 (new)
>>>> >        __u32 DR1;                      -> must be 0 (old)
>>>> >        __u32 DR4;                      -> must be 0 (old)
>>>> >        __u32 num_cliprects; (fences)   -> must be 0 since using extensions
>>>> >        __u64 cliprects_ptr; (fences, extensions) -> contains an actual pointer!
>>>> >        __u64 flags;                    -> some flags must be 0 (new)
>>>> >        __u64 rsvd1; (context info)     -> repurposed field (old)
>>>> >        __u64 rsvd2;                    -> unused
>>>> >};
>>>> >
>>>> >Based on that, why can't we just get drm_i915_gem_execbuffer3 instead
>>>> >of adding even more complexity to an already abused interface? While
>>>> >the Vulkan-like extension thing is really nice, I don't think what
>>>> >we're doing here is extending the ioctl usage, we're completely
>>>> >changing how the base struct should be interpreted based on how the VM
>>>> >was created (which is an entirely different ioctl).
>>>> >
>>>> >From Rusty Russel's API Design grading, drm_i915_gem_execbuffer2 is
>>>> >already at -6 without these changes. I think after vm_bind we'll need
>>>> >to create a -11 entry just to deal with this ioctl.
>>>> >
>>>>
>>>> The only change here is removing the execlist support for VM_BIND
>>>> mode (other than natual extensions).
>>>> Adding a new execbuffer3 was considered, but I think we need to be careful
>>>> with that as that goes beyond the VM_BIND support, including any future
>>>> requirements (as we don't want an execbuffer4 after VM_BIND).
>>>
>>>Why not? it's not like adding extensions here is really that different
>>>than adding new ioctls.
>>>
>>>I definitely think this deserves an execbuffer3 without even
>>>considering future requirements. Just  to burn down the old
>>>requirements and pointless fields.
>>>
>>>Make execbuffer3 be vm bind only, no relocs, no legacy bits, leave the
>>>older sw on execbuf2 for ever.
>>
>>I guess another point in favour of execbuf3 would be that it's less
>>midlayer. If we share the entry point then there's quite a few vfuncs
>>needed to cleanly split out the vm_bind paths from the legacy
>>reloc/softping paths.
>>
>>If we invert this and do execbuf3, then there's the existing ioctl
>>vfunc, and then we share code (where it even makes sense, probably
>>request setup/submit need to be shared, anything else is probably
>>cleaner to just copypaste) with the usual helper approach.
>>
>>Also that would guarantee that really none of the old concepts like
>>i915_active on the vma or vma open counts and all that stuff leaks
>>into the new vm_bind execbuf.
>>
>>Finally I also think that copypasting would make backporting easier,
>>or at least more flexible, since it should make it easier to have the
>>upstream vm_bind co-exist with all the other things we have. Without
>>huge amounts of conflicts (or at least much less) that pushing a pile
>>of vfuncs into the existing code would cause.
>>
>>So maybe we should do this?
>
>Thanks Dave, Daniel.
>There are a few things that will be common between execbuf2 and
>execbuf3, like request setup/submit (as you said), fence handling 
>(timeline fences, fence array, composite fences), engine selection,
>etc. Also, many of the 'flags' will be there in execbuf3 also (but
>bit position will differ).
>But I guess these should be fine as the suggestion here is to
>copy-paste the execbuff code and having a shared code where possible.
>Besides, we can stop supporting some older feature in execbuff3
>(like fence array in favor of newer timeline fences), which will
>further reduce common code.
>
>Ok, I will update this series by adding execbuf3 and send out soon.
>

Does this sound reasonable?

struct drm_i915_gem_execbuffer3 {
        __u32 ctx_id;		/* previously execbuffer2.rsvd1 */

        __u32 batch_count;
        __u64 batch_addr_ptr;	/* Pointer to an array of batch gpu virtual addresses */

        __u64 flags;
#define I915_EXEC3_RING_MASK              (0x3f)
#define I915_EXEC3_DEFAULT                (0<<0)
#define I915_EXEC3_RENDER                 (1<<0)
#define I915_EXEC3_BSD                    (2<<0)
#define I915_EXEC3_BLT                    (3<<0)
#define I915_EXEC3_VEBOX                  (4<<0)

#define I915_EXEC3_SECURE               (1<<6)
#define I915_EXEC3_IS_PINNED            (1<<7)

#define I915_EXEC3_BSD_SHIFT     (8)
#define I915_EXEC3_BSD_MASK      (3 << I915_EXEC3_BSD_SHIFT)
#define I915_EXEC3_BSD_DEFAULT   (0 << I915_EXEC3_BSD_SHIFT)
#define I915_EXEC3_BSD_RING1     (1 << I915_EXEC3_BSD_SHIFT)
#define I915_EXEC3_BSD_RING2     (2 << I915_EXEC3_BSD_SHIFT)

#define I915_EXEC3_FENCE_IN             (1<<10)
#define I915_EXEC3_FENCE_OUT            (1<<11)
#define I915_EXEC3_FENCE_SUBMIT         (1<<12)

        __u64 in_out_fence;		/* previously execbuffer2.rsvd2 */

        __u64 extensions;		/* currently only for DRM_I915_GEM_EXECBUFFER_EXT_TIMELINE_FENCES */
};

With this, user can pass in batch addresses and count directly,
instead of as an extension (as this rfc series was proposing).

I have removed many of the flags which were either legacy or not
applicable to BM_BIND mode.
I have also removed fence array support (execbuffer2.cliprects_ptr)
as we have timeline fence array support. Is that fine?
Do we still need FENCE_IN/FENCE_OUT/FENCE_SUBMIT support?

Any thing else needs to be added or removed?

Niranjana

>Niranjana
>
>>-Daniel
>>-- 
>>Daniel Vetter
>>Software Engineer, Intel Corporation
>>http://blog.ffwll.ch


More information about the Intel-gfx mailing list