[PATCH v7 00/20] drm/i915/vm_bind: Add VM_BIND functionality

Zanoni, Paulo R paulo.r.zanoni at intel.com
Fri Nov 18 23:53:34 UTC 2022


On Sat, 2022-11-12 at 23:57 -0800, Niranjana Vishwanathapura wrote:
> DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM
> buffer objects (BOs) or sections of a BOs at specified GPU virtual
> addresses on a specified address space (VM). Multiple mappings can map
> to the same physical pages of an object (aliasing). These mappings (also
> referred to as persistent mappings) will be persistent across multiple
> GPU submissions (execbuf calls) issued by the UMD, without user having
> to provide a list of all required mappings during each submission (as
> required by older execbuf mode).
> 
> This patch series support VM_BIND version 1, as described by the param
> I915_PARAM_VM_BIND_VERSION.
> 
> Add new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in
> vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl.
> The new execbuf3 ioctl will not have any execlist support and all the
> legacy support like relocations etc., are removed.
> 
> NOTEs:
> * It is based on below VM_BIND design+uapi rfc.
>   Documentation/gpu/rfc/i915_vm_bind.rst

Just as a FYI to everyone, there is a Mesa Iris implementation that
makes use of this series here:

https://gitlab.freedesktop.org/pzanoni/mesa/-/commits/upstream-vmbind/

Some notes on it:

- Tested on TGL and Alchemist (aka DG2).

- The code still has a lot of TODOs and some FIXMEs.

- It was somewhat tested with the common Linux benchmarks (Dota 2,
Manhattan, Xonotic, etc.) and survived most of what I threw at it. The
only problems I saw so far are:

  - Sometimes you get a random GPU hang with Dota 2, but it seems to go
away if you use INTEL_DEBUG=nofc . I'm investigating it right now.

  - Very very rarely DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD returns EINVAL and
we don't properly recover. I am still trying to understand this one,
it's once in a blue moon that it happens.

- Performance seems to be mostly equivalent to non-vm-bind, but I
haven't spent a lot of time really gathering numbers since I'm mostly
debugging things. Using VM_PRIVATE BOs is the key here, you cut Dota's
performance by almost half if you don't use private BOs. Considering
how ABYSMALLY slower things get, I would assume there's probably
something the Kernel could do here to handle things a little faster.
The 'perf' tool shows a lot of i915 fence-related functions wasting CPU
time when we don't use private BOs.

- I am not really sure whether the implicit tracking code actually
works or not. It doesn't really seem to make much of a difference if I
remove it entirely, but I'm still planning to spend more time analyzing
this. If anybody knows of a workload that will absolutely fail if we
get this wrong, please tell me.

- There's the whole frontbuffer tracking discussion from v6 that's
still pending resolution.

- The Vulkan implementation will come later. I wanted to sort all the
GL details later so I don't make the same mistakes twice.

- I really dislike how we have to handle the lack of implicit tracking
support. The code is excessive and racy. See export_bo_sync_state(),
import_bo_sync_state() and their caller from
src/gallium/drivers/iris/iris_batch.c. Suggestions for Mesa
improvements are welcome, but I would really really really prefer to
have a way to just tell execbuffer3 to handle implicit tracking for
these buffers for me in an atomic way.

- I kinda wish execbuffer3 still accepted a pointer to struct
drm_i915_gem_exec_fence in addition to struct
drm_i915_gem_timeline_fence, since we already have to keep a list of
exec_fences for execbuf2, and then in the execbuf3 we convert them to
the new format. We could also do the opposite and leave execbuf2 with
the slower path. But I could live without this, no problem.

- Credits to Ken, Jason, Lionel, Niranjana, Nanley, Daniel and
everybody else who helped me sort things out here.


Is this ready to be merged to the Kernel? Maybe, but I'd like us to
sort these things out first:

1. Get conclusion regarding the frontbuffer tracking issue first.

2. Get some validation from more experienced people (*winks at Jason*)
that our approach with implicit tracking is correct here. Or convince
Niranjana to add a way to pass buffers for implicit tracking so the
Kernel can atomically inside execbuf3 what we're trying to do with 8
ioctls.

3. Fix all the Mesa bugs so we're 100% sure they're not design flaws of
the Kernel.

But that's just my humble opinion.

Thanks,
Paulo

> 
> * The IGT RFC series is posted as,
>   [PATCH i-g-t v7 0/13] vm_bind: Add VM_BIND validation support
> 
> v2: Address various review comments
> v3: Address review comments and other fixes
> v4: Remove vm_unbind out fence uapi which is not supported yet,
>     replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()
> v5: Render kernel-doc, use PIN_NOEVICT, limit vm_bind support to
>     non-recoverable faults
> v6: Rebased, minor fixes, add reserved fields to drm_i915_gem_vm_bind,
>     add new patch for async vm_unbind support
> v7: Rebased, minor cleanups as per review feedback
> 
> Test-with: 20221113075309.32023-1-niranjana.vishwanathapura at intel.com
> 
> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura at intel.com>
> 
> Niranjana Vishwanathapura (20):
>   drm/i915/vm_bind: Expose vm lookup function
>   drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
>   drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
>   drm/i915/vm_bind: Add support to create persistent vma
>   drm/i915/vm_bind: Implement bind and unbind of object
>   drm/i915/vm_bind: Support for VM private BOs
>   drm/i915/vm_bind: Add support to handle object evictions
>   drm/i915/vm_bind: Support persistent vma activeness tracking
>   drm/i915/vm_bind: Add out fence support
>   drm/i915/vm_bind: Abstract out common execbuf functions
>   drm/i915/vm_bind: Use common execbuf functions in execbuf path
>   drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
>   drm/i915/vm_bind: Update i915_vma_verify_bind_complete()
>   drm/i915/vm_bind: Expose i915_request_await_bind()
>   drm/i915/vm_bind: Handle persistent vmas in execbuf3
>   drm/i915/vm_bind: userptr dma-resv changes
>   drm/i915/vm_bind: Limit vm_bind mode to non-recoverable contexts
>   drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode
>   drm/i915/vm_bind: Render VM_BIND documentation
>   drm/i915/vm_bind: Async vm_unbind support
> 
>  Documentation/gpu/i915.rst                    |  78 +-
>  drivers/gpu/drm/i915/Makefile                 |   3 +
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  43 +-
>  drivers/gpu/drm/i915/gem/i915_gem_context.h   |  17 +
>  drivers/gpu/drm/i915/gem/i915_gem_create.c    |  72 +-
>  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |   6 +
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 522 +----------
>  .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 872 ++++++++++++++++++
>  .../drm/i915/gem/i915_gem_execbuffer_common.c | 671 ++++++++++++++
>  .../drm/i915/gem/i915_gem_execbuffer_common.h |  76 ++
>  drivers/gpu/drm/i915/gem/i915_gem_ioctls.h    |   2 +
>  drivers/gpu/drm/i915/gem/i915_gem_object.c    |   3 +
>  drivers/gpu/drm/i915/gem/i915_gem_object.h    |   2 +
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   6 +
>  drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  19 +
>  drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  30 +
>  .../drm/i915/gem/i915_gem_vm_bind_object.c    | 449 +++++++++
>  drivers/gpu/drm/i915/gt/intel_gtt.c           |  17 +
>  drivers/gpu/drm/i915/gt/intel_gtt.h           |  21 +
>  drivers/gpu/drm/i915/i915_driver.c            |   4 +
>  drivers/gpu/drm/i915/i915_drv.h               |   2 +
>  drivers/gpu/drm/i915/i915_gem_gtt.c           |  39 +
>  drivers/gpu/drm/i915/i915_gem_gtt.h           |   3 +
>  drivers/gpu/drm/i915/i915_getparam.c          |   3 +
>  drivers/gpu/drm/i915/i915_sw_fence.c          |  28 +-
>  drivers/gpu/drm/i915/i915_sw_fence.h          |  23 +-
>  drivers/gpu/drm/i915/i915_vma.c               | 186 +++-
>  drivers/gpu/drm/i915/i915_vma.h               |  68 +-
>  drivers/gpu/drm/i915/i915_vma_types.h         |  39 +
>  include/uapi/drm/i915_drm.h                   | 274 +++++-
>  30 files changed, 3027 insertions(+), 551 deletions(-)
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer3.c
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.c
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_execbuffer_common.h
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h
>  create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_vm_bind_object.c
> 



More information about the dri-devel mailing list