[Intel-gfx] [RFC PATCH 00/20] Initial Xe driver submission
Rodrigo Vivi
rodrigo.vivi at intel.com
Wed Mar 1 23:00:02 UTC 2023
On Fri, Feb 17, 2023 at 09:51:37PM +0100, Daniel Vetter wrote:
> Hi all,
>
> [I thought I've sent this out earlier this week, but alas got stuck, kinda
> bad timing now since I'm out next week but oh well]
>
> So xe is a quite substantial thing, and I think we need a clear plan how to land
> this or it will take forever, and managers will panic. Also I'm not a big fan of
> "Dave/me reviews everything", we defacto had that for amd's dc/dal and it was
> not fun. The idea here is how to get everything reviewed without having two
> people end up somewhat arbitrary as deciders.
Thank you so much for taking time to write it down. We need to get alignment
on the critical topics to see how we can move this forward.
>
> I've compiled a bunch of topics on what I think the important areas are, first
> code that should be consistent about new-style render drivers that are aimed for
> vk/compute userspace as the primary feature driver:
>
> - figure out consensus solution for fw scheduler and drm/sched frontend among
> interested driver parties (probably xe, amdgpu, nouveau, new panfrost)
Yeap. We do need to figure this out. But just to ensure that we are in the same
page here. What I had in mind was that Matt would upstream the 5 or 6 drm_sched
related patches that we have underneath Xe patches on drm-misc with addressing
the community feedback, then we would merge Xe with the current schedule solution
(or modifications based on the modifications of these mentioned patches) and
then we would continue to work with the other drivers to improve the drm sched
frontend while we are already in tree. Possible? or do you want to see
fundamental changes before we can even consider to get in? Like the ones below?
>
> - for the interface itself it might be good to have the drm_gpu_scheduler as the
> single per-hw-engine driver api object (but internally a new structure), while
> renaming the current drm_gpu_scheduler to drm_gpu_sched_internal. That way I
> think we can address the main critique of the current xe scheduler plan
> - keep the drm_gpu_sched_internal : drm_sched_entity 1:1 relationship for fw
> scheduler
> - keep the driver api relationship of drm_gpu_scheduler : drm_sched_entity
> 1:n, the api functions simply iterate over a mutex protect list of internal
> schedulers. this should also help drivers with locking mistakes around
> setup/teardown and gpu reset.
> - drivers select with a flag or something between the current mode (where the
> drm_gpu_sched_internal is attached to the drm_gpu_scheduler api object) or
> the new fw scheduler mode (where drm_gpu_sched_internal is attached to the
> drm_sched_entity)
> - overall still no fundamental changes (like the current patches) to drm/sched
> data structures and algorithms. But unlike the current patches we keep the
> possibility open for eventual refactoring without having to again refactor
> all the drivers. Even better, we can delay such refactoring until we have a
> handful of real-word drivers test-driving this all so we know we actually do
> the right thing. This should allow us to address all the
> fairness/efficiency/whatever concerns that have been floating around without
> having to fix them all up upfront, before we actually know what needs to be
> fixed.
do you believe this has to be decided and moved towards one of this before we
get merged?
>
> - the generic scheduler code should also including the handling of endless
> compute contexts, with the minimal scaffolding for preempt-ctx fences
> (probably on the drm_sched_entity) and making sure drm/sched can cope with the
> lack of job completion fence. This is very minimal amounts of code, but it
> helps a lot for cross-driver review if this works the same (with the same
> locking and all that) for everyone. Ideally this gets extracted from amdkfd,
> but as long as it's going to be used by all drivers supporting
> endless/compute context going forward it's good enough.
On this one I'm a bit clueless to be honest. I thought the biggest problem with
the long running context or even endless were due to the hangcheck premption or
migrations that would end in some pagefaults.
But yeap, it looks that there are opens to get these kind of workloads properly
supported. But with this in mind do you see any real blocker on Xe? or any must-have
thing?
>
> - I'm assuming this also means Matt Brost will include a patch to add himself as
> drm/sched reviewer in MAINTAINERS, or at least something like that
+1 on this idea!
This enforces our engagement and commitment with the drm_sched imho.
>
> - adopt the gem_exec/vma helpers. again we probably want consensus here among
> the same driver projects. I don't care whether these helpers specify the ioctl
> structs or not, but they absolutely need to enforce the overall locking scheme
> for all major structs and list (so vm and vma).
On this front I thought we would need to align on a common drm_vm_bind based on
the common parts of xe vm_bind and nouveau one. And also some engagement that
I thought it would be easier after we are integrated and part of the drm-next.
Do we need to do this earlier? Could you please open it a bit more on what
exactly you want to see before we can be considered to get merged or after?
>
> - we also should have cross-driver consensus on async vm_bind support. I think
> everyone added in-syncobj support, the real fun is probably more in/out
> userspace memory fences (and personally I'm still not sure that's a good idea
> but ... *eh*). I think cross driver consensus on how this should work (ideally
> with helper support so people don't get it wrong in all the possible ways)
> would be best.
Should the consensus API come first? should this block the nouveau implementation
and move us all towards the drm_vm_bind? or can we sync in-tree?
>
> - this also means some userptr integration and some consensus how userptr should
> work for vm_bind across drivers. I don't think allowing drivers to reinvent
> that wheel is a bright idea, there's just a bit too much to get wrong here.
ack. but kind of same question. is it a blocker to align before? or easier to
align in tree?
>
> - for some of these the consensus might land on more/less shared code than what
> I sketched out above, the important part really is that we have consensus on
> these. Kinda similar to how the atomic kms infrastructure move a _lot_ more of
> the code back into drivers, because they really just needed the flexibility to
> program the hw correctly. Right now we definitely don't have enough shared
> code, for sure with i915-gem, but we also need to make sure we're not
> overcorrecting too badly (a bit of overcorrecting generally doesn't hurt).
+1 on this. We need to work more in the drm layers like display has done successfully!
>
> All the above will make sure that the driver overall is in concepts and design
> aligned with the overall community direction, but I think it'd still be good if
> someone outside of the intel gpu group reviews the driver code itself. Last time
> we had a huge driver submission (amd's DC/DAL) this fell on Dave&me, but this
> time around I think we have a perfect candidate with Oded:
>
> - Oded needs/wants to spend some time on ramping up on how drm render drivers
> work anyway, and xe is probably the best example of a driver that's both
> supposed to be full-featured, but also doesn't contain an entire display
> driver on the side.
>
> - Oded is in Habana, which is legally part of Intel. Bean counter budget
> shuffling to make this happen should be possible.
>
> - Habana is still fairly distinct entity within Intel, so that is probably the
> best approach for some independent review, without making the xe team
> beholden to some non-Intel people.
+1 on this entire idea here as well.
>
> The above should yield some pretty clear road towards landing xe, without any
> big review fights with Dave/me like we had with amd's DC/DAL, which took a
> rather long time to land unfortunately :-(
As I wrote already, I really agree with you that we should work more with the
drm and more with the other drivers. But for the logistics of the work and
the rebase pains and to avoid a situation where we have a totally divergent
driver, I believe the fastest way is to solve any blockers and big issues
before, then merge, then work towards more collaboration on the next step.
Specially when with Xe we are not planning to remove the force_probe
flag for a while, what puts us in a "staging" situation.
We could even make use of the CONFIG_STAGING if needed.
Thoughts?
And most than that, any already know big blockers?
>
> These are just my thoughts, let the bikeshed commence!
:)
>
> Ideally we put them all into a TODO like we've done for DC/DAL, once we have
> some consensus.
I like the TODO list idea.
And also we need to use more the RFC doc section as well, like
i915-vmbind used.
On the TODO part, where do you recommend to add in the doc?
Again, thank you so much,
Rodrigo.
>
> Cheers, Daniel
>
> On Thu, Dec 22, 2022 at 02:21:07PM -0800, Matthew Brost wrote:
> > Hello,
> >
> > This is a submission for Xe, a new driver for Intel GPUs that supports both
> > integrated and discrete platforms starting with Tiger Lake (first platform with
> > Intel Xe Architecture). The intention of this new driver is to have a fresh base
> > to work from that is unencumbered by older platforms, whilst also taking the
> > opportunity to rearchitect our driver to increase sharing across the drm
> > subsystem, both leveraging and allowing us to contribute more towards other
> > shared components like TTM and drm/scheduler. The memory model is based on VM
> > bind which is similar to the i915 implementation. Likewise the execbuf
> > implementation for Xe is very similar to execbuf3 in the i915 [1].
> >
> > The code is at a stage where it is already functional and has experimental
> > support for multiple platforms starting from Tiger Lake, with initial support
> > implemented in Mesa (for Iris and Anv, our OpenGL and Vulkan drivers), as well
> > as in NEO (for OpenCL and Level0). A Mesa MR has been posted [2] and NEO
> > implementation will be released publicly early next year. We also have a suite
> > of IGTs for XE that will appear on the IGT list shortly.
> >
> > It has been built with the assumption of supporting multiple architectures from
> > the get-go, right now with tests running both on X86 and ARM hosts. And we
> > intend to continue working on it and improving on it as part of the kernel
> > community upstream.
> >
> > The new Xe driver leverages a lot from i915 and work on i915 continues as we
> > ready Xe for production throughout 2023.
> >
> > As for display, the intent is to share the display code with the i915 driver so
> > that there is maximum reuse there. Currently this is being done by compiling the
> > display code twice, but alternatives to that are under consideration and we want
> > to have more discussion on what the best final solution will look like over the
> > next few months. Right now, work is ongoing in refactoring the display codebase
> > to remove as much as possible any unnecessary dependencies on i915 specific data
> > structures there..
> >
> > We currently have 2 submission backends, execlists and GuC. The execlist is
> > meant mostly for testing and is not fully functional while GuC backend is fully
> > functional. As with the i915 and GuC submission, in Xe the GuC firmware is
> > required and should be placed in /lib/firmware/xe.
> >
> > The GuC firmware can be found in the below location:
> > https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/i915
> >
> > The easiest way to setup firmware is:
> > cp -r /lib/firmware/i915 /lib/firmware/xe
> >
> > The code has been organized such that we have all patches that touch areas
> > outside of drm/xe first for review, and then the actual new driver in a separate
> > commit. The code which is outside of drm/xe is included in this RFC while
> > drm/xe is not due to the size of the commit. The drm/xe is code is available in
> > a public repo listed below.
> >
> > Xe driver commit:
> > https://cgit.freedesktop.org/drm/drm-xe/commit/?h=drm-xe-next&id=9cb016ebbb6a275f57b1cb512b95d5a842391ad7
> >
> > Xe kernel repo:
> > https://cgit.freedesktop.org/drm/drm-xe/
> >
> > There's a lot of work still to happen on Xe but we're very excited about it and
> > wanted to share it early and welcome feedback and discussion.
> >
> > Cheers,
> > Matthew Brost
> >
> > [1] https://patchwork.freedesktop.org/series/105879/
> > [2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20418
> >
> > Maarten Lankhorst (12):
> > drm/amd: Convert amdgpu to use suballocation helper.
> > drm/radeon: Use the drm suballocation manager implementation.
> > drm/i915: Remove gem and overlay frontbuffer tracking
> > drm/i915/display: Neuter frontbuffer tracking harder
> > drm/i915/display: Add more macros to remove all direct calls to uncore
> > drm/i915/display: Remove all uncore mmio accesses in favor of intel_de
> > drm/i915: Rename find_section to find_bdb_section
> > drm/i915/regs: Set DISPLAY_MMIO_BASE to 0 for xe
> > drm/i915/display: Fix a use-after-free when intel_edp_init_connector
> > fails
> > drm/i915/display: Remaining changes to make xe compile
> > sound/hda: Allow XE as i915 replacement for sound
> > mei/hdcp: Also enable for XE
> >
> > Matthew Brost (5):
> > drm/sched: Convert drm scheduler to use a work queue rather than
> > kthread
> > drm/sched: Add generic scheduler message interface
> > drm/sched: Start run wq before TDR in drm_sched_start
> > drm/sched: Submit job before starting TDR
> > drm/sched: Add helper to set TDR timeout
> >
> > Thomas Hellström (3):
> > drm/suballoc: Introduce a generic suballocation manager
> > drm: Add a gpu page-table walker helper
> > drm/ttm: Don't print error message if eviction was interrupted
> >
> > drivers/gpu/drm/Kconfig | 5 +
> > drivers/gpu/drm/Makefile | 4 +
> > drivers/gpu/drm/amd/amdgpu/Kconfig | 1 +
> > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 26 +-
> > drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 14 +-
> > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 +-
> > drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c | 5 +-
> > drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 23 +-
> > drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 3 +-
> > drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c | 320 +-----------------
> > drivers/gpu/drm/drm_pt_walk.c | 159 +++++++++
> > drivers/gpu/drm/drm_suballoc.c | 301 ++++++++++++++++
> > drivers/gpu/drm/i915/Makefile | 2 +-
> > drivers/gpu/drm/i915/display/hsw_ips.c | 7 +-
> > drivers/gpu/drm/i915/display/i9xx_plane.c | 1 +
> > drivers/gpu/drm/i915/display/intel_atomic.c | 2 +
> > .../gpu/drm/i915/display/intel_atomic_plane.c | 25 +-
> > .../gpu/drm/i915/display/intel_backlight.c | 2 +-
> > drivers/gpu/drm/i915/display/intel_bios.c | 71 ++--
> > drivers/gpu/drm/i915/display/intel_bw.c | 36 +-
> > drivers/gpu/drm/i915/display/intel_cdclk.c | 68 ++--
> > drivers/gpu/drm/i915/display/intel_color.c | 1 +
> > drivers/gpu/drm/i915/display/intel_crtc.c | 14 +-
> > drivers/gpu/drm/i915/display/intel_cursor.c | 14 +-
> > drivers/gpu/drm/i915/display/intel_de.h | 38 +++
> > drivers/gpu/drm/i915/display/intel_display.c | 155 +++++++--
> > drivers/gpu/drm/i915/display/intel_display.h | 9 +-
> > .../gpu/drm/i915/display/intel_display_core.h | 5 +-
> > .../drm/i915/display/intel_display_debugfs.c | 8 +
> > .../drm/i915/display/intel_display_power.c | 40 ++-
> > .../drm/i915/display/intel_display_power.h | 6 +
> > .../i915/display/intel_display_power_map.c | 7 +
> > .../i915/display/intel_display_power_well.c | 24 +-
> > .../drm/i915/display/intel_display_reg_defs.h | 4 +
> > .../drm/i915/display/intel_display_trace.h | 6 +
> > .../drm/i915/display/intel_display_types.h | 32 +-
> > drivers/gpu/drm/i915/display/intel_dmc.c | 17 +-
> > drivers/gpu/drm/i915/display/intel_dp.c | 11 +-
> > drivers/gpu/drm/i915/display/intel_dp_aux.c | 6 +
> > drivers/gpu/drm/i915/display/intel_dpio_phy.c | 9 +-
> > drivers/gpu/drm/i915/display/intel_dpio_phy.h | 15 +
> > drivers/gpu/drm/i915/display/intel_dpll.c | 8 +-
> > drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 4 +
> > drivers/gpu/drm/i915/display/intel_drrs.c | 1 +
> > drivers/gpu/drm/i915/display/intel_dsb.c | 124 +++++--
> > drivers/gpu/drm/i915/display/intel_dsi_vbt.c | 26 +-
> > drivers/gpu/drm/i915/display/intel_fb.c | 108 ++++--
> > drivers/gpu/drm/i915/display/intel_fb_pin.c | 6 -
> > drivers/gpu/drm/i915/display/intel_fbc.c | 49 ++-
> > drivers/gpu/drm/i915/display/intel_fbdev.c | 108 +++++-
> > .../gpu/drm/i915/display/intel_frontbuffer.c | 103 +-----
> > .../gpu/drm/i915/display/intel_frontbuffer.h | 67 +---
> > drivers/gpu/drm/i915/display/intel_gmbus.c | 2 +-
> > drivers/gpu/drm/i915/display/intel_hdcp.c | 9 +-
> > drivers/gpu/drm/i915/display/intel_hdmi.c | 1 -
> > .../gpu/drm/i915/display/intel_lpe_audio.h | 8 +
> > .../drm/i915/display/intel_modeset_setup.c | 11 +-
> > drivers/gpu/drm/i915/display/intel_opregion.c | 2 +-
> > drivers/gpu/drm/i915/display/intel_overlay.c | 14 -
> > .../gpu/drm/i915/display/intel_pch_display.h | 16 +
> > .../gpu/drm/i915/display/intel_pch_refclk.h | 8 +
> > drivers/gpu/drm/i915/display/intel_pipe_crc.c | 1 +
> > .../drm/i915/display/intel_plane_initial.c | 3 +-
> > drivers/gpu/drm/i915/display/intel_psr.c | 1 +
> > drivers/gpu/drm/i915/display/intel_sprite.c | 21 ++
> > drivers/gpu/drm/i915/display/intel_vbt_defs.h | 2 +-
> > drivers/gpu/drm/i915/display/intel_vga.c | 5 +
> > drivers/gpu/drm/i915/display/skl_scaler.c | 2 +
> > .../drm/i915/display/skl_universal_plane.c | 52 ++-
> > drivers/gpu/drm/i915/display/skl_watermark.c | 25 +-
> > drivers/gpu/drm/i915/gem/i915_gem_clflush.c | 4 -
> > drivers/gpu/drm/i915/gem/i915_gem_domain.c | 7 -
> > .../gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 -
> > drivers/gpu/drm/i915/gem/i915_gem_object.c | 25 --
> > drivers/gpu/drm/i915/gem/i915_gem_object.h | 22 --
> > drivers/gpu/drm/i915/gem/i915_gem_phys.c | 4 -
> > drivers/gpu/drm/i915/gt/intel_gt_regs.h | 3 +-
> > drivers/gpu/drm/i915/i915_driver.c | 1 +
> > drivers/gpu/drm/i915/i915_gem.c | 8 -
> > drivers/gpu/drm/i915/i915_gem_gtt.c | 1 -
> > drivers/gpu/drm/i915/i915_reg_defs.h | 8 +
> > drivers/gpu/drm/i915/i915_vma.c | 12 -
> > drivers/gpu/drm/radeon/radeon.h | 55 +--
> > drivers/gpu/drm/radeon/radeon_ib.c | 12 +-
> > drivers/gpu/drm/radeon/radeon_object.h | 25 +-
> > drivers/gpu/drm/radeon/radeon_sa.c | 314 ++---------------
> > drivers/gpu/drm/radeon/radeon_semaphore.c | 6 +-
> > drivers/gpu/drm/scheduler/sched_main.c | 182 +++++++---
> > drivers/gpu/drm/ttm/ttm_bo.c | 3 +-
> > drivers/misc/mei/hdcp/Kconfig | 2 +-
> > drivers/misc/mei/hdcp/mei_hdcp.c | 3 +-
> > include/drm/drm_pt_walk.h | 161 +++++++++
> > include/drm/drm_suballoc.h | 112 ++++++
> > include/drm/gpu_scheduler.h | 41 ++-
> > sound/hda/hdac_i915.c | 17 +-
> > sound/pci/hda/hda_intel.c | 56 +--
> > sound/soc/intel/avs/core.c | 13 +-
> > sound/soc/sof/intel/hda.c | 7 +-
> > 98 files changed, 2076 insertions(+), 1325 deletions(-)
> > create mode 100644 drivers/gpu/drm/drm_pt_walk.c
> > create mode 100644 drivers/gpu/drm/drm_suballoc.c
> > create mode 100644 include/drm/drm_pt_walk.h
> > create mode 100644 include/drm/drm_suballoc.h
> >
> > --
> > 2.37.3
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
More information about the dri-devel
mailing list