[Intel-gfx] [RFC PATCH 1/5] drm/doc/rfc: i915 GuC submission / DRM scheduler integration plan

Daniel Vetter daniel at ffwll.ch
Tue May 11 14:34:14 UTC 2021


On Thu, May 06, 2021 at 10:30:45AM -0700, Matthew Brost wrote:
> Add entry for i915 GuC submission / DRM scheduler integration plan.
> Follow up patch with details of new parallel submission uAPI to come.
> 
> Cc: Jon Bloomfield <jon.bloomfield at intel.com>
> Cc: Jason Ekstrand <jason at jlekstrand.net>
> Cc: Dave Airlie <airlied at gmail.com>
> Cc: Daniel Vetter <daniel.vetter at intel.com>
> Cc: Jason Ekstrand <jason at jlekstrand.net>
> Cc: dri-devel at lists.freedesktop.org
> Signed-off-by: Matthew Brost <matthew.brost at intel.com>

Would be good to Cc: some drm/scheduler folks here for the next round:

$ scripts/get_maintainer.pl -f -- drivers/gpu/drm/scheduler/

says we have maybe the following missing:

"Christian König" <christian.koenig at amd.com>
Luben Tuikov <luben.tuikov at amd.com>
Alex Deucher <alexander.deucher at amd.com>
Steven Price <steven.price at arm.com>

Lee Jones did a ton of warning fixes over the entire tree, so doesn't care
about drm/scheduler design directly.

> ---
>  Documentation/gpu/rfc/i915_scheduler.rst | 70 ++++++++++++++++++++++++
>  Documentation/gpu/rfc/index.rst          |  4 ++
>  2 files changed, 74 insertions(+)
>  create mode 100644 Documentation/gpu/rfc/i915_scheduler.rst
> 
> diff --git a/Documentation/gpu/rfc/i915_scheduler.rst b/Documentation/gpu/rfc/i915_scheduler.rst
> new file mode 100644
> index 000000000000..fa6780a11c86
> --- /dev/null
> +++ b/Documentation/gpu/rfc/i915_scheduler.rst
> @@ -0,0 +1,70 @@
> +=========================================
> +I915 GuC Submission/DRM Scheduler Section
> +=========================================
> +
> +Upstream plan
> +=============
> +For upstream the overall plan for landing GuC submission and integrating the
> +i915 with the DRM scheduler is:
> +
> +* Merge basic GuC submission
> +	* Basic submission support for all gen11+ platforms
> +	* Not enabled by default on any current platforms but can be enabled via
> +	  modparam enable_guc
> +	* Lots of rework will need to be done to integrate with DRM scheduler so
> +	  no need to nit pick everything in the code, it just should be
> +	  functional and not regress execlists
> +	* Update IGTs / selftests as needed to work with GuC submission
> +	* Enable CI on supported platforms for a baseline
> +	* Rework / get CI heathly for GuC submission in place as needed
> +* Merge new parallel submission uAPI
> +	* Bonding uAPI completely incompatible with GuC submission

Maybe clarify that this isn't the only issue with the bonding uapi, so
perhaps add "Plus it has severe design issues in general, which is why we
want to retire it no matter what". Or something like that. Not sure we
should go into full details here, maybe as part of the next patch about
parallel submit and all that.

> +	* New uAPI adds I915_CONTEXT_ENGINES_EXT_PARALLEL context setup step
> +	  which configures contexts N wide
> +	* After I915_CONTEXT_ENGINES_EXT_PARALLEL a user can submit N batches to
> +	  a context in a single execbuf IOCTL and the batches run on the GPU in
> +	  paralllel
> +	* Initially only for GuC submission but execlists can be supported if
> +	  needed
> +* Convert the i915 to use the DRM scheduler
> +	* GuC submission backend fully integrated with DRM scheduler
> +		* All request queues removed from backend (e.g. all backpressure
> +		  handled in DRM scheduler)
> +		* Resets / cancels hook in DRM scheduler
> +		* Watchdog hooks into DRM scheduler
> +		* Lots of complexity of the GuC backend can be pulled out once
> +		  integrated with DRM scheduler (e.g. state machine gets
> +		  simplier, locking gets simplier, etc...)
> +	* Execlist backend will do the minimum required to hook in the DRM
> +	  scheduler so it can live next to the fully integrated GuC backend
> +		* Legacy interface
> +		* Features like timeslicing / preemption / virtual engines would
> +		  be difficult to integrate with the DRM scheduler and these
> +		  features are not required for GuC submission as the GuC does
> +		  these things for us
> +		* ROI low on fully integrating into DRM scheduler
> +		* Fully integrating would add lots of complexity to DRM
> +		  scheduler
> +	* Port i915 priority inheritance / boosting feature in DRM scheduler

Maybe a few words on what this does and why we care? Just so drm/scheduler
people know what's coming.

> +	* Remove in-order completion assumptions from DRM scheduler

I think it'd be good to put a few words here why we need this. We want to
use drm scheduler for dependencies, but rely on the hw/fw scheduler (or
well backend for execlist) to handle preemption, round-robin and that kind
of stuff. Hence we want to have all runnable requests in the backend
(excluding backpressure and stuff like that), and they can complete
out-of-order.

Maybe also highlight this one in the commit message to get drm/scheduler
folks' attention on this and the previous one for discussion.

> +	* Pull out i915 priority levels and use DRM priority levels
> +	* Optimize DRM scheduler as needed

Again if we have some items here (one that was discussed was direct
submit/retire for lower latency iirc?) might be good to list this here.

> +
> +New uAPI for basic GuC submission
> +=================================
> +No major changes are required to the uAPI for basic GuC submission. The only
> +change is a new scheduler attribute: I915_SCHEDULER_CAP_STATIC_PRIORITY_MAP.
> +This attribute indicates the 2k i915 user priority levels are statically mapped
> +into 3 levels as follows:
> +
> +* -1k to -1 Low priority
> +* 0 Medium priority
> +* 1 to 1k High priority
> +
> +This is needed because the GuC only has 4 priority bands. The highest priority
> +band is reserved with the kernel. This aligns with the DRM scheduler priority
> +levels too.

Please Cc: mesa and get an ack from Jason Ekstrand or Ken Graunke on this,
just to be sure.

> +
> +New parallel submission uAPI
> +============================
> +Details to come in a following patch.
> diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst
> index a8621f7dab8b..018a8bf317a6 100644
> --- a/Documentation/gpu/rfc/index.rst
> +++ b/Documentation/gpu/rfc/index.rst
> @@ -15,3 +15,7 @@ host such documentation:
>  
>  * Once the code has landed move all the documentation to the right places in
>    the main core, helper or driver sections.
> +
> +.. toctree::
> +
> +    i915_scheduler.rst

Aside from the comments I think this is what we're aiming for wrt rfc
patches, so lgtm overall.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the Intel-gfx mailing list