[Intel-gfx] [PATCH 00/15] Batch submission via GuC

Dave Gordon david.s.gordon at intel.com
Thu Jun 25 00:23:08 PDT 2015


On 17/06/15 13:43, Daniel Vetter wrote:
> On Mon, Jun 15, 2015 at 07:36:18PM +0100, Dave Gordon wrote:
>> This patch series enables command submission via the GuC. In this mode,
>> instead of the host CPU driving the execlist port directly, it hands
>> over work items to the GuC, using a doorbell mechanism to tell the GuC
>> that new items have been added to its work queue. The GuC then dispatches
>> contexts to the various GPU engines, and manages the resulting context-
>> switch interrupts. Completion of a batch is however still signalled to
>> the CPU; the GuC is not involved in handling user interrupts.
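
As a rough illustration of the work-queue-plus-doorbell handoff described
above, here is a minimal single-threaded sketch. It is not code from the
patch series: the structure layout, the queue depth, and the names
work_item, work_queue, wq_submit() and wq_take() are all hypothetical, and
a real implementation would need shared memory, memory barriers and a
hardware doorbell rather than a plain counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical queue depth; a power of two so that index arithmetic
 * stays consistent when the 32-bit counters wrap. */
#define WQ_SLOTS 8u

/* Hypothetical work item: which context to run on which engine. */
struct work_item {
	uint32_t context_id;
	uint32_t engine;
};

struct work_queue {
	struct work_item slots[WQ_SLOTS];
	uint32_t head;     /* next slot the consumer will read */
	uint32_t tail;     /* next slot the producer will write */
	uint32_t doorbell; /* bumped by the producer after each append */
};

/* Host side: append one item, then "ring the doorbell" so the consumer
 * knows the queue has changed. Returns false if the queue is full. */
static bool wq_submit(struct work_queue *wq, struct work_item item)
{
	if (wq->tail - wq->head == WQ_SLOTS)
		return false;
	wq->slots[wq->tail % WQ_SLOTS] = item;
	wq->tail++;
	wq->doorbell++;
	return true;
}

/* Consumer side (standing in for the GuC): take the oldest pending
 * item, if there is one. Returns false if the queue is empty. */
static bool wq_take(struct work_queue *wq, struct work_item *out)
{
	if (wq->head == wq->tail)
		return false;
	*out = wq->slots[wq->head % WQ_SLOTS];
	wq->head++;
	return true;
}
```

The point of the split is that the host only ever writes the tail and the
doorbell, while the consumer only ever advances the head, so neither side
needs to drive the execlist port on behalf of the other.
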
>>
>> There are three subsequences within the patch series:
>>
>>   drm/i915: Add i915_gem_object_write() to i915_gem.c
>>   drm/i915: Embedded microcontroller (uC) firmware loading support
>>
>> These first two patches provide a generic framework for fetching the
>> firmware that may be required by any embedded microcontroller from a
>> file, using an asynchronous thread so that driver initialisation can
>> continue while the firmware is being fetched. It is hoped that this
>> framework is sufficiently general that it can be used for all current
>> and future microcontrollers.
>>
>>   drm/i915: Add GuC-related module parameters
>>   drm/i915: Add GuC-related header files
>>   drm/i915: GuC-specific firmware loader
>>   drm/i915: Debugfs interface to read GuC load status
> 
> Does that include all the nifty power management stuff GuC does?

No; the GuC f/w may be doing such things but I don't have any code to
interrogate it about power management. None of that appears in the GuC
submission HLD, so I'd guess we're not presenting that until it has a
stable i/f.

>> These four patches complete the GuC loader. At this point in the sequence
>> we can load and activate the GuC firmware, but not submit any batches
>> through it. (This is nonetheless a potentially useful state, as the GuC
>> can do other useful work even when not handling batch submissions).
>>
>>   drm/i915: Defer default hardware context initialisation until first open
>>   drm/i915: Move execlists defines from .c to .h
>>   drm/i915: GuC submission setup, phase 1
>>   drm/i915: Enable GuC firmware log
>>   drm/i915: Implementation of GuC client
>>   drm/i915: Interrupt routing for GuC submission
>>   drm/i915: Integrate GuC-based command submission
>>   drm/i915: Debugfs interface for GuC submission statistics
>>   Documentation/drm: kerneldoc for GuC
>>   drm/i915: Enable GuC submission, where supported
>>
>> In the final section, we implement the GuC submission mechanism, link
>> it into the (execlist-based) submission path, and finally enable it
>> (on supported platforms). On platforms where there is no GuC, or if
>> the GuC firmware cannot be found or is invalid, batch submission will
>> revert to using the execlist mechanism directly.
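
The fallback rule in the paragraph above can be sketched as a small
decision function. This is only an illustration under stated assumptions,
not the series' actual code: guc_fw_state, choose_submit_mode() and the
field names are hypothetical, and treating a firmware whose major revision
is not 3 as unusable is an assumption based on the cover letter's statement
that the driver requires firmware revision 3.x.

```c
#include <stdbool.h>

/* Hypothetical names throughout; none are taken from the patches. */
enum submit_mode {
	SUBMIT_EXECLIST, /* host drives the execlist port directly */
	SUBMIT_GUC       /* host hands work items to the GuC */
};

struct guc_fw_state {
	bool platform_has_guc; /* the hardware has a GuC at all */
	bool fw_found;         /* the firmware file could be fetched */
	bool fw_valid;         /* the firmware passed validation */
	unsigned int fw_major; /* major revision, e.g. 3 for "3.x" */
};

/* Assumption: only major revision 3 is accepted. */
#define GUC_FW_MAJOR_REQUIRED 3u

/* Pick GuC submission only when every precondition holds; otherwise
 * revert to plain execlist submission. */
static enum submit_mode choose_submit_mode(const struct guc_fw_state *s)
{
	if (!s->platform_has_guc)
		return SUBMIT_EXECLIST;
	if (!s->fw_found || !s->fw_valid)
		return SUBMIT_EXECLIST;
	if (s->fw_major != GUC_FW_MAJOR_REQUIRED)
		return SUBMIT_EXECLIST;
	return SUBMIT_GUC;
}
```

Keeping the decision in one place means a missing or rejected firmware
degrades to the existing submission path instead of failing.
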
> 
> I thought we had some perf data showing that GuC is now faster than
> execbuf ... Where's that?

Alex has run some benchmarks, generally showing a small improvement, up
to about 5% depending on workload. OTOH John H knows of one application
that improved by more than 20% :)

>> The GuC firmware itself is not included in this patchset; it is or will
>> be available for download from https://01.org/linuxgraphics/downloads/
>> This driver works with and requires GuC firmware revision 3.x. It will
>> not work with any firmware version 1.x, as the GuC protocol in those
>> revisions was incompatible and is no longer supported.
>>
>> Prerequisites: GuC submission will expose inadequacies in some of the
>> existing codepaths unless certain other patches are applied.
>> In particular we will require some version of Michel Thierry's patch
>>   drm/i915/lrc: Update PDPx registers with lri commands
>> (because the GuC supports light-restore, which execlist mode doesn't),
>> and my own 
>>   drm/i915: Allocate OLR more safely (workaround until OLR goes away)
>> because otherwise the changed timing means that there is an increased
> 
> s/timing/much reduced ring space I presume?

I think it's more likely timing, but of course it depends on total
system activity as to what happens to be pinned (and therefore kmapped)
at any particular instant.

>> risk of writing to a ringbuffer that is not currently pinned & mapped,
>> causing a kernel OOPS.
> 
> Cheers, Daniel

New version incorporating all feedback should appear later today. It
would probably have been yesterday were there not conflicts between
"drm/i915: Defer default hardware context initialisation until first
open" and one of the AntiOLR patches which also splits init_hw() :(

.Dave.


