[Intel-gfx] [RFC PATCH 00/97] Basic GuC submission support in the i915

Jason Ekstrand jason at jlekstrand.net
Fri May 14 16:36:37 UTC 2021


On Fri, May 14, 2021 at 6:12 AM Tvrtko Ursulin
<tvrtko.ursulin at linux.intel.com> wrote:
>
> On 06/05/2021 20:13, Matthew Brost wrote:
> > Basic GuC submission support. This is the first bullet point in the
> > upstreaming plan covered in the following RFC [1].
> >
> > At a very high level the GuC is a piece of firmware which sits between
> > the i915 and the GPU. It offloads some of the scheduling of contexts
> > from the i915 and programs the GPU to submit contexts. The i915
> > communicates with the GuC and the GuC communicates with the GPU.
> >
> > GuC submission will be disabled by default on all current upstream
> > platforms behind a module parameter - enable_guc. A value of 3 will
> > enable submission and HuC loading via the GuC. GuC submission should
> > work on all gen11+ platforms assuming the GuC firmware is present.
>
> Some thoughts mostly relating to future platforms where GuC will be the
> only option, and to some extent platforms where it will be possible to
> turn it on for one reason or another.
>
> Debuggability - in the context of having an upstream way/tool for
> capturing and viewing GuC logs usable for attaching to bug reports.
>
> Currently i915 logs, can provide traces via tracepoints and trace
> printk, and captures GPU error state, which often provides a sufficient
> trail of evidence to debug issues.
>
> We need to make sure GuC is not a black box in this respect. By
> this I mean it does not hide a large portion of the execution flows from
> upstream observability.

I agree here.  If GuC suddenly makes submission issues massively
harder to debug then that's a regression vs. execlists.  I don't know
what the solution there is but I think the concern is valid.

> This could mean a tool in IGT to access/capture GuC logs and update bug
> filing instructions.
>
> Leading from here is probably the need for the GuC firmware team to
> cross the internal-upstream boundary and deal with such bug reports on
> upstream trackers. Upstream GuC is unlikely to work if we don't have
> such a plan and commitment.

I mostly agree here as well.  I'm not sure it'll actually happen but
I'd like anyone who writes code which impacts Linux to be active in
upstream bug trackers.

> Also leading from here is the need for GPU error capture to be on par
> from day one, which I believe is still not there in the firmware.

This one has me genuinely concerned.  I've heard rumors that we don't
have competent error captures with GuC yet.  From the Mesa PoV, this
is a non-starter.  We can't be asked to develop graphics drivers with
no error capture.

The good news is that, based on my understanding, it shouldn't be
terrible to support.  We just need the GuC to grab all the registers
for us and shove them in a buffer somewhere before it resets the GPU
and all that data is lost.  I would hope the Windows people have
already done that and we just need to hook it up.  If not, there may
be some GuC engineering required here.

> Another, although unrelated, missing feature on my wish list is firmware
> support for wiring up accurate engine busyness stats to i915 PMU. I
> believe this is also being worked on but I don't know when delivery
> is expected.
>
> If we are tracking a TODO list of items somewhere I think these ones
> should definitely be considered.

Yup, let's get it all in the ToDo and not flip GuC on by default in
the wild until it's all checked off.

--Jason

