[Intel-gfx] [PATCH] drm/i915/guc: Don't make assumptions while getting the lrca offset

Wed Jul 12 00:00:02 UTC 2017

Quoting Michel Thierry (2017-07-12 00:01:04)
> On 7/11/2017 3:37 PM, Chris Wilson wrote:
> > Quoting Michel Thierry (2017-07-11 22:29:39)
> >> Using the HWSP ggtt_offset to get the lrca offset is only correct if the
> >> HWSP happens to be before it (when we reuse the PPHWSP of the kernel
> >> context as the engine HWSP). Instead of making this assumption, get the
> >> lrca offset from the kernel_context engine state.
> >>
> >> And while looking at this part of the GuC interaction, it was also
> >> noticed that the firmware expects the size of only the engine context
> >> (context minus the execlist part, i.e. don't include the first 80
> >> dwords), so pass the right size.
> >>
> >> Reported-by: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
> >> Signed-off-by: Michel Thierry <michel.thierry at intel.com>
> >> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
> >> Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
> >> Cc: Oscar Mateo <oscar.mateo at intel.com>
> >> ---
> >>   drivers/gpu/drm/i915/i915_guc_submission.c | 15 ++++++++++++---
> >>   1 file changed, 12 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> >> index 48a1e9349a2c..97ec11a7cc9a 100644
> >> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> >> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> >> @@ -1018,6 +1018,12 @@ static void guc_policies_init(struct guc_policies *policies)
> >>          policies->is_valid = 1;
> >>   }
> >>   
> >> +/*
> >> + * The first 80 dwords of the register state context, containing the
> >> + * execlists and ppgtt registers.
> >> + */
> >> +#define LR_HW_CONTEXT_SIZE     (80 * sizeof(u32))
> >> +
> >>   static int guc_ads_create(struct intel_guc *guc)
> >>   {
> >>          struct drm_i915_private *dev_priv = guc_to_i915(guc);
> >> @@ -1033,6 +1039,7 @@ static int guc_ads_create(struct intel_guc *guc)
> >>          struct intel_engine_cs *engine;
> >>          enum intel_engine_id id;
> >>          u32 base;
> >> +       u32 lrca_offset = LRC_PPHWSP_PN * PAGE_SIZE;
> > 
> > Make it const and try to avoid the visual breaks in line length? (Whilst
> > also trying not to gratuitously mix types.)
> 
> ok
> 
> >>   
> >>          GEM_BUG_ON(guc->ads_vma);
> >>   
> >> @@ -1062,13 +1069,15 @@ static int guc_ads_create(struct intel_guc *guc)
> >>           * engines after a reset. Here we use the Render ring default
> >>           * context, which must already exist and be pinned in the GGTT,
> >>           * so its address won't change after we've told the GuC where
> >> -        * to find it.
> >> +        * to find it. GuC will skip the PPHWSP and 'Execlist Context',
> >> +        * thus reduce the size by 1 page (PPHWSP) and 80 dwords.
> >>           */
> >>          blob->ads.golden_context_lrca =
> >> -               dev_priv->engine[RCS]->status_page.ggtt_offset;
> >> +               guc_ggtt_offset(dev_priv->kernel_context->engine[RCS].state) + lrca_offset;
> >>   
> >> +       /* context_size - PPHWSP - first 80 dwords */
> >>          for_each_engine(engine, dev_priv, id)
> >> -               blob->ads.eng_state_size[engine->guc_id] = engine->context_size;
> >> +               blob->ads.eng_state_size[engine->guc_id] = engine->context_size - lrca_offset - LR_HW_CONTEXT_SIZE;
> > 
> > The first LRC_PPHWSP_PN pages are not included in engine->context_size,
> > they are added on by execlist_context_deferred_alloc().
> > 
> 
> I think there is only one extra page added in 
> execlists_context_deferred_alloc, and it is the "GuC shared data". And 
> yes, we don't include that one in in the context_size.
> (also __intel_engine_context_size has a comment saying the size includes 
> the HWSP).

Oh, I see what my confusion is. You are using lrca_offset for two very
different meanings just because they are the same value. (And also the
use of LRC_PPHWSP_PN for execlists_context_deferred_alloc() has the same
dubious semantics, which lead me to think that PPHWSP couldn't possibly
mean per-process hwsp anymore!) The first is an offset from the start of
the allocation to the HWSP, which is the correct meaning of LRC_PPHWSP_PN
(that is, its page index). In second case, you just mean PAGE_SIZE to
remove the single page being skipped. Please, I am easily confused!
That does strongly suggest the comment needs an actual sentence and not
just rehashing of the code:

/*
 * The GuC expects us to exclude the portion of the context image that
 * it skips from the size it is to read. It starts reading from after
 * the execlist context (so skipping the first page [PPHWSP] and 80
 * dwords). Weird guc is weird.
 */
for_each_engine(engine) {
	const u32 skipped = PAGE_SIZE + LR_HW_CONTEXT_SIZE;
	blob->ads.eng_state_size[engine->guc_id] = engine->context_size - skipped;
}

Also feel free to try and fixup the other confusion in intel_lrc.c.
Perhaps something like:

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b0738d2..f498aa6 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2031,8 +2031,9 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
 
        context_size = round_up(engine->context_size, I915_GTT_PAGE_SIZE);
 
-       /* One extra page as the sharing data between driver and GuC */
-       context_size += PAGE_SIZE * LRC_PPHWSP_PN;
+       /* Add an extra page at the start for sharing data with the GuC */
+       BUILD_BUG_ON(LRC_PPHWSP_PN != 1);
+       context_size += PAGE_SIZE;
 
        ctx_obj = i915_gem_object_create(ctx->i915, context_size);
        if (IS_ERR(ctx_obj)) {