[Intel-gfx] [PATCH] drm/i915/guc: Propagate the fw xfer timeout

Chris Wilson chris at chris-wilson.co.uk
Thu Oct 18 09:13:29 UTC 2018


Quoting Michal Wajdeczko (2018-10-18 00:22:43)
> On Thu, 18 Oct 2018 01:09:19 +0200, Daniele Ceraolo Spurio  
> <daniele.ceraolospurio at intel.com> wrote:
> 
> >
> >
> > On 17/10/18 13:29, Chris Wilson wrote:
> >> Propagate the timeout on transferring the fw back to the caller where it
> >> may act upon it, usually by restarting the xfer before failing.
> >>
> >
> > Did you see any case where we failed the xfer and didn't get a timeout  
> > out of guc_wait_ucode? that'd be quite weird

Yes the logs show the xfer error but not the wait error. So we ended up
returning 0.

> > Anyway, definitely better and cleaner than before:
> > Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
> >
> > small nitpick below
> >
> >> Testcase: igt/drv_selftest/live_hangcheck
> >> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> >> Cc: Michal Wajdeczko <michal.wajdeczko at intel.com>
> >> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
> >> ---
> >>   drivers/gpu/drm/i915/intel_guc_fw.c | 23 ++++++-----------------
> >>   1 file changed, 6 insertions(+), 17 deletions(-)
> >>  diff --git a/drivers/gpu/drm/i915/intel_guc_fw.c  
> >> b/drivers/gpu/drm/i915/intel_guc_fw.c
> >> index a9e6fcce467c..b68a05892dab 100644
> >> --- a/drivers/gpu/drm/i915/intel_guc_fw.c
> >> +++ b/drivers/gpu/drm/i915/intel_guc_fw.c
> >> @@ -125,22 +125,17 @@ static void guc_prepare_xfer(struct intel_guc  
> >> *guc)
> >>   }
> >>     /* Copy RSA signature from the fw image to HW for verification */
> >> -static int guc_xfer_rsa(struct intel_guc *guc, struct i915_vma *vma)
> >> +static void guc_xfer_rsa(struct intel_guc *guc, struct i915_vma *vma)
> >>   {
> >>      struct drm_i915_private *dev_priv = guc_to_i915(guc);
> >> -    struct intel_uc_fw *guc_fw = &guc->fw;
> >> -    struct sg_table *sg = vma->pages;
> >>      u32 rsa[UOS_RSA_SCRATCH_COUNT];
> >>      int i;
> >>   -  if (sg_pcopy_to_buffer(sg->sgl, sg->nents, rsa, sizeof(rsa),
> >> -                           guc_fw->rsa_offset) != sizeof(rsa))
> >> -            return -EINVAL;
> >> +    sg_pcopy_to_buffer(vma->pages->sgl, vma->pages->nents,
> >> +                       rsa, sizeof(rsa), guc->fw.rsa_offset);
> >>      for (i = 0; i < UOS_RSA_SCRATCH_COUNT; i++)
> >>              I915_WRITE(UOS_RSA_SCRATCH(i), rsa[i]);
> >> -
> >> -    return 0;
> >>   }
> >>     /*
> >> @@ -251,17 +246,11 @@ static int guc_fw_xfer(struct intel_uc_fw  
> >> *guc_fw, struct i915_vma *vma)
> >>       * by the DMA engine in one operation, whereas the RSA signature is
> >>       * loaded via MMIO.
> >>       */
> >> -    ret = guc_xfer_rsa(guc, vma);
> >> -    if (ret)
> >> -            DRM_WARN("GuC firmware signature xfer error %d\n", ret);
> >> +    guc_xfer_rsa(guc, vma);
> >>      ret = guc_xfer_ucode(guc, vma);
> >> -    if (ret)
> >> -            DRM_WARN("GuC firmware code xfer error %d\n", ret);
> >> -
> >> -    ret = guc_wait_ucode(guc);
> >> -    if (ret)
> >> -            DRM_ERROR("GuC firmware xfer error %d\n", ret);
> >
> > With the logs gone we don't have any more error message to understand  
> > where exactly we hit issue (xfer vs wait_ucode). We do have debug logs  
> > printing the status that would give the info, but might be worth  
> > retaining at least 1 error-level log.
> 
> +1 for leaving error-level logs (here or at helper functions)
> also, please let CI re-run this patch with GuC enabled ;)

I disagree since the errors are noise as the error is propagated and
handled by the caller -- so not strictly errors, but maybe
DRM_DEBUG_DRIVER().
-Chris


More information about the Intel-gfx mailing list