[Intel-gfx] [PATCH] drm/i915/guc: Propagate the fw xfer timeout

Chris Wilson chris at chris-wilson.co.uk
Thu Oct 18 19:42:41 UTC 2018


Quoting Michal Wajdeczko (2018-10-18 19:27:20)
> On Thu, 18 Oct 2018 20:18:53 +0200, Daniele Ceraolo Spurio  
> <daniele.ceraolospurio at intel.com> wrote:
> 
> >
> >
> > On 18/10/18 02:13, Chris Wilson wrote:
> >> Quoting Michal Wajdeczko (2018-10-18 00:22:43)
> >>> On Thu, 18 Oct 2018 01:09:19 +0200, Daniele Ceraolo Spurio
> >>> <daniele.ceraolospurio at intel.com> wrote:
> >>>
> >>>>
> >>>>
> >>>> On 17/10/18 13:29, Chris Wilson wrote:
> >>>>> Propagate the timeout on transferring the fw back to the caller where it
> >>>>> may act upon it, usually by restarting the xfer before failing.
> >>>>>
> >>>>
> >>>> Did you see any case where we failed the xfer and didn't get a timeout
> >>>> out of guc_wait_ucode? that'd be quite weird
> >> Yes, the logs show the xfer error but not the wait error. So we ended up
> >> returning 0.
> >>
> >
> > The guc status register is correctly cleared by HW on guc reset (just  
> > checked that), so if the wait_ucode succeeded in matching a "ready"  
> > value in there it means that the guc FW had loaded correctly. Maybe it  
> > managed to complete the xfer just after the timeout elapsed or the DMA  
> > got confused and reported the wrong status. Still weird, but your  
> > changes should help make the error more visible.
> 
> Maybe the first timeout in the DMA transfer was mitigated by the additional
> wait in guc_wait_ucode(), so maybe this error propagation is not that important?

Yes, it looks reasonable that the guc is unlikely to signal it is
ready before the dma xfer is complete. How about only waiting for the
guc signal and asserting that the xfer is complete afterwards? (Just to
sanity check the hw state on completion.)
-Chris
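
For illustration only, here is a rough sketch of the "wait for the guc signal, then check the xfer" ordering suggested above. It is not the i915 code: struct fake_hw, fake_guc_ready() and fake_fw_xfer_wait() are hypothetical stand-ins that simulate the hardware in plain userspace C, and the choice to return -EIO instead of asserting is an assumption made so the sketch can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the hardware state; not the i915 structures. */
struct fake_hw {
	bool dma_done;          /* DMA engine reports the transfer finished */
	int polls_until_ready;  /* polls remaining before the GuC signals ready */
};

/* Poll the simulated GuC status once; returns true once it reports ready. */
static bool fake_guc_ready(struct fake_hw *hw)
{
	if (hw->polls_until_ready > 0) {
		hw->polls_until_ready--;
		return false;
	}
	return true;
}

/*
 * Wait only for the GuC ready signal, then sanity check the DMA state.
 * Returns 0 on success, -ETIMEDOUT if ready is not seen within max_polls
 * polls, or -EIO if ready was signalled while the DMA is not complete.
 */
static int fake_fw_xfer_wait(struct fake_hw *hw, int max_polls)
{
	assert(hw != NULL);

	for (int i = 0; i < max_polls; i++) {
		if (fake_guc_ready(hw))
			return hw->dma_done ? 0 : -EIO;
	}

	return -ETIMEDOUT;
}
```

With this ordering there is a single timeout to propagate (the wait for ready), and the DMA status is only used as a consistency check on the final hw state.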

