[Intel-gfx] [PATCH 2/4] drm/i915/gt: Wait for CSB entries on Tigerlake

Greg KH gregkh at linuxfoundation.org
Wed Sep 16 08:35:01 UTC 2020


On Wed, Sep 16, 2020 at 09:26:58AM +0100, Chris Wilson wrote:
> Quoting Greg KH (2020-09-16 07:33:58)
> > On Tue, Sep 15, 2020 at 01:41:48PM +0100, Chris Wilson wrote:
> > > On Tigerlake, we are seeing a repeat of commit d8f505311717 ("drm/i915/icl:
> > > Forcibly evict stale csb entries") where, presumably, due to a missing
> > > Global Observation Point synchronisation, the write pointer of the CSB
> > > ringbuffer is updated _prior_ to the contents of the ringbuffer. That is
> > > we see the GPU report more context-switch entries for us to parse, but
> > > those entries have not been written, leading us to process stale events,
> > > and eventually report a hung GPU.
> > > 
> > > However, this effect appears to be much more severe than we previously
> > > saw on Icelake (though it might be best if we try the same approach
> > > there as well and measure), and Bruce suggested the good idea of resetting
> > > the CSB entry after use so that we can detect when it has been updated by
> > > the GPU. By instrumenting how long that may be, we can set a reliable
> > > upper bound for how long we should wait for:
> > > 
> > >     513 late, avg of 61 retries (590 ns), max of 1061 retries (10099 ns)
> > > 
> > > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2045
> > > References: d8f505311717 ("drm/i915/icl: Forcibly evict stale csb entries")
> > 
> > What does "References:" mean?  Should that be "Fixes:"?
> 
> It's a reference to an earlier w/a for a previous generation for the
> same symptoms. This patch should supplement that w/a.

I see no such "reference" to that tag in
Documentation/process/submitting-patches.rst, so how were we supposed to
know this?  :)

thanks,

greg k-h

