[Intel-gfx] [RFC] drm/i915: Temporarily go realtime when polling PCODE

Imre Deak imre.deak at intel.com
Thu Feb 23 17:02:59 UTC 2017


On Thu, Feb 23, 2017 at 01:00:32PM +0000, Tvrtko Ursulin wrote:
> 
> On 23/02/2017 12:01, Imre Deak wrote:
> >On Thu, Feb 23, 2017 at 09:37:29AM +0000, Tvrtko Ursulin wrote:
> >>[...]
> >>Having read the spec I think I see both sides now.
> >>
> >>Spec is actually suggesting we should busy-retry the pcode request for 3ms
> >>in this case.
> >
> >Well, retry for 3ms without setting any minimum for the number of
> >requests. That couldn't be guaranteed anyway due to scheduling etc, and
> >would be a strange ABI. Later Art Runyan clarified this in the way it's
> >described in the code comment: What is required is two requests at
> >least 3ms apart. The first request is queued by the firmware and the
> >second request signals completion.
> 
> Why is our loop then spamming the hardware every 10us with requests? Perhaps
> it could be counter-productive? A single sleeping loop with a long timeout
> and a 3ms period wouldn't work? Like:
> 
> 	ret = _wait_for(COND, 50 * 1000, timeout_base_ms * 1000)
> 
> ?

The tight loop there is since according to the spec the request "typically"
completes < 200us, so we want to benefit from fast completions.

By single loop, do you mean without the initial 'if (COND) ...' or
without the WA loop? If you meant without 'if (COND)', then technically
that would still allow _wait_for() to check COND only once. I know, very
unlikely with your 50ms timeout above.

If you meant without the WA loop then we still need that since the 'two
requests 3ms apart' above is just how the ABI should work based on
feedback from PCODE people so far. There are occasional timeouts due to a
glitch somewhere (I believe in the firmware) which requires that we run
the WA with the burst requests. That part is an empricial solution based
on our own tests. I hope to get a more official explanation for these
and we can get rid of the WA, replacing it with something more robust.

> >>It doesn't say how many retries we are supposed to do and how it internally
> >>operates, which makes me unsure if our first more relaxed polling is perhaps
> >>causing or contributing to the issue.
> >>
> >>One thing where we don't follow the spec is the timeout for the
> >>GEN6_PCODE_READY poll which spec says should be 150us and not 500ms. I don't
> >>know if this timeout was trigger in the bug reports?
> >
> >No this PCODE_READY poll always succeeds, it's the reply/reply_mask
> >response which doesn't get set in time.
> 
> Yes I know, I was just thinking if it takes more than 2us it then falls back
> to scheduling & usleep_range. That was at the time I was thinking it is
> really important to poll rapidly. Since you explained above it is just the
> opposite I agree this part is not a problem. It still may make sense to wait
> for that bit for a shorter period as per bspec.

It goes back to SNB, so it'd need to be checked against platforms since
then and other PCODE requests. We'd bail out from the poll in case of
the first timeout, so in that sense it's not a problem, but I agree it'd
make sense to document it better.

> [snip]
> 
> >>But regardless, the fact that the fallback busy loop needs up to 34ms as
> >>well makes the last bit from the above a bit uncertain. Only if the
> >>non-compliant polling we do somehow confuses the hardware and then we end up
> >>having to busy poll longer than we normally would. Probably unlikely.
> >
> >I'm trying to get more info based on all this (in particular the KBL
> >problem) from Art. Until that I'd suggest increasing the WA timeout to
> >50ms, since that solved the problem for the bug reporter. We could fix
> >things/add more scaffolding if more evidence comes up, or there is a new
> >bug report.
> 
> Yes sure I think I replied before that it is fine by me to push a 50ms fix
> for stable.

Ok, will send it.

--Imre


More information about the Intel-gfx mailing list