[PATCH v3 10/10] drm/i915/psr: Allow DSB usage when PSR is enabled
Hogander, Jouni
jouni.hogander at intel.com
Mon Jan 20 07:28:52 UTC 2025
On Sat, 2025-01-18 at 01:07 +0200, Ville Syrjälä wrote:
> On Fri, Jan 17, 2025 at 10:20:17PM +0200, Ville Syrjälä wrote:
> > On Thu, Jan 09, 2025 at 09:31:37AM +0200, Jouni Högander wrote:
> > > Now as we have correct PSR2_MAN_TRK_CTL handling in place we can
> > > allow DSB
> > > usage also when PSR is enabled for LunarLake onwards.
> >
> > We seem to still lack an answer as to when the PSR wakes, when it
> > latches the update, and how does all that guarantee that the DSB
> > interrupt fires after the update has been latched?
> >
> > Some thoughts as to how to figure this out:
> > 1. make sure we're in PSR
> > 2. sample TIMESTAMP_CTR
> > 3. start DSB in which
> > write PLANE_SURF with a new value
> > send push
> > wait for vblank
> > poll PLANE_SURFLIVE == new value
> > fire interrupt
> >
> > in the interrupt handler:
> > sample TIMESTAMP_CTR again
> >
> > And then compare flip timestmap vs. frame timestamp vs. the
> > manually sampled timestamps. And then repeat without the SURFLIVE
> > poll to make sure nothing has changed. You'll need to be careful
> > to make sure it will actually poll for long enough to make a real
> > difference (if the poll actuall is needed), but tweaking the poll
> > interval+count suitably. I don't remeber what the max poll
> > count was, but IIRC it wasn't too high so the duration will have
> > to get bumped for long polls.
> >
> > I guess one could also try to poll for the actual PSR status,
> > but dunno how well that'll work.
> >
> > And we could also try to come up with different ideas on where
> > to sample timestamps. Unfortunately we only have the single
> > pipe flip timestamp register so we can only sample one timestamp
> > from the DSB itself per frame. If we had more we could much more
> > easily figure things out :/
> >
> > I pushed my latest DSB selftest stuff to
> > https://github.com/vsyrjala/linux.git dsb_selftests_7
> > which has a bunch of stuff for this kind of experimentation.
> > It's in a somewhat sorry state at the moment since I last used
> > to hunt for various DSB bugs, but at least it still builds :)
> >
> > The way I use that is that I run igt 'testdisplay -o ...'
> > to make sure nothing else is actively poking the hardware
> > and then I trigger the DSB tests via debugfs.
>
> I poked around a bit, though only on a TGL+PSR1 system (what I had
> at hand), so some of this might not apply to PSR2 and/or more
> modern platforms.
>
> General notes:
> - PSR1 exit is triggered by any pipe/plane register write (even the
> non-arming ones)
This is same for PSR2 as well.
>
> We basically have three cases to consider here:
> 1. PSR1 is currently inactive
> Obviously everything should be with the current code,
> vblank evasion works, wait for vblank+interrupt scheme
> for flip completion works
>
> 2. PSR1 is active, but DC states are not
> The wakeup latency here is super quick (it's < 1 scanline, how
> much below? I've not yet measured), and arming registers do latch
> nearly immediately. The scanline counter starts counting
> accordingly
> from vblank_start-1. However the hardware still considers PSR to be
> active for that short duration so DSB_SKIP_WAITS_EN _will_ skip the
> waits.
>
> Unfortunately being this quick I'm not convinced we have enough
> time
> to write all the registers atomically before the hardware latches
> something. So I'm thinking we may need to remove DSB_SKIP_WAITS_EN,
> in which case the vblank evasion will push the arming register
> writes into the next frame. This will mean the wakeup will take
> one full frame.
To my understanding DSB_SKIP_WAITS_EN have impact only when in SRD
(PSR)/DEEP_SLEEP(PSR2). I.e. In this scenario we still do have all
waits as in commit without PSR.
>
> 3. PSR1 is active and so are DC states
> The wakeup latency is ~5ms. During that time scanline counter reads
> 0, PSR is active for the purposes of DSB_SKIP_WAITS_EN. Again we
> pretty much need DSB_SKIP_WAITS_EN=0 here to make sure the
> interrupt
> gets signalled after the wait for vblank. vblank evasion will get
> skipped on account the scanline being 0. Somewhat ironically this
> would give us ~5ms total wakeup latency which is now faster than
> the
> previous case.
Again my understanding is that DSB_SKIP_WAITS_EN=0/1 have impact only
when in SRD/DEEP_SLEEP. There is SRD_CTRL/PSR2_CTRL[Idle Frames] to
control when entering SRD/DEEP_SLEEP. So wait for vblank is supposed to
work normally in this scenario as well because there has to be at least
one idle frame before entering sleep.
>
> So everything here would be fine if we know that the wakeup has
> just
> started since we have all of that 5ms to write all the registers.
> But I guess we can't really know when the wakeup started, so we
> might be doing the vblank evasion just before the scanline counter
> starts to read its proper value, and then we have that < 1 scanline
> to write all the arming registers. Not sure if its enough. If not,
> then we could also explicitly evade scanline 0 as well, which would
> again force all arming register writes into the next frame giving
> us the same kind of single frame wakeup latency as in case 2.
hmm, what else could trigger the wakeup than the commit that is on hand
? Frontbuffer flush/legacy cursor update? It begins to look like we
still need to take that mutex over DSB commit...
>
> IIRC you said that you had stuff get stuck with DSB_SKIP_WAITS_EN=0.
> Was that with hardware that has TRANS_PUSH for PSR?
It has TRANS_PUSH, but not using it for PSR. I.e. On LunarLake and
still using register writes as a trigger for Frame Change event towards
PSR.
It happened with DSB commit where scanline wait was the first thing.
I.e. only cursor plane updating.
> I suppose that
> could happen if the scanline counter already reads something
> (eg. again vblank_start-1) before we've done the push, that would
> cause the vblank evasion to wait forever. I could see two ways to
> perhaps handle that:
> - if DSB_SKIP_WAITS_EN becomes inactive immediately after
> TRANS_PUSH then we could just keep DSB_SKIP_WAITS_EN=1 all
> the time and things should be fine
> - if DSB_SKIP_WAITS_EN stays active for some time after TRANS_PUSH
> then we'll perhaps need to poke that bit from the DSB itself
> dynamically so that we will skip the vblank evasion, but
> not the wait for blank prior to generating the interrupt.
> But this needs a bit more reverse engineering for sure.
TRAN_PUSH is still not enabled in this patch set.
BR,
Jouni Högander
>
> >
> > >
> > > Signed-off-by: Jouni Högander <jouni.hogander at intel.com>
> > > ---
> > > drivers/gpu/drm/i915/display/intel_display.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > > b/drivers/gpu/drm/i915/display/intel_display.c
> > > index e448ff64660a..58575800fad2 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > @@ -7631,6 +7631,7 @@ static void intel_atomic_dsb_finish(struct
> > > intel_atomic_state *state,
> > > intel_atomic_get_old_crtc_state(state, crtc);
> > > struct intel_crtc_state *new_crtc_state =
> > > intel_atomic_get_new_crtc_state(state, crtc);
> > > + struct intel_display *display = to_intel_display(crtc);
> > >
> > > if (!new_crtc_state->hw.active)
> > > return;
> > > @@ -7643,7 +7644,7 @@ static void intel_atomic_dsb_finish(struct
> > > intel_atomic_state *state,
> > > new_crtc_state->update_planes &&
> > > !new_crtc_state->vrr.enable &&
> > > !new_crtc_state->do_async_flip &&
> > > - !new_crtc_state->has_psr &&
> > > + (DISPLAY_VER(display) >= 20 || !new_crtc_state-
> > > >has_psr) &&
> > > !new_crtc_state->scaler_state.scaler_users &&
> > > !old_crtc_state->scaler_state.scaler_users &&
> > > !intel_crtc_needs_modeset(new_crtc_state) &&
> > > --
> > > 2.43.0
> >
> > --
> > Ville Syrjälä
> > Intel
>
More information about the Intel-xe
mailing list