[Intel-gfx] [RFC]: Arbitrated system memory bandwidth workarounds implementation for watermark.
Mahesh Kumar
mahesh1.kumar at intel.com
Mon Mar 27 15:52:45 UTC 2017
*Arbitrated system bandwidth workarounds for watermark.*
All GEN-9 based platforms require the watermark-related workaround (WA)
to be enabled if the display memory bandwidth requirement exceeds XX% of
the total available system memory bandwidth.
This XX% depends on multiple factors.
*e.g.* if all the enabled planes use X-tiled or linear memory, then
XX = 60
if any Y-tiled plane is enabled, then
XX = 20, etc.
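As a rough illustration of the threshold rule above, here is a minimal sketch in C. The function names are hypothetical (they are not existing i915 symbols), and only the two cases named in this mail are modelled; the "etc." factors are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: percentage of system memory bandwidth above which
 * the WA is needed. Only the two cases named in the mail are covered. */
unsigned int wa_threshold_pct(bool any_y_tiled)
{
	return any_y_tiled ? 20 : 60;
}

/* True when the display requirement exceeds the threshold share of the
 * system bandwidth. Both values must use the same unit. */
bool wa_required(uint64_t display_bw, uint64_t system_bw, bool any_y_tiled)
{
	/* display_bw / system_bw > pct / 100, rewritten without division */
	return display_bw * 100 > system_bw * wa_threshold_pct(any_y_tiled);
}
```

With a display requirement of 30% of system bandwidth, this sketch reports the WA as needed only when a Y-tiled plane is enabled.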
The current implementation always enables the maximum WA (i.e. adds
15us of latency during WM calculation), irrespective of whether the
workaround is required or not.
The total display bandwidth requirement is the sum of the requirements of
the individual pipes. To calculate the correct BW requirement, the plane
configuration of any pipe must not change during the calculation.
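The summation itself is simple; the hard part is keeping the inputs stable. A minimal sketch, assuming a hypothetical per-pipe record (struct pipe_bw and its field are illustrative, not driver types):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-pipe record; the field name is illustrative. */
struct pipe_bw {
	uint64_t bytes_per_sec; /* requirement of all planes on this pipe */
};

/* Total display requirement = sum over all pipes. The result is only
 * meaningful if no pipe's plane configuration changes while the loop
 * runs, which is why the options below differ in how they lock. */
uint64_t total_display_bw(const struct pipe_bw *pipes, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += pipes[i].bytes_per_sec;
	return total;
}
```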
Many implementations are possible to implement & optimize the above
requirement; I'm proposing a few options.
Please review & let me know which option is better for implementing the WAs.
*Option 1:*
Use connection_mutex (this would change to an i915-specific lock once
one is available in the atomic design) to serialize all the commits.
If the memory bandwidth WA is changing, then get all crtc_states to
calculate the watermark values.
*Pros:*
* In each flip, optimum WM values (no more than the required
value) will be used.
*Cons:*
* This approach will serialize all the flips, so there will be a
performance impact. With blocking commits the impact will be even
worse, e.g. with three displays with refresh rates of 30fps,
60fps & 90fps.
* If a commit is in progress on the 30fps display, all other flips will
be blocked & frames on the 60 & 90fps displays will be dropped/blocked.
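To put a number on the example above, here is a small arithmetic sketch in C. It assumes (this is not stated in the mail) that a blocking commit holds the global lock for one full refresh period of its own display; the function name is hypothetical.

```c
/* Assumption: a blocking commit holds the global lock for one full
 * refresh period of its own display. Returns how many complete refresh
 * intervals of another display elapse during that time. */
unsigned int intervals_stalled(unsigned int blocking_hz, unsigned int other_hz)
{
	if (blocking_hz == 0)
		return 0; /* avoid dividing by zero on a bogus mode */
	return other_hz / blocking_hz;
}
```

Under that assumption, a commit on the 30fps display stalls the 60fps display for 2 of its refresh intervals and the 90fps display for 3.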
*Option 2:*
Use two levels of system bandwidth check: once during calculation &
again during commit.
During intel_atomic_check (as part of compute_ddb) don't hold any
system-level mutex; instead hold the WM mutex & compute the system
bandwidth requirement. If the WA is changing, then get the crtc_state of
all other pipes & go ahead with the commit.
During intel_atomic_commit, take wm_mutex again & recalculate the
complete system bandwidth requirement. If the requirement has changed in
a way that the computed WMs are no longer valid, fail the flip.
Update the bandwidth requirement for each plane in global state
(dev_priv->wm) so other flips don't need to recalculate it.
*Pros:*
* It reduces critical section time.
* The available DDB is still used optimally & optimum WM values are used.
*Cons:*
* If the memory bandwidth WA changes very frequently, there
will be many flip failures, which will impact performance.
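The check-then-revalidate flow of Option 2 could look roughly like the sketch below. All types and function names are hypothetical stand-ins (struct wm_global only imitates the state this mail places in dev_priv->wm), locking is shown only as comments, and the -EAGAIN return code is an assumption. As a simplification, the commit fails on any change of the WA decision, including the case where the computed WMs would merely be more conservative than needed.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PIPES 3

/* Hypothetical stand-in for the global state kept in dev_priv->wm. */
struct wm_global {
	uint64_t system_bw;           /* total system memory bandwidth */
	uint64_t pipe_bw[MAX_PIPES];  /* committed requirement per pipe */
	bool pipe_y_tiled[MAX_PIPES]; /* any Y-tiled plane on the pipe */
};

/* One pipe's pending update, filled in by the check phase. */
struct pipe_request {
	int pipe;
	uint64_t bw;
	bool y_tiled;
	bool wa_assumed; /* WA decision the WMs were computed for */
};

/* Would the WA be needed if req replaced its pipe's committed state? */
static bool wa_needed_with(const struct wm_global *g,
			   const struct pipe_request *req)
{
	uint64_t total = 0;
	bool any_y = false;

	for (int i = 0; i < MAX_PIPES; i++) {
		bool mine = (i == req->pipe);

		total += mine ? req->bw : g->pipe_bw[i];
		any_y |= mine ? req->y_tiled : g->pipe_y_tiled[i];
	}
	return total * 100 > g->system_bw * (any_y ? 20u : 60u);
}

/* Phase 1 (atomic check): would hold wm_mutex only for this call. */
void wm_check(const struct wm_global *g, struct pipe_request *req)
{
	req->wa_assumed = wa_needed_with(g, req);
}

/* Phase 2 (commit): would retake wm_mutex, revalidate, then publish. */
int wm_commit(struct wm_global *g, const struct pipe_request *req)
{
	if (wa_needed_with(g, req) != req->wa_assumed)
		return -EAGAIN; /* stale assumption: fail the flip */
	g->pipe_bw[req->pipe] = req->bw;
	g->pipe_y_tiled[req->pipe] = req->y_tiled;
	return 0;
}
```

If two pipes pass wm_check against the same global state and the first commit pushes the total over the threshold, the second wm_commit fails and has to be checked again. That is the flip-failure cost listed under Cons.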
*Option 3:*
Compute the maximum bandwidth requirement during modeset.
i.e. if the modeset is 1080p @60fps & the maximum number of planes in
the CRTC is 3, with a maximum supported downscale amount of XX.YY
(defined by the min of cdclk/crtc_clock & max(hscale x vscale)), then
the max bandwidth requirement for the CRTC will be
(1080p x 60 x 3 x XX.YY).
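A worked version of the product above, as a sketch in C. The function name is hypothetical, 1080p is taken as 1920x1080, and the downscale factor stays a parameter (in thousandths) because XX.YY is left open here. The result is a pixel rate; converting it to bytes per second would need a bytes-per-pixel factor that this mail does not specify.

```c
#include <stdint.h>

/* Peak pixel rate for one CRTC: mode size x refresh x planes x scale.
 * scale_milli is the maximum downscale factor in thousandths
 * (1000 == 1.0); its real value is left as a parameter here. */
uint64_t crtc_max_pixel_rate(uint32_t hdisplay, uint32_t vdisplay,
			     uint32_t fps, uint32_t max_planes,
			     uint32_t scale_milli)
{
	uint64_t rate = (uint64_t)hdisplay * vdisplay * fps * max_planes;

	return rate * scale_milli / 1000;
}
```

For example, crtc_max_pixel_rate(1920, 1080, 60, 3, 1000) gives 373,248,000 pixels per second with no downscaling, and any larger scale factor multiplies that figure accordingly.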
Now during a flip, if there is any change that will change the WA
(e.g. a tiling change), take the wm_mutex lock & recalculate the complete
bandwidth requirement. If the WA is changing, then get the crtc_state of
all other pipes & go ahead with the commit. (If the total display memory
BW % is less than the lowest % that enables the WA, i.e. 20%, there is no
need to recompute.)
Update the per-CRTC bandwidth requirement in global state so other flips
don't need to recalculate it each time.
*Pros:*
* All CRTCs can flip independently until there is a change that will
impact the WA.
* No locking until a potential WM WA change.
*Cons:*
* If the memory bandwidth WA changes very frequently, there
will be a slight performance impact.
* We may not be programming optimum WM values, which may have some
power impact.
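The per-flip decision in Option 3 can be sketched as a small classifier in C. The enum and function names are hypothetical; the 20% constant is the lowest WA threshold mentioned in this mail, and total_max_bw stands for the sum of the per-CRTC maxima computed at modeset.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOWEST_WA_PCT 20u /* lowest threshold named in the mail */

enum flip_path {
	FLIP_LOCKLESS,       /* no WA-affecting change: flip independently */
	FLIP_SKIP_RECOMPUTE, /* change, but total max BW is below 20% */
	FLIP_RECOMPUTE,      /* take wm_mutex & recompute the requirement */
};

/* total_max_bw: sum of the per-CRTC maxima cached in global state. */
enum flip_path choose_flip_path(bool wa_affecting_change,
				uint64_t total_max_bw, uint64_t system_bw)
{
	if (!wa_affecting_change)
		return FLIP_LOCKLESS;
	if (total_max_bw * 100 < system_bw * LOWEST_WA_PCT)
		return FLIP_SKIP_RECOMPUTE;
	return FLIP_RECOMPUTE;
}
```

Because the cached values are worst-case maxima, this path may enable the WA earlier than strictly necessary, which matches the non-optimum WM values listed under Cons.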
If you think any other approach should be used, please let me know as well.
Regards,
-Mahesh