[PATCH v5 0/6] A few drm_syncobj optimisations
Tvrtko Ursulin
tvrtko.ursulin at igalia.com
Wed Jun 11 14:00:51 UTC 2025
A small set of drm_syncobj optimisations which should make things a tiny bit
more efficient on the CPU side of things.
The improvement appears to be around 1.5% more FPS when measured with "vkgears
-present-mailbox" on a Steam Deck Plasma desktop, but I am reluctant to make a
definitive claim about the numbers since there is some run-to-run variance. As
suggested by Michel Dänzer, I did five ~100 second runs on each kernel to be
able to show the ministat analysis.
x before
+ after
+------------------------------------------------------------+
| x + |
| x x + |
| x xx ++++ |
| x x xx x ++++ |
| x xx x xx x+ ++++ |
| xxxxx xxxxxx+ ++++ + + |
| xxxxxxx xxxxxx+x ++++ +++ |
| x xxxxxxxxxxx*xx+* x++++++++ ++ |
| x x xxxxxxxxxxxx**x*+*+*++++++++ ++++ + |
| xx x xxxxxxxxxx*x****+***+**+++++ ++++++ |
|x xxx x xxxxx*x****x***********+*++**+++++++ + + +|
| |_______A______| |
| |______A_______| |
+------------------------------------------------------------+
N Min Max Median Avg Stddev
x 135 21697.58 22809.467 22321.396 22307.707 198.75011
+ 118 22200.746 23277.09 22661.4 22671.442 192.10609
Difference at 95.0% confidence
363.735 +/- 48.3345
1.63054% +/- 0.216672%
(Student's t, pooled s = 195.681)
Or when tested on Intel Alderlake, KDE Wayland:
x base
+ syncobj
+--------------------------------------------------------------+
| + |
| + + |
| + + |
| + ++ |
| ++ ++ |
| x ++ ++ |
| x x + ++ ++ |
| x xx xx x x +++++++ |
| x x xx xxx xxxx*xxx +++++++++ |
|x xx x x x xx xxxxxxxxxx*xxx****xxx +x+ ++++++++++|
| |__________A_M_______| |____A_M___| |
+--------------------------------------------------------------+
N Min Max Median Avg Stddev
x 55 7158.232 8058.753 7803.506 7754.5195 191.69526
+ 55 7801.23 8272.271 8172.435 8150.6303 105.84085
Difference at 95.0% confidence
396.111 +/- 57.8717
5.10813% +/- 0.746296%
(Student's t, pooled s = 154.838)
Scores may seem low, but I had to pin the CPU to a conservative frequency to
avoid some pretty strong thermal throttling causing wild swings within a run.
Nevertheless, the improvement is clearly visible here as well.
v2:
* Implemented review feedback - see patch change logs.
v3:
* Moved #define DRM_SYNCOBJ_FAST_PATH_ENTRIES one patch earlier for less churn.
v3.1:
* Consolidated testing results.
v4:
* Kernel test robot reports 32-bit ARM does not implement 64-bit get/put_user.
Switch to copy_to/from_user in relevant places.
v5:
* Fixed copy_from_user argument order mixup.
Cc: Maíra Canal <mcanal at igalia.com>
Tvrtko Ursulin (6):
drm/syncobj: Remove unhelpful helper
drm/syncobj: Do not allocate an array to store zeros when waiting
drm/syncobj: Avoid one temporary allocation in drm_syncobj_array_find
drm/syncobj: Avoid temporary allocation in
drm_syncobj_timeline_signal_ioctl
drm/syncobj: Add a fast path to drm_syncobj_array_wait_timeout
drm/syncobj: Add a fast path to drm_syncobj_array_find
drivers/gpu/drm/drm_syncobj.c | 277 ++++++++++++++++++----------------
1 file changed, 147 insertions(+), 130 deletions(-)
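
For readers unfamiliar with the general idea behind the "fast path" patches,
below is a userspace sketch of the pattern: small inputs are staged in an
on-stack array and only larger ones fall back to a heap allocation. This is not
the code from the series. The helper name, the threshold value and the summing
of handles are all placeholders chosen for illustration, and malloc/memcpy
stand in for the kernel allocation and user-copy helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed threshold for illustration only; not taken from the series. */
#define FAST_PATH_ENTRIES 4

/*
 * Hypothetical helper: snapshot `count` handles from `src` and return
 * their sum. *used_heap reports which path was taken. Returns -1 if the
 * slow path allocation fails.
 */
int64_t sum_handles(const uint32_t *src, size_t count, bool *used_heap)
{
	uint32_t stack_buf[FAST_PATH_ENTRIES];
	uint32_t *buf = stack_buf;
	int64_t sum = 0;

	assert(used_heap != NULL);
	*used_heap = false;

	/* Slow path: too many entries for the on-stack array. */
	if (count > FAST_PATH_ENTRIES) {
		buf = malloc(count * sizeof(*buf));
		if (!buf)
			return -1;
		*used_heap = true;
	}

	if (count)
		memcpy(buf, src, count * sizeof(*buf));

	for (size_t i = 0; i < count; i++)
		sum += buf[i];

	/* Only the slow path owns memory that needs freeing. */
	if (buf != stack_buf)
		free(buf);

	return sum;
}
```

With the assumed threshold of 4, a call with three handles stays on the stack,
while a call with six takes the allocation path. The benefit in the common case
is simply that the allocator is not touched at all.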
--
2.48.0