[Mesa-dev] [PATCH 0/4] improve buffer cache and reuse
Eero Tamminen
eero.t.tamminen at intel.com
Wed May 2 11:18:21 UTC 2018
Hi,
On 02.05.2018 02:25, James Xiong wrote:
> From: "Xiong, James" <james.xiong at intel.com>
>
> With the current implementation, brw_bufmgr may round up a request
> size to the next bucket size, resulting in 25% more memory allocated
> in the worst-case scenario. For example:
>     Request size    Actual size
>     32KB+1Byte      40KB
>     .
>     8MB+1Byte       10MB
>     .
>     96MB+1Byte      112MB
> This series aligns the buffer size up to a page instead of a bucket
> size to improve memory allocation efficiency. Performance is almost
> the same with Basemark ES3, GfxBench4 and 5:
>
> Basemark ES3
>   score                            peak memory allocation
>   before     after      diff       before     after      diff
>   21.537462  21.888784  1.61%      419766272  408809472  -10956800
>   19.566198  19.763429  1.00%
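The rounding described in the quoted cover letter can be sketched as follows. This is a minimal illustration, not the brw_bufmgr code: it assumes a 4 KB page and a bucket layout of 4 KB, 8 KB, 12 KB and then four evenly spaced buckets per power of two, which is consistent with the example sizes quoted above. The function names are hypothetical.

```python
PAGE = 4096  # assumed page size in bytes


def bucket_size(request):
    """Smallest bucket >= request under the assumed bucket layout."""
    # Assumed small buckets: 1, 2 and 3 pages.
    for small in (PAGE, 2 * PAGE, 3 * PAGE):
        if request <= small:
            return small
    # Assumed larger buckets: P, 1.25*P, 1.5*P, 1.75*P for each power of two P.
    base = 4 * PAGE
    while True:
        for quarters in range(4):
            size = base + base * quarters // 4
            if request <= size:
                return size
        base *= 2


def page_aligned(request):
    """Round the request up to the next multiple of the page size."""
    return (request + PAGE - 1) // PAGE * PAGE


if __name__ == "__main__":
    KB, MB = 1024, 1024 * 1024
    for req in (32 * KB + 1, 8 * MB + 1, 96 * MB + 1):
        print(req, bucket_size(req) // KB, "KB bucket,",
              page_aligned(req) // KB, "KB page-aligned")
```

Under these assumptions a request of 32KB+1Byte lands in the 40KB bucket, while page alignment would give 36KB.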
What memory are you measuring:
* VmSize (not that relevant unless you're running out of address space)?
* PrivateDirty (listed in /proc/PID/smaps and e.g. by the "smem" tool [1])?
* total of allocation sizes used by Mesa?
Or something else?
In general, unused memory isn't much of a problem, only dirty (written)
memory is. The kernel maps all unused memory to a single zero page, so
unused memory takes only a few bytes of RAM for the page table entries
(required for tracking the allocated pages).
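As a rough sketch of the private dirty measurement, the per-mapping "Private_Dirty:" fields in /proc/PID/smaps (values are in kB) can be summed like this. The function names are made up for illustration, and this is not how "smem" itself is implemented.

```python
def private_dirty_kb(smaps_text):
    """Sum all 'Private_Dirty:' fields (in kB) from smaps-formatted text."""
    total = 0
    for line in smaps_text.splitlines():
        if line.startswith("Private_Dirty:"):
            # Line looks like: "Private_Dirty:        12 kB"
            total += int(line.split()[1])
    return total


def process_private_dirty_kb(pid):
    """Read /proc/<pid>/smaps (Linux only) and return private dirty kB."""
    with open(f"/proc/{pid}/smaps") as f:
        return private_dirty_kb(f.read())
```

For example, process_private_dirty_kb(os.getpid()) would report the value for the calling process, after importing os.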
> GfxBench 4.0
>                   score                                      peak memory
>                   before           after            diff     before     after      diff
> gl_4              564.6052246094   565.2348632813   0.11%    578490368  550199296  -28291072
> gl_4_off          727.0440063477   703.5833129883   -3.33%   629501952  598216704  -31285248
> gl_manhattan      1053.4223632813  1057.3690185547  0.37%    449568768  421134336  -28434432
> gl_trex           2708.0656738281  2699.2646484375  -0.33%   130076672  125042688  -5033984
> gl_alu2           1207.1490478516  1212.2220458984  0.42%    55496704   55029760   -466944
> gl_driver2        103.0383071899   103.5478439331   0.49%    13107200   12980224   -126976
> gl_manhattan_off  1703.4780273438  1736.9074707031  1.92%    490016768  456548352  -33468416
> gl_trex_off       2951.6809082031  3058.5422363281  3.49%    157511680  152260608  -5251072
> gl_alu2_off       2604.0903320313  2626.2524414063  0.84%    86130688   85483520   -647168
> gl_driver2_off    204.0173187256   207.0510101318   1.47%    40869888   40615936   -253952
You're missing information on:
* which platform you did the testing on (affects variance),
* how many test rounds you ran, and
* what your variance is
-> I don't know whether your numbers are just random noise.
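To illustrate what reporting variance could look like, here is a small sketch that summarizes several rounds per configuration and applies a crude noise check. The overlap rule is only a rule of thumb chosen for the example, not a proper statistical test, and the function names are hypothetical.

```python
import statistics


def summarize(scores):
    """Return (mean, sample standard deviation) for one set of test rounds."""
    return statistics.mean(scores), statistics.stdev(scores)


def looks_like_noise(before, after):
    """Crude check: is the mean difference within the combined spread?"""
    mean_b, sd_b = summarize(before)
    mean_a, sd_a = summarize(after)
    return abs(mean_a - mean_b) <= sd_b + sd_a
```

With at least two rounds per configuration, a difference smaller than the combined standard deviations would be hard to distinguish from run-to-run variation.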
Memory is allocated in pages from the kernel, so there's no point in
showing its usage in bytes. Please use KB, that's more readable.
(Because of randomness, e.g. interactions with the windowing system,
there can also be some variance in process memory usage, which may
also be useful to report.)
Because of the variance, you don't need that many decimals for the
scores. Removing the extra ones makes the data a bit more readable too.
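A sketch of the suggested presentation, using the gl_trex row from the quoted GfxBench 4.0 table as input: memory is converted from bytes to KB and scores are printed without decimals. The helper names are invented for this example.

```python
def to_kb(nbytes):
    """Convert a byte count to whole KB (page-sized values divide evenly)."""
    return nbytes // 1024


def format_row(name, score_before, score_after, bytes_before, bytes_after):
    """Format one result row with rounded scores and memory in KB."""
    diff_pct = (score_after - score_before) / score_before * 100
    kb_before, kb_after = to_kb(bytes_before), to_kb(bytes_after)
    return (f"{name} {score_before:.0f} {score_after:.0f} {diff_pct:+.1f}% "
            f"{kb_before} {kb_after} {kb_after - kb_before:+d}")


if __name__ == "__main__":
    print(format_row("gl_trex", 2708.0656738281, 2699.2646484375,
                     130076672, 125042688))
```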
- Eero
[1] "smem" is a Python-based tool available at least in Debian.
If you want something simpler, e.g. a shell script that works with
minimal shells like BusyBox, you can use this:
https://github.com/maemo-tools-old/sp-memusage/blob/master/scripts/mem-smaps-private
> GfxBench 5.0
>           score           peak memory
>           before  after   before      after       diff
> gl_5      259     259     1137549312  1038286848  -99262464
> gl_5_off  297     297     1170853888  1071357952  -99495936
>
> Xiong, James (4):
> i965/drm: Reorganize code for the next patch
> i965/drm: Round down buffer size and calculate the bucket index
> i965/drm: Searching for a cached buffer for reuse
> i965/drm: Purge the bucket when its cached buffer is evicted
>
> src/mesa/drivers/dri/i965/brw_bufmgr.c | 139 ++++++++++++++++++---------------
> src/util/list.h | 5 ++
> 2 files changed, 79 insertions(+), 65 deletions(-)
>