BinaryDataContainer swapping? ...
Tomaž Vajngerl
tomaz.vajngerl at collabora.co.uk
Mon Apr 3 12:04:05 UTC 2023
Hi,
On 03/04/2023 19:20, Noel Grandin wrote:
>
>
> On Mon, 3 Apr 2023 at 10:50, Michael Meeks
> <michael.meeks at collabora.com> wrote:
>
> Anyhow - I was doing some re-factoring of the
> BinaryDataContainer to encapsulate it better; and I was thinking
> of adding the ability to swap these out (which should be async),
> and to then read them back in on demand which (I hope) in the era
> of SSDs - should be extremely fast, although synchronous.
>
>
> My view of this, is that
>
> (a) we have had virtual memory for several decades now
> (b) the OS already has a swap file, and optimised paths for dealing
> with that
>
> So we should just dump all of this explicit swapping to disk we do and
> let the OS do its thing.
That's the current state and I would agree, but it seems there are
still memory issues in certain situations, so something needs to be done
here. (I do wonder what these images are in the first place - a 20MB
JPEG file corresponds to roughly a 45MP photo, and I can't imagine there
are many of those in documents.)
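(For rough scale, assuming a typical JPEG compression of around 0.45
bytes per pixel: 45 million pixels compress to roughly 20MB on disk,
while the same image decoded to 32-bit RGBA is 45,000,000 x 4 bytes,
about 180MB - so the decoded bitmap, not the file, is the real memory
cost.)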
One of the issues with letting the OS deal with all of that is that the
OS has no idea what it can swap out and when - it just uses LRU when
there is memory pressure, or does nothing at all. We can do it much more
effectively and with less work: for example, not keep the data in memory
in the first place, but copy it to the disk storage right away (on
document load), because we know the image is not yet used and may never
be (if the document is large and the user never gets to the image, for
example). We can also choose more effectively what to swap out and what
not, depending on the type and size of the image - a sketch of that
load-time path is below.
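To make that concrete, here is a minimal sketch in plain C++ -
hypothetical names, not the actual BinaryDataContainer interface - of
the load-time path: the bytes go straight to a backing file, nothing is
kept in memory, and the first real access does a synchronous read-back
(cheap on an SSD):

    #include <cstddef>
    #include <cstdio>
    #include <fstream>
    #include <string>
    #include <vector>

    // Hypothetical sketch, not the real API: compressed bytes are
    // streamed to a backing file on document load and only read back
    // the first time something actually needs them.
    class SwappedData
    {
        std::string maBackingPath;  // file written at load time
        std::vector<char> maBuffer; // stays empty until first access
        bool mbLoaded = false;

    public:
        // Document load: write to disk, keep nothing in memory,
        // because the image may never be painted at all.
        SwappedData(std::string aBackingPath, const std::vector<char>& rBytes)
            : maBackingPath(std::move(aBackingPath))
        {
            std::ofstream aOut(maBackingPath, std::ios::binary);
            aOut.write(rBytes.data(),
                       static_cast<std::streamsize>(rBytes.size()));
        }

        // First use: synchronous read-back from the backing file.
        const std::vector<char>& getData()
        {
            if (!mbLoaded)
            {
                std::ifstream aIn(maBackingPath,
                                  std::ios::binary | std::ios::ate);
                maBuffer.resize(static_cast<size_t>(aIn.tellg()));
                aIn.seekg(0);
                aIn.read(maBuffer.data(),
                         static_cast<std::streamsize>(maBuffer.size()));
                mbLoaded = true;
            }
            return maBuffer;
        }

        // Dropping the in-memory copy is always safe: the backing
        // file stays valid until the image is unloaded from the model.
        void unload()
        {
            maBuffer.clear();
            maBuffer.shrink_to_fit();
            mbLoaded = false;
        }

        // Unloaded from the model (document closed): remove the file.
        ~SwappedData() { std::remove(maBackingPath.c_str()); }
    };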
Then there's also the fact that we don't need to swap images out all
the time, like the OS would. We only ever need to read from disk,
because the on-disk copy can be removed only once the image is unloaded
from the model (after the document is closed, for example), so it stays
valid for re-reading. We can also keep the compressed version in memory
unless we hit high memory usage, and combined with mip-mapping we can
keep just a low-resolution version in memory for the remainder of the
time, which together could use only a fraction of the memory. The larger
the file, the more memory you can save with mip-mapping, which is nice -
see the rough numbers below.
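A rough illustration of the mip-mapping arithmetic (the 8192x5500 size
is just an assumed ~45MP example): every halving of width and height
divides the bitmap footprint by four, so a few levels down the
in-memory preview costs almost nothing:

    #include <cstddef>
    #include <cstdio>

    // Bytes needed for an uncompressed bitmap of the given size.
    static size_t bitmapBytes(size_t nWidth, size_t nHeight,
                              size_t nBytesPerPixel)
    {
        return nWidth * nHeight * nBytesPerPixel;
    }

    int main()
    {
        const size_t nW = 8192, nH = 5500; // ~45MP photo, 32-bit pixels
        for (int nLevel = 0; nLevel <= 4; ++nLevel)
        {
            size_t nBytes = bitmapBytes(nW >> nLevel, nH >> nLevel, 4);
            std::printf("mip level %d: %.1f MB\n", nLevel,
                        nBytes / (1024.0 * 1024.0));
        }
        // level 0: ~171.9 MB ... level 4: ~0.7 MB, so a low-resolution
        // level in memory plus the compressed file on disk covers most
        // uses at a tiny fraction of the full decode.
        return 0;
    }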
I did many refactorings of how Graphic stores the binary data (many
even in my own time), so we no longer have unnecessary copies of the
data all around, swapping is now relatively easy to implement, and the
code is prepared for nicer on-demand loading and other ideas.
Tomaž