BinaryDataContainer swapping ? ...

Tomaž Vajngerl tomaz.vajngerl at collabora.co.uk
Mon Apr 3 12:04:05 UTC 2023


Hi,

On 03/04/2023 19:20, Noel Grandin wrote:
>
>
> On Mon, 3 Apr 2023 at 10:50, Michael Meeks 
> <michael.meeks at collabora.com> wrote:
>
>             Anyhow - I was doing some re-factoring of the
>     BinaryDataContainer to encapsulate it better; and I was thinking
>     of adding the ability to swap these out (which should be async),
>     and to then read them back in on demand which (I hope) in the era
>     of SSDs - should be extremely fast, although synchronous.
>
>
> My view of this, is that
>
> (a) we have had virtual memory for several decades now
> (b) the OS already has a swap file, and optimised paths for dealing 
> with that
>
> So we should just dump all of this explicit swapping to disk we do and 
> let the OS do its thing.

That's the current state and I would agree, but it seems there are still 
memory issues in certain situations, so something needs to be done here 
(I wonder what these images are in the first place - a ~20MB JPEG 
corresponds to roughly a 45MP photo, and I can't imagine a lot of those 
in documents).

One of the issues with letting the OS deal with all of that is that the OS 
has no idea what it can swap out and when - it just evicts pages (roughly 
LRU) when there is memory pressure, or does nothing at all. We can do this 
much more effectively and with less work, for example by not keeping the 
data in memory in the first place, but copying it to the disk storage right 
away (on document load), because we know the image is not yet used and may 
never be (if the document is large and the user never gets to the image, 
for example). We can also choose much more effectively what to swap out and 
what not, depending on the type and size of the image.
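
To make that concrete, here is a rough sketch of the idea (the names and 
types are made up for illustration, not the actual BinaryDataContainer or 
Graphic interfaces): on import the compressed stream goes straight to a 
temporary file and only a small handle stays in memory, and a simple policy 
decides which images are worth treating this way.

    // Sketch only - hypothetical types, not the real VCL API.
    #include <cstddef>
    #include <fstream>
    #include <string>
    #include <vector>

    struct SwappedImageHandle
    {
        std::string maTempPath;         // where the compressed bytes live on disk
        std::size_t mnCompressedSize = 0;
        bool mbDecoded = false;         // bitmap is only created when first painted
    };

    // On document load: copy the compressed stream to disk storage right away,
    // without decoding it - the image may never be scrolled into view.
    SwappedImageHandle storeOnLoad(const std::vector<char>& rCompressed,
                                   const std::string& rTempPath)
    {
        std::ofstream aOut(rTempPath, std::ios::binary);
        aOut.write(rCompressed.data(),
                   static_cast<std::streamsize>(rCompressed.size()));
        return { rTempPath, rCompressed.size(), /*mbDecoded*/ false };
    }

    // Policy sketch: choose what to swap out depending on type and size -
    // unlike the OS, we know whether this is an icon, an animation or a photo.
    bool shouldSwapOut(std::size_t nCompressedSize, bool bIsAnimation)
    {
        constexpr std::size_t nThreshold = 1024 * 1024; // e.g. only images > 1 MiB
        return !bIsAnimation && nCompressedSize > nThreshold;
    }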

It's also the case that we don't need to swap images out all the time, like 
the OS would. After the initial write we only ever read from the disk, 
because the disk copy can stay there until the image is unloaded from the 
model (after the document is closed, for example). We can also keep the 
compressed version in memory unless we hit high memory usage, and combined 
with mip-mapping we can keep just a low-resolution version in memory for 
the rest of the time, so together it could use only a fraction of the 
memory. The larger the image, the more memory you can save with 
mip-mapping, which is nice.
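
Rough numbers for that (purely illustrative dimensions, not measurements: a 
~45MP photo decoded to a 32-bit bitmap versus keeping only a 1/8-scale 
preview in memory):

    // Sketch only - illustrative arithmetic, not existing VCL code.
    #include <cstdint>
    #include <cstdio>

    constexpr std::uint64_t bitmapBytes(std::uint64_t nWidth, std::uint64_t nHeight)
    {
        return nWidth * nHeight * 4; // 32-bit per pixel
    }

    int main()
    {
        const std::uint64_t nFull    = bitmapBytes(8280, 5520);         // fully decoded, ~174 MiB
        const std::uint64_t nPreview = bitmapBytes(8280 / 8, 5520 / 8); // 1/64 of that, ~2.7 MiB
        std::printf("full: %llu MiB, 1/8-scale preview: %llu MiB\n",
                    static_cast<unsigned long long>(nFull >> 20),
                    static_cast<unsigned long long>(nPreview >> 20));
    }

So a scaled-down in-memory copy plus the compressed original on disk is a 
small fraction of a fully decoded bitmap, and the saving grows with the 
image size.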

I did a lot of refactoring of how Graphic stores the binary data (much of 
it on my own time) so that we don't have unnecessary copies of the data all 
around, so that swapping is now relatively easy to implement, and to 
prepare for nicer on-demand loading and other ideas.

Tomaž

