tdf#50916 : Calc : Dynamic column container

Markus Mohrhard markus.mohrhard at googlemail.com
Fri Oct 9 15:05:11 PDT 2015


Hey Dennis,

so I suppose it is my turn to respond. Maybe Eike and/or Kohei will jump in
as well.

On Fri, Oct 9, 2015 at 2:14 AM, Dennis Francis <dennisfrancis.in at gmail.com>
wrote:

> Hi All
>
> I have made an attempt at making the column container relatively dynamic in
> the hope of increasing the column limit to 16K. The patch I have made (
> https://gist.github.com/dennisfrancis/ba7254405f77282214bb#file-lo16kcols-patch
> ) does not have the required "big" refactoring, and has just enough
> changes to make everything compile and pass
> the unit tests. I was not sure whether to put this very incomplete version
> in gerrit.
>
> The underlying data structure is now a std::vector whose size can
> dynamically move between
> 1K(min) and 16K(max). This is wrapped in a template class called
> ScColContainer which could be
> used in places where fixed size (MAXCOLCOUNT) arrays are used. The
> underlying data structure
> can be changed relatively easily inside this template class without
> breaking the methods/operator
> signatures.
>
> Before I begin further refactoring I would like to discuss the various
> possibilities of data structures
> which you already have in mind, that could be used here instead of just a
> std::vector, and their merits, demerits etc..
>
> In the current approach, the plan is to keep the same performance
> guarantees for sheets up to 1K cols and, for sheets larger than 1K, to
> allocate only the required number of columns and iterate only over the
> column range 0 to the largest column actually present in the sheet, instead
> of iterating over a fixed col range (0 to MAXCOL).
>
> Please point out any deficiencies or blockers with my current approach and
> indicate better methods.
>
> Thanks for your time.
>
> Regards,
> Dennis
>
>
>
This is surely not what we had in mind. Basically you just wrote a wrapper
around std::vector.

There are a number of items a new design needs to address:

* decision whether to store all ScColumn instances or only filled ones

* a way to handle the increased memory load
** most likely limiting the number of initial columns

* a way to handle the performance impact of many columns
** most likely improving the iterations through all columns

* a better way to handle row formatting
** what happens if someone marks a whole row and formats it
** how to handle formatting for columns that were allocated after the user
formatted the whole row

* most likely a way to store the last formatted column and the last column
with content
** needs some inspection of which loops through the columns need which piece
of this information
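
To illustrate the last item, here is a minimal sketch of tracking the two
bounds separately. ColumnExtents and all of its members are hypothetical names
invented for this example; nothing like it is claimed to exist in the Calc
code. It only shows that a loop over content and a loop over formatting could
each ask for its own upper bound instead of running to MAXCOL.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not existing Calc code): remembers the highest column
// index that has content and the highest one that has formatting.
// -1 means "no such column yet".
class ColumnExtents
{
public:
    // Call when a cell in column nCol receives content.
    void noteContent(std::int32_t nCol)
    {
        assert(nCol >= 0);
        mnLastContent = std::max(mnLastContent, nCol);
    }

    // Call when a cell in column nCol receives explicit formatting.
    void noteFormat(std::int32_t nCol)
    {
        assert(nCol >= 0);
        mnLastFormat = std::max(mnLastFormat, nCol);
    }

    // Upper bound for loops that only care about cell content.
    std::int32_t lastContentCol() const { return mnLastContent; }

    // Upper bound for loops that only care about formatting.
    std::int32_t lastFormattedCol() const { return mnLastFormat; }

    // Upper bound for loops that need both kinds of information.
    std::int32_t lastUsedCol() const
    {
        return std::max(mnLastContent, mnLastFormat);
    }

private:
    std::int32_t mnLastContent = -1;
    std::int32_t mnLastFormat = -1;
};
```

Note that this sketch only ever grows the bounds. Shrinking them after content
or formatting is deleted would need a rescan or extra bookkeeping, which is
left out here.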


There are most likely a few more that I do not have in mind right now.

A simple high level design could be to use a std::vector<ScColumn*> and
only allocate columns that really contain content or formatting.
Additionally we'd need to introduce a way to store the format of a row.
These row formats might not be visible to the user and would be just an
internal way to handle the formatting for all not yet allocated columns.
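
A rough sketch of that idea, under several assumptions: Column is a tiny
stand-in for ScColumn, formats are plain strings, and SparseColumns with all
of its methods is a hypothetical name made up for this example. It also uses
std::unique_ptr instead of the raw ScColumn* mentioned above, purely to keep
the example free of manual cleanup.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Stand-in for ScColumn (assumption: the real class is far larger).
struct Column
{
    std::map<int, std::string> cells;   // row -> content
    std::map<int, std::string> formats; // row -> explicit cell format
};

// Hypothetical container: columns are allocated only when written to,
// and whole-row formats are stored once per row.
class SparseColumns
{
public:
    // Read access: returns nullptr for unallocated columns, never allocates.
    const Column* find(std::size_t nCol) const
    {
        return nCol < maCols.size() ? maCols[nCol].get() : nullptr;
    }

    // Write access: allocates the column on first use.
    Column& getOrCreate(std::size_t nCol)
    {
        if (nCol >= maCols.size())
            maCols.resize(nCol + 1);
        if (!maCols[nCol])
            maCols[nCol] = std::make_unique<Column>();
        return *maCols[nCol];
    }

    // Formatting a whole row: store the row format and drop older explicit
    // formats in that row, so the new format applies everywhere without
    // allocating any column.
    void setRowFormat(int nRow, const std::string& rFormat)
    {
        maRowFormats[nRow] = rFormat;
        for (auto& pCol : maCols)
            if (pCol)
                pCol->formats.erase(nRow);
    }

    // Effective format: explicit cell format, else row format, else default.
    std::string formatAt(std::size_t nCol, int nRow) const
    {
        if (const Column* pCol = find(nCol))
        {
            auto it = pCol->formats.find(nRow);
            if (it != pCol->formats.end())
                return it->second;
        }
        auto it = maRowFormats.find(nRow);
        return it != maRowFormats.end() ? it->second : std::string("default");
    }

    // Number of columns that actually own a Column object.
    std::size_t allocatedCount() const
    {
        std::size_t n = 0;
        for (const auto& pCol : maCols)
            if (pCol)
                ++n;
        return n;
    }

private:
    std::vector<std::unique_ptr<Column>> maCols;
    std::map<int, std::string> maRowFormats; // internal, not user-visible
};
```

With this layout, formatting a whole row allocates no columns, and a column
allocated later still reports the row format through formatAt() until a cell
in it gets its own format. The vector still holds one (possibly null) pointer
per index up to the highest allocated column, so it does not by itself answer
the memory question for very sparse sheets.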

Of course there might be other designs that work as well or better, but any
design needs to deal with at least the problems mentioned above.

Regards,
Markus