tdf#50916 : Calc : Dynamic column container

Dennis Francis dennisfrancis.in at gmail.com
Sat Oct 10 06:30:11 PDT 2015


Hi Markus

Thanks for your valuable feedback !

1.
"""
A simple high level design could be to use a std::vector<ScColumn*> and
only allocate columns that really contain content or formatting.
Additionally we'd need to introduce a way to store the format of a row.
These row formats might not be visible to the user and would be just an
internal way to handle the formatting for all not yet allocated columns.
"""

Let me make sure I get this idea right -

The aCol member in ScTable is declared as :
*std::vector<ScColumn*> aCol;*  // Or can have wrapper on it own
and in the ScTable ctor aCol is init with a minimum number (or possibly
16K) of NULL pointer elements ie no ScColumn instance is allocated yet.
If we allocate space for all pointers in aCol vector then I think it might
take around 128 KB just for pointers,
(point 1.A) so might want to allocate that too just when it is needed,
hence avoid the need to keep a int var for keeping the index of last non
null element.

While loading a sheet or user types in something in ith column for the
first time, allocate ScColumn


*if ( ! aCol[i] ){*

*  aCol[i] = new ScColumn;*

*  // Apply any full row formattings to aCol[i]*

*}*
2.
Correct me if I am wrong - I guess the formatting of each cell in a column
is stored in pAttrArray member of ScColumn. So for storing full row
formatting, we need
a data structure that map from row number to the formatting info.
Definitely need your inputs and code pointers on this.



This is surely not what we had in mind. Basically you just wrote a wrapper
> around std::vector.
>
> There are a number of items a new design need:
>
> * decision whether to store all ScColumn instances or only filled ones
>

3.
If we use just std::vector<ScColumn>, we have to allocate n ScColumns if
there are totally n columns in the sheet including blank ones upto the last
non-blank column.
If we use std::vector<ScColumn*> we need to allocate ScColumn's for only
the non blank columns in the document - definitely better.


> * a way to handle the increased memory load
> ** most likely limiting the number of initial columns
>

See point 1.

4.

> * a way to handle the performance impact of many columns
> ** most likely improving the iterations through all columns
>

If we use std::vector<ScColumn>, We can trim the for loops to run from 0 up
to current length of aCol.
If we use std::vector<ScColumn*>, We can trim the for loops to run from 0
up to current length of aCol and also skip running the inner logic whenever
aCol[i] == NULL, there by saving time, but not sure we can convert all
column for loops like that offhand.


>
> * a better way to handle row formattings
> ** what happens if someone marks a whole row and formats it
> ** how to handle formatting for columns that were allocated after the user
> formatted the whole row
>
> See point 2.


> * most likely a way to store the last formatted column and the last column
> with content
> ** needs some inspecting which loops through the columns need which of the
> information
>
> See point 1.A, using this we can save time wasted on iterating over blank
columns(aCol[i] == NULL).
Could you please explain why we need to remember "last formatted column" ?


> Most likely a few more that I have not in mind right now.
>

Will surely wait for others to comment before proceeding any further.


Thanks,
Dennis

www.ldcs.co.in
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freedesktop.org/archives/libreoffice/attachments/20151010/fec5d8c9/attachment.html>


More information about the LibreOffice mailing list