Fw: benchmark of Excel, Calc, Google Docs

Aditya Parameswaran adityagp at berkeley.edu
Fri Dec 13 22:22:31 UTC 2019


Michael,

We'd love to meet and discuss!  Unfortunately, a lot of us are off for
break starting next week so it might be best to sync up early next year.
Would the week of the 6th work for you? 8am PT/10am CT/4pm GMT on any day
should work!

> > We started by using the relational database as a simple persistent
> > storage layer which, when coupled with an index to retrieve data by
> > position, lets us scroll through large datasets of billions of rows with
> > ease. We developed a new positional index to handle insertions and
> > deletions in O(log(n)) -- https://arxiv.org/pdf/1708.06712.pdf. I agree
> > that pushing the computation to the relational database does have
> > overheads; but at the same time, it allows for scaling to arbitrarily
> > large datasets.
>
>         Ooh - nice paper. Your crawled data-set looks quite interesting
> too; we run wide-scale crash-testing on the LibreOffice code-base across
> ~100k files, and enlarging our corpus there, or better, getting some
> statistical view of which OOXML attributes (and thus features) are most
> used out there, would be extremely useful to us as we develop the core.
>
>         I like the data on spreadsheet and formula shape - that is very
> useful. Do you have data on the geometry of formulae - as in rows vs.
> columns ? [ we switched to columnar storage based mostly on experience
> rather than hard data ;-].
>
>         It is also interesting to have access to very large (1.3m row)
> data-sets that can have useful analysis done on them - would love to see
> the source data there.
>

Again, this is something that we'd be happy to share; this might just take
a bit more work since it's an older codebase.
I believe we did use the geometry of the formulae to determine the best
storage representation, so it's there somewhere :-)
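
For readers following the thread, the idea of a positional index with
O(log(n)) insertions and deletions (quoted above) can be illustrated with a
minimal sketch. This is not the structure from the linked paper: it swaps in
an implicit treap, which gives expected rather than worst-case O(log(n))
per operation. The class name PositionalIndex, its methods, and the row keys
are hypothetical, chosen only for illustration; the stored values stand in
for row identifiers in a relational table.

```python
import random


class _Node:
    __slots__ = ("value", "prio", "size", "left", "right")

    def __init__(self, value, rng):
        self.value = value        # e.g. a row key in the backing table
        self.prio = rng.random()  # random heap priority keeps the tree balanced
        self.size = 1             # number of nodes in this subtree
        self.left = None
        self.right = None


def _size(node):
    return node.size if node else 0


def _pull(node):
    node.size = 1 + _size(node.left) + _size(node.right)
    return node


def _split(node, k):
    """Split a subtree into (first k items, remaining items)."""
    if node is None:
        return None, None
    if _size(node.left) >= k:
        a, b = _split(node.left, k)
        node.left = b
        return a, _pull(node)
    a, b = _split(node.right, k - _size(node.left) - 1)
    node.right = a
    return _pull(node), b


def _merge(a, b):
    """Concatenate two subtrees, every item of a before every item of b."""
    if a is None:
        return b
    if b is None:
        return a
    if a.prio > b.prio:
        a.right = _merge(a.right, b)
        return _pull(a)
    b.left = _merge(a, b.left)
    return _pull(b)


class PositionalIndex:
    """Hypothetical position -> row-key map (illustrative only)."""

    def __init__(self, seed=0):
        self._root = None
        self._rng = random.Random(seed)

    def __len__(self):
        return _size(self._root)

    def insert(self, pos, value):
        """Insert value so that it ends up at 0-based position pos."""
        if not 0 <= pos <= len(self):
            raise IndexError(pos)
        a, b = _split(self._root, pos)
        self._root = _merge(_merge(a, _Node(value, self._rng)), b)

    def delete(self, pos):
        """Remove and return the value at position pos."""
        if not 0 <= pos < len(self):
            raise IndexError(pos)
        a, rest = _split(self._root, pos)
        mid, b = _split(rest, 1)
        self._root = _merge(a, b)
        return mid.value

    def get(self, pos):
        """Return the value at position pos without modifying the index."""
        if not 0 <= pos < len(self):
            raise IndexError(pos)
        node = self._root
        while True:
            left = _size(node.left)
            if pos < left:
                node = node.left
            elif pos == left:
                return node.value
            else:
                pos -= left + 1
                node = node.right
```

Because positions are derived from subtree sizes rather than stored, an
insert or delete in the middle shifts every later row implicitly, without
rewriting any stored position.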

>         Sounds good, cf. above - if we can't make that - early in the new
> year would be great.
>
>         I look forward to talking,
>

Likewise!

Aditya