[Libreoffice-bugs] [Bug 109097] New: mdds accelerating random lookups
bugzilla-daemon at bugs.documentfoundation.org
Thu Jul 13 11:25:37 UTC 2017
https://bugs.documentfoundation.org/show_bug.cgi?id=109097
Bug ID: 109097
Summary: mdds accelerating random lookups
Product: LibreOffice
Version: 5.3.3.1 rc
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: enhancement
Priority: medium
Component: Calc
Assignee: libreoffice-bugs at lists.freedesktop.org
Reporter: michael.meeks at collabora.com
mdds provides excellent, compact and high-performance storage for
lots of cases; however, we have a few files where the performance of
lookups in mdds could be improved. These involve calculating
formulae in sheets that have rather sparse data in columns:
many blocks of blanks interspersed with data.
I understand that the calc team have spent a lot of time fixing the
root causes of many of these issues by using better iterators down
columns and so on, but for this specific case - having better "random
access" performance could be extremely helpful.
ScColumn::GetCellValue
is the majority of this cost - called ultimately from ScInterpreter.
I wonder if we could in fact prototype this inside
ScColumn::GetCellValue; thoughts appreciated. Hopefully this would
lead to fixing the main random access problem, while keeping examples
of poor iteration visible in the profiles.
Code pointers:
workdir/UnpackedTarball/mdds/include/mdds/multi_type_vector.hpp
sc/inc/mtvelements.hxx
typedef mdds::multi_type_vector<CellFunc, CellStoreEvent> CellStoreType;
CellStoreType::const_iterator itBlk = rSrc.begin(), itBlkEnd = rSrc.end();
The problem here being that:
struct block
{
    size_type m_size;
    element_block_type* mp_data;
};
So each block has no idea of its absolute position - which makes a lot
of sense for lots of cases such as mutating the sheet - inserting rows
etc. otherwise we could store absolute position, and size as a
difference to the next/terminating block.
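To make the trade-off concrete, here is a minimal sketch of the two layouts. It is not mdds code: SizeOnlyBlock, findBlockLinear and findBlockBinary are hypothetical names, and the second layout is assumed to be a plain sorted vector of start rows ending in a terminating entry that holds the total row count.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Simplified stand-in for the mdds block: it knows only its own length.
struct SizeOnlyBlock
{
    std::size_t size;
};

// Size-only layout: walk the blocks and accumulate sizes until the row is
// covered. Cost grows linearly with the number of blocks in the column.
// Returns the block index, or blocks.size() if the row is past the end.
std::size_t findBlockLinear(const std::vector<SizeOnlyBlock>& blocks, std::size_t row)
{
    std::size_t start = 0;
    for (std::size_t i = 0; i < blocks.size(); ++i)
    {
        if (row < start + blocks[i].size)
            return i;
        start += blocks[i].size;
    }
    return blocks.size();
}

// Absolute-position layout (assumed): starts[i] is the first row of block i,
// and the last entry is a terminator equal to the total number of rows.
// A binary search finds the block in logarithmic time.
// Returns the block index, or the number of blocks if the row is past the end.
std::size_t findBlockBinary(const std::vector<std::size_t>& starts, std::size_t row)
{
    if (starts.empty())
        return 0;
    if (row >= starts.back())
        return starts.size() - 1;
    // First entry strictly greater than row; the block before it contains row.
    auto it = std::upper_bound(starts.begin(), starts.end(), row);
    return static_cast<std::size_t>(it - starts.begin()) - 1;
}
```

With blocks of sizes 3, 5 and 2 the start vector would be {0, 3, 8, 10}, and both functions agree on every row. The flip side is the one described above: inserting rows in the size-only layout touches one block, while the positioned layout has to shift every later start.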
From an optimization trade-off perspective, optimizing for inserts in
the middle of sheets at the expense of individual cell lookups, or of
file-filters appending data, is probably not deliberate.
Of course - failing this, during very 'random' access patterns such as
calculation; we could carry around some sort of lookup / caches of
common columns we like to get data from.
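One possible shape for such a cache, as a rough sketch only: remember the block that satisfied the last lookup and its start row, and resume the scan from there when the next row is at or beyond it. CachedBlockLookup and its members are hypothetical names, the column is modelled as a bare vector of block sizes, and the step counter exists only to make the saving visible.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical per-column lookup cache over size-only blocks.
class CachedBlockLookup
{
public:
    explicit CachedBlockLookup(std::vector<std::size_t> sizes)
        : m_sizes(std::move(sizes))
    {
    }

    // Returns the index of the block containing row, or the number of
    // blocks if row is past the end.
    std::size_t find(std::size_t row)
    {
        std::size_t i = 0;
        std::size_t start = 0;
        // Resume from the cached block when the row is not before it.
        if (row >= m_cachedStart)
        {
            i = m_cachedIndex;
            start = m_cachedStart;
        }
        for (; i < m_sizes.size(); ++i)
        {
            if (row < start + m_sizes[i])
            {
                m_cachedIndex = i;
                m_cachedStart = start;
                return i;
            }
            start += m_sizes[i];
            ++m_steps; // one block stepped past
        }
        return m_sizes.size();
    }

    // Must be called whenever the block layout changes (insert, delete...).
    void invalidate()
    {
        m_cachedIndex = 0;
        m_cachedStart = 0;
    }

    // Total number of blocks stepped past so far, for measurement only.
    std::size_t steps() const { return m_steps; }

private:
    std::vector<std::size_t> m_sizes;
    std::size_t m_cachedIndex = 0;
    std::size_t m_cachedStart = 0;
    std::size_t m_steps = 0;
};
```

This only helps when successive lookups are near each other or move forwards; a lookup before the cached block falls back to a scan from the top. Whether real calculation access patterns look like that is exactly what logging would need to show.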
Aron - any chance you can add some SAL_DEBUG() lines to that pivot
sheet to print out all the arguments to: ScColumn::GetCellValue -
including the column's members, nCol, nTab and the nRow passed in ?
Hopefully that will give us a big log to see whether there are indeed
any patterns in at least this case.
And of course, probably this is a duplicate ticket =) Thoughts ?