<html>
<head>
<base href="https://bugs.documentfoundation.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_UNCONFIRMED "
title="UNCONFIRMED - mdds accelerating random lookups"
href="https://bugs.documentfoundation.org/show_bug.cgi?id=109097">109097</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>mdds accelerating random lookups
</td>
</tr>
<tr>
<th>Product</th>
<td>LibreOffice
</td>
</tr>
<tr>
<th>Version</th>
<td>5.3.3.1 rc
</td>
</tr>
<tr>
<th>Hardware</th>
<td>All
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>UNCONFIRMED
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>medium
</td>
</tr>
<tr>
<th>Component</th>
<td>Calc
</td>
</tr>
<tr>
<th>Assignee</th>
<td>libreoffice-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>michael.meeks@collabora.com
</td>
</tr></table>
<p>
<div>
<pre>mdds provides excellent, compact and high-performance storage for
lots of cases. However, we have a few files where the performance of
lookups in mdds could be improved. These involve calculation of
formulae in sheets that have rather sparse data in columns -
many blocks of blanks interspersed with data.
I understand that the calc team have spent a lot of time fixing the
root causes of many of these issues by using better iterators down
columns and so on, but for this specific case, better "random
access" performance could be extremely helpful.
ScColumn::GetCellValue accounts for the majority of this cost - it is
called ultimately from ScInterpreter.
I wonder if we could in fact prototype this inside
ScColumn::GetCellValue (?) - thoughts appreciated. Hopefully this would
fix the main random access problem, while keeping examples of poor
iteration visible in the profiles.
Code pointers:
workdir/UnpackedTarball/mdds/include/mdds/multi_type_vector.hpp
sc/inc/mtvelements.hxx
typedef mdds::multi_type_vector<CellFunc, CellStoreEvent> CellStoreType;
CellStoreType::const_iterator itBlk = rSrc.begin(), itBlkEnd = rSrc.end();
The problem here is that:
struct block
{
    size_type m_size;            // number of elements in this block
    element_block_type* mp_data; // element storage for this block
};
So each block has no idea of its absolute position. That makes a lot
of sense for cases that mutate the sheet, such as inserting rows.
Otherwise we could store the absolute position, and derive the size
as the difference to the next/terminating block.
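To make the trade-off concrete, here is a minimal standalone sketch. It
does not use mdds: SizeBlock, PosBlock, findBlockLinear, findBlockBinary
and withPositions are hypothetical names made up for illustration. With
sizes only, finding the block for a row means summing sizes from the
start; with absolute positions, a binary search is possible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the size-only block above (not the mdds type).
struct SizeBlock
{
    std::size_t m_size;
};

// Hypothetical variant that also records its absolute start row.
struct PosBlock
{
    std::size_t m_position; // first row covered by this block
    std::size_t m_size;
};

// Size-only layout: accumulate sizes until the row is covered.
// Cost grows linearly with the number of blocks.
// Returns blocks.size() if the row is past the end.
std::size_t findBlockLinear(const std::vector<SizeBlock>& blocks, std::size_t row)
{
    std::size_t start = 0;
    for (std::size_t i = 0; i < blocks.size(); ++i)
    {
        if (row < start + blocks[i].m_size)
            return i;
        start += blocks[i].m_size;
    }
    return blocks.size();
}

// Position-aware layout: binary search on the start rows.
// Cost grows logarithmically with the number of blocks.
std::size_t findBlockBinary(const std::vector<PosBlock>& blocks, std::size_t row)
{
    // First block starting after 'row'; the block before it is the candidate.
    auto it = std::upper_bound(blocks.begin(), blocks.end(), row,
        [](std::size_t r, const PosBlock& b) { return r < b.m_position; });
    if (it == blocks.begin())
        return blocks.size();
    --it;
    if (row >= it->m_position + it->m_size)
        return blocks.size();
    return static_cast<std::size_t>(it - blocks.begin());
}

// Derive the position-aware layout from the size-only one.
std::vector<PosBlock> withPositions(const std::vector<SizeBlock>& blocks)
{
    std::vector<PosBlock> out;
    std::size_t start = 0;
    for (const SizeBlock& b : blocks)
    {
        out.push_back({start, b.m_size});
        start += b.m_size;
    }
    return out;
}
```

The cost moves to mutation: inserting rows in the middle would require
updating m_position in every later block.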
From an optimizing trade-off perspective, optimizing for inserts in
the middle of sheets at the expense of individual cell lookups, or of
file-filters appending data, is probably not deliberate.
Of course - failing this, during very 'random' access patterns such as
calculation, we could carry around some sort of lookup cache for the
columns we commonly get data from.
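One possible shape for such a cache, as a rough standalone sketch: keep
the last block hit and its start row per column, and resume the scan
from there when the next row is at or after it. BlockLookupCache and
SizeBlock are hypothetical names, not mdds or Calc types, and the step
counter exists only to show the effect.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a size-only block (not the mdds type).
struct SizeBlock
{
    std::size_t m_size;
};

// Hypothetical per-column cache remembering the last block that was hit.
// It must be discarded whenever the block list is modified.
class BlockLookupCache
{
public:
    explicit BlockLookupCache(const std::vector<SizeBlock>& blocks)
        : m_blocks(blocks)
    {
    }

    // Returns the index of the block containing 'row',
    // or the number of blocks if the row is out of range.
    std::size_t find(std::size_t row)
    {
        std::size_t i = 0;
        std::size_t start = 0;
        if (row >= m_start)
        {
            // Resume from the cached block instead of the first one.
            i = m_index;
            start = m_start;
        }
        for (; i < m_blocks.size(); ++i)
        {
            if (row < start + m_blocks[i].m_size)
            {
                m_index = i;
                m_start = start;
                return i;
            }
            start += m_blocks[i].m_size;
            ++m_steps; // one block skipped
        }
        return m_blocks.size();
    }

    // Total number of blocks skipped so far, for measurement only.
    std::size_t steps() const { return m_steps; }

private:
    const std::vector<SizeBlock>& m_blocks;
    std::size_t m_index = 0; // index of the last block hit
    std::size_t m_start = 0; // absolute start row of that block
    std::size_t m_steps = 0;
};
```

This only helps when successive lookups land near or after the previous
one. Lookups before the cached block fall back to a scan from the start,
so whether it pays off depends on the access patterns seen in practice.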
Aron - any chance you can add some SAL_DEBUG() lines to that pivot
sheet to print out all the arguments to ScColumn::GetCellValue -
including the column's members nCol and nTab, and the nRow passed in?
Hopefully that will give us a big log to see if there are indeed any
patterns in at least this case.
And of course, this is probably a duplicate ticket =) Thoughts?</pre>
</div>
</p>
</body>
</html>