<html> <head> <base href="https://bugs.documentfoundation.org/"> </head> <body><table border="1" cellspacing="0" cellpadding="8"> <tr> <th>Bug ID</th> <td><a class="bz_bug_link bz_status_UNCONFIRMED " title="UNCONFIRMED - mdds accelerating random lookups" href="https://bugs.documentfoundation.org/show_bug.cgi?id=109097">109097</a> </td> </tr> <tr> <th>Summary</th> <td>mdds accelerating random lookups </td> </tr> <tr> <th>Product</th> <td>LibreOffice </td> </tr> <tr> <th>Version</th> <td>5.3.3.1 rc </td> </tr> <tr> <th>Hardware</th> <td>All </td> </tr> <tr> <th>OS</th> <td>All </td> </tr> <tr> <th>Status</th> <td>UNCONFIRMED </td> </tr> <tr> <th>Severity</th> <td>enhancement </td> </tr> <tr> <th>Priority</th> <td>medium </td> </tr> <tr> <th>Component</th> <td>Calc </td> </tr> <tr> <th>Assignee</th> <td>libreoffice-bugs@lists.freedesktop.org </td> </tr> <tr> <th>Reporter</th> <td>michael.meeks@collabora.com </td> </tr></table> <p> <div> <pre>mdds provides excellent, compact and high-performance storage for lots of cases; however, we have a few files where the performance of lookups in mdds could be improved. These involve calculation of formulae in sheets that have rather sparse data in columns, in particular many blocks of blanks interspersed with data.

I understand that the calc team have spent a lot of time fixing the root causes of many of these issues by using better iterators down columns and so on, but for this specific case, better "random access" performance could be extremely helpful.

ScColumn::GetCellValue accounts for the majority of this cost; it is ultimately called from ScInterpreter. I wonder if we could in fact prototype this inside ScColumn::GetCellValue (?) - thoughts appreciated.

Hopefully this would lead to fixing the main random access problem, while keeping examples of poor iteration visible in the profiles.
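
To make the cost concrete, here is a minimal standalone sketch, not mdds or Calc code. It assumes a column is held as a sequence of blocks that each know only their own length, so a random lookup has to walk the blocks in front of the row, and it compares that with a side index of absolute start rows that can be binary searched. All names (SizeOnlyBlock, findBlockLinear, buildStartIndex, findBlockIndexed) are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a column block: it records its length only,
// not the row at which it starts.
struct SizeOnlyBlock
{
    std::size_t size;
    bool hasData; // false = run of blank cells
};

// Linear walk: accumulate block sizes until the block containing nRow is
// found. The cost grows with the number of blocks in front of nRow.
// Returns the block index, or blocks.size() if nRow is past the end.
std::size_t findBlockLinear(const std::vector<SizeOnlyBlock>& blocks, std::size_t nRow)
{
    std::size_t start = 0;
    for (std::size_t i = 0; i < blocks.size(); ++i)
    {
        if (nRow < start + blocks[i].size)
            return i;
        start += blocks[i].size;
    }
    return blocks.size();
}

// Side index of absolute start rows: one entry per block plus a final
// entry holding the total row count. It must be rebuilt (or patched)
// whenever rows are inserted or removed, which is the trade-off.
std::vector<std::size_t> buildStartIndex(const std::vector<SizeOnlyBlock>& blocks)
{
    std::vector<std::size_t> starts;
    starts.reserve(blocks.size() + 1);
    std::size_t start = 0;
    for (const SizeOnlyBlock& b : blocks)
    {
        starts.push_back(start);
        start += b.size;
    }
    starts.push_back(start);
    return starts;
}

// Binary search over the start rows: O(log n) in the number of blocks.
// Returns the block index, or the block count if nRow is past the end.
std::size_t findBlockIndexed(const std::vector<std::size_t>& starts, std::size_t nRow)
{
    if (starts.empty() || nRow >= starts.back())
        return starts.empty() ? 0 : starts.size() - 1;
    // First start row strictly greater than nRow; the block before it
    // is the one that contains nRow.
    auto it = std::upper_bound(starts.begin(), starts.end(), nRow);
    return static_cast<std::size_t>(std::distance(starts.begin(), it)) - 1;
}
```

Both functions return the same block index for any row; the difference is only that the indexed variant pays to keep the start rows up to date, which is cheap for appends and expensive for inserts in the middle of the column.
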
Code pointers:

workdir/UnpackedTarball/mdds/include/mdds/multi_type_vector.hpp
sc/inc/mtvelements.hxx

typedef mdds::multi_type_vector<CellFunc, CellStoreEvent> CellStoreType;
CellStoreType::const_iterator itBlk = rSrc.begin(), itBlkEnd = rSrc.end();

The problem here is that:

struct block
{
    size_type m_size;
    element_block_type* mp_data;
};

So each block has no idea of its absolute position, which makes a lot of sense for cases such as mutating the sheet (inserting rows etc.). Otherwise we could store the absolute position in each block, and derive the size as the difference to the position of the next/terminating block.

From an optimizing trade-off perspective, optimizing for inserts in the middle of sheets at the expense of individual cell lookups, or of file filters appending data, is probably not deliberate.

Failing this, during very 'random' access patterns such as calculation, we could of course carry around some sort of lookup cache for the columns we commonly get data from.

Aron - any chance you can add some SAL_DEBUG() lines to that pivot sheet to print out all the arguments to ScColumn::GetCellValue, including the column's members nCol and nTab, and the nRow passed in? Hopefully that will give us a big log to see whether there are indeed any patterns, in at least this case.

And of course, this is probably a duplicate ticket =)

Thoughts?</pre> </div> </p> </body> </html>