Benchmark results on mdds::multi_type_vector

Fri Dec 13 22:38:01 UTC 2019

On 13.12.2019 05:43, Luboš Luňák wrote:
> On Friday 13 of December 2019, Kohei Yoshida wrote:
>> I just finished my benchmark testing on mdds::multi_type_vector, and
>> summarized my results in this blog post:
>> 
>> http://kohei.us/2019/12/12/benchmark-results-on-mdds-multi_type_vector/
>> 
>> Hopefully my findings and interpretations make sense.  In short, the
>> numbers look great.  The overhead of block shifting is a concern, but
>> I'm optimistic that this is going to be a non-issue for the most part.
> 
>  I'd really like to see benchmarks of Calc with this new mdds,
> especially to see how many regressions there will be, as I'm concerned
> whether it really would be worth it in reality.

Sure, I do share your concern, which is why I spent time designing and
implementing this benchmark: to get some answers to that very concern.

> You say that the vast majority of Calc performance problems are with
> updating cell values without shifting, but that makes sense because
> that's where the current bottleneck is. Once the bottleneck moves to
> shifting of cells, we may get a whole new slew of bugreports about that.

Sure, but that's just as much speculation as my own interpretation.  To
be fair, it is possible that you are right, and I am wrong.  But I did
provide my own interpretations of those numbers based on my own
experience and educated guesses.  I'm not claiming that I'm right, but
I am claiming that what I concluded in my post is my honest and,
hopefully, reasonably researched opinion.

> E.g. copy&paste of a column is very likely to hit a problem there, IIRC
> it internally results in a lot of shifting of cells.

Yes, which is why I ran the benchmarks: to get some numbers and more
clarity.

> 
>  One interpretation of the graphs may be that the change helps a lot
> at the cost of a regression in one place, but another possible
> interpretation is that the change brings an improvement that can
> already be mostly achieved using hints at the expense of a cost that
> cannot be alleviated. Moreover we did go over all the reported
> performance problems related to mdds some months back and fixed all of
> them (at least I'm not aware of any pending ones). So the real question
> for me is how many of real-world cases will be improved and worsened by
> this, which is why I'd like to see non-artificial benchmarks.

So, I'm a bit concerned about your use of the word "artificial" to
describe my benchmark, because that word implies that I somehow made
those numbers up.  Those are real numbers.  Now, the numbers will of
course be quite different if you measure entire Calc operations, which
include a whole bunch of other work, and I believe this is what you are
alluding to.  I do share your concern there.  But I thought it was
reasonable to draw the conclusions that I did, given that the I/O with
mdds::multi_type_vector does constitute a large part of Calc's cell
I/O.  Also, keep in mind that the rest of the Calc operations are
constant, and the only variable is the mdds portion.  On this point, I
believe it's not unreasonable to draw *some* conclusions based on the
numbers on mdds alone.

Having said that, you are of course free to draw your own, different 
conclusions.

>  BTW, have you considered using vector operations like SSE for the
> updates (either checking whether the compiler can employ them
> automatically or hand-writing them)?

Yes.  For one, I did look into e.g. OpenMP's auto SIMD support.  But its
support appeared to be very limited, and MSVC did not seem to support
it.  I also thought about hand-writing SIMD directly, and I am still
considering that as a future possibility (note that I'm not entirely
done with this work).  But I couldn't think of a good intrinsic to use,
especially since multi_type_vector uses an array of structures (AoS).
The SIMD intrinsics I know of are mostly not suitable for AoS.  If you
know of good SIMD intrinsics that may work for multi_type_vector, I
would be interested.
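
To illustrate the AoS point, here is a minimal C++ sketch.  It is not
the actual mdds code: the names block_aos, blocks_soa and the two
shift_positions_* functions are hypothetical, and the assumption is
that each block carries a position, a size and a data pointer.  It only
shows why a position update over interleaved fields is a strided
access, whereas a structure-of-arrays (SoA) layout keeps the positions
contiguous, which is the shape compilers can often auto-vectorize.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical AoS block header: position, size and data pointer are
// interleaved in memory, one struct per block.
struct block_aos
{
    std::size_t position;
    std::size_t size;
    void* data;
};

// Shift the positions of all blocks from index 'start' onward by 'delta'.
// Consecutive position fields are sizeof(block_aos) bytes apart, so this
// is a strided read-modify-write.
void shift_positions_aos(
    std::vector<block_aos>& blocks, std::size_t start, std::size_t delta)
{
    for (std::size_t i = start; i < blocks.size(); ++i)
        blocks[i].position += delta;
}

// Hypothetical SoA layout: one contiguous array per field.
struct blocks_soa
{
    std::vector<std::size_t> positions;
    std::vector<std::size_t> sizes;
    std::vector<void*> data;
};

// Same update over a contiguous array of positions.  A plain loop like
// this is a typical candidate for compiler auto-vectorization, though
// whether that happens depends on the compiler and its flags.
void shift_positions_soa(
    blocks_soa& blocks, std::size_t start, std::size_t delta)
{
    std::size_t* p = blocks.positions.data();
    const std::size_t n = blocks.positions.size();
    for (std::size_t i = start; i < n; ++i)
        p[i] += delta;
}
```

Both functions perform the same logical update; only the memory layout
differs.  Whether the SoA loop is actually faster in a given build would
need to be measured.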

I've done some SIMD coding in orcus to speed up XML and JSON parsing,
but I can't say I'm an expert at it, and I did not always manage to get
the code to run faster with SIMD.

Alright, since one person is now raising an objection to hastily
integrating this piece, I should hold off on integrating it for now,
and let the discussion continue.

And, while I would love to craft another benchmark test involving the
entire Calc piece, I'm afraid I won't have enough bandwidth to do that.
Even running this benchmark on mdds alone took me one month end-to-end.
It would be nice to have someone else chip in and conduct another, more
thorough and satisfactory benchmark test, if anybody is interested.

Thanks,

Kohei

-- 
Kohei Yoshida, LibreOffice Calc volunteer hacker