a few more comments ...
Michael Meeks
michael.meeks at collabora.com
Mon Jun 5 18:40:29 UTC 2017
Hi Mohammed,
Just a few more aesthetic comments =) on a nice patch.
+ size_t mnPos;
+ size_t mnStreamSize;
+
+ Buffer maInUseBuffer;
+ int mnOffset;
I'd love to see some doxygen comments:
size_t mnPos; /// position in stream
int mnOffset; /// position in maInUseBuffer
etc. added to this header =) makes it easier to read.
Also - since we wrote all this code ourselves - we don't need to use
the ALv2 variant header - lets just use the MPLv2 header from
TEMPLATE.SOURCECODE.HEADER in the top-level for the two new files.
Now I read it again, I'm rather skeptical that:
+ if( !maUsedBuffers.empty() )
+ {
+ pProducedBuffer = maUsedBuffers.front();
+ maUsedBuffers.pop();
+ }
is helpful. I'm not sure that we can re-use these buffers efficiently
here - perhaps it's better just to allocate new ones in the stream read
method; can you profile with and without that? (I'd love to get rid of
the extra complexity if possible.)
+class UnzippingThread: public salhelper::Thread
+{
+ XBufferedThreadedStream *mxStream;
I'd use a reference - to clarify the lifecycle: so "&mrStream" - but of
course, with std::thread we could use a lambda for this thing I guess,
and capture it ...
Gotcha - I think I found the bad bit which is this:
commit 4ae705d02df0ddf75b97d0e94add6994626f487e
Author: Kohei Yoshida <kohei.yoshida at collabora.com>
Date: Fri Jan 13 20:47:46 2017 -0500
tdf#97597: Ensure that each parsing thread has its own buffer.
The package/ code is (I think) thread-safe, but is not safe wrt. the
file pointer underlying it. I -think- it assumes that only one
thread is reading at a time.
Kohei's change here - I think is to make each stream fully de-compress
itself into memory before parsing it - which fixed some horrible calc
problem.
Hmm - so we're going to need to fix the root cause here I think, not
least because de-compressing the whole stream can be really very
expensive: ODF e.g. is -incredibly- verbose, with the XML taking up far
more memory than the equivalent calc data structures.
Sooo ... I think your code is golden; but we're going to need to fix
the package/ code to avoid the bug.
https://bugs.documentfoundation.org/show_bug.cgi?id=97597#c28
Suggests (input from Kohei appreciated) that we may need to do a new
'seek' each time we come to reading some bytes from the underlying
package/ stream - to ensure that no other thread is messing with our
underlying stream position while we are reading compressed data from it =)
I guess we should also write a unit test for this - one that is heavily
threaded: 4-8 threads doing back-to-back reads for quite some time
would probably catch this.
So - could you look into the package/ impl ?
Thanks !
Michael.
--
michael.meeks at collabora.com <><, Pseudo Engineer, itinerant idiot