libebook filter sniffing cost ...
Michael Meeks
michael.meeks at collabora.com
Thu Dec 12 02:02:57 PST 2013
Hi David & Fridrich,
Just doing some load time profiling, and I notice that the libebook
filter chews just under 3% of the load-time of (quite a large) XLSX
file ;-)
It seems the filter / sniffing / detection code there is particularly
problematic. I wonder if we need something like this:
git log -u -1 53138c9968e28a25a8cd6d2b5e3d31cbb3257852
to avoid thrashing the XStream read function? We do 52k 'read' calls
on the XStream, which is really not a fast interface to use for small
reads.
http://people.freedesktop.org/~michael/sheet-profile.txt
Has the profile there; compare EBookImportFilter::detect to
framework::LoadEnv::startLoading.
For thumbnailing we had a similar problem with reading strings, which
was improved but not fixed by:
commit d67cd21033877c9c09d9cc4f14c2c4658e973f57
Author: Mathieu Parent <mathieu.parent at nantesmetropole.fr>
Date: Mon Oct 14 22:23:05 2013 +0100
fdo#56007 - Read more bytes on Zip read (for thumbnails)
Particularly on remote file-systems we'd do many remote calls here -
which is really not ideal.
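To illustrate the kind of read batching meant here, a minimal sketch follows. It is not the code from either commit above: CountingStream is a made-up stand-in for an expensive stream (it is not the real XStream / XInputStream API), and BufferedReader and the 4096-byte chunk size are assumptions chosen for the example. The point is only that many tiny reads can be served from one larger read of the underlying stream.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an expensive stream (NOT the real XStream API).
// Every read() call is counted so the effect of buffering is visible.
struct CountingStream {
    std::vector<unsigned char> data;
    std::size_t pos = 0;
    std::size_t calls = 0;

    std::size_t read(unsigned char* out, std::size_t n) {
        ++calls;
        n = std::min(n, data.size() - pos);
        if (n != 0)
            std::memcpy(out, data.data() + pos, n);
        pos += n;
        return n;
    }
};

// Hypothetical wrapper: serves small reads from a larger in-memory chunk.
class BufferedReader {
public:
    explicit BufferedReader(CountingStream& s, std::size_t chunk = 4096)
        : m_stream(s), m_buf(chunk) {}

    std::size_t read(unsigned char* out, std::size_t n) {
        std::size_t done = 0;
        while (done < n) {
            if (m_off == m_len) {
                // Buffer exhausted: one large read instead of many small ones.
                m_len = m_stream.read(m_buf.data(), m_buf.size());
                m_off = 0;
                if (m_len == 0)
                    break; // end of stream
            }
            const std::size_t take = std::min(n - done, m_len - m_off);
            std::memcpy(out + done, m_buf.data() + m_off, take);
            m_off += take;
            done += take;
        }
        return done;
    }

private:
    CountingStream& m_stream;
    std::vector<unsigned char> m_buf;
    std::size_t m_off = 0;
    std::size_t m_len = 0;
};

// Reads a stream of `size` bytes one byte at a time through the buffer and
// returns how many read() calls reached the underlying stream.
std::size_t underlyingCallsForByteReads(std::size_t size, std::size_t chunk) {
    CountingStream s;
    s.data.assign(size, 0x42);
    BufferedReader r(s, chunk);
    unsigned char b = 0;
    while (r.read(&b, 1) == 1) {
    }
    return s.calls;
}
```

In this toy setup, 52000 one-byte reads through a 4096-byte buffer reach the underlying stream 14 times (13 chunks plus the final end-of-stream probe). Whether a similar chunk size suits the real filter detection path is something only profiling can tell.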
I've pushed a small patch to avoid some of the sillier reallocs.
Calling:
    template< class E >
    inline void Sequence< E >::realloc( sal_Int32 nSize )
    {
        const Type & rType = ::cppu::getTypeFavourUnsigned( this );
        sal_Bool success =
            ::uno_type_sequence_realloc(
                &_pSequence, rType.getTypeLibType(), nSize,
                (uno_AcquireFunc)cpp_acquire, (uno_ReleaseFunc)cpp_release );
        if (!success)
            throw ::std::bad_alloc();
    }
unconditionally, even when the sequence is already the same length,
seems particularly silly ;-) [ I assume that the WPXSvInputStream, by
keeping the sequence around, should save that allocation & be quite
efficient through a blizzard of identically sized reads anyhow ;-].
It makes me wonder whether the above should have a fast-path for
pointless reallocs to the same size though.
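A rough sketch of what such a fast path could look like, using a toy container rather than the real UNO Sequence: ToySequence, its realloc counter and the helper function are all invented for illustration. One caveat that is an assumption on my side: if callers rely on realloc also un-sharing a reference-counted sequence, a real fast path would have to preserve that behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a UNO-style sequence (NOT the real
// com::sun::star::uno::Sequence); it only models the size check.
template <class E>
class ToySequence {
public:
    std::size_t getLength() const { return m_data.size(); }

    void realloc(std::size_t nSize) {
        if (nSize == m_data.size())
            return; // fast path: same size, nothing to do
        ++m_reallocs; // counts the calls that do real work
        m_data.resize(nSize);
    }

    std::size_t reallocCount() const { return m_reallocs; }

private:
    std::vector<E> m_data;
    std::size_t m_reallocs = 0;
};

// Simulates a run of identically sized reads that each "realloc" the
// buffer first, and returns how many of those did any real work.
std::size_t reallocsForIdenticalReads(std::size_t nReads, std::size_t nSize) {
    ToySequence<unsigned char> seq;
    for (std::size_t i = 0; i < nReads; ++i)
        seq.realloc(nSize);
    return seq.reallocCount();
}
```

With the early return, a run of 52000 same-size requests does real work only once, on the first call.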
Thoughts appreciated though; is there some ordering of sniffing such
that we can prioritize common formats over less common ones? And has
libebook perhaps got into that stack too high up?
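To make the ordering question concrete, here is a small sketch of priority-ordered detection. It does not reflect how the actual type detection is configured: the Detector struct, the priority numbers, and the "EBOK" magic string are all made up for the example; only the idea of probing common formats first and stopping at the first match is the point.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical detector entry; lower priority values are probed first.
struct Detector {
    std::string name;
    int priority;
    std::function<bool(const std::string&)> matches;
};

struct DetectResult {
    std::string name; // empty if nothing matched
    int probes;       // how many detectors were asked
};

// Sorts detectors by priority and stops at the first one that matches.
DetectResult detect(std::vector<Detector> detectors, const std::string& header) {
    std::stable_sort(detectors.begin(), detectors.end(),
                     [](const Detector& a, const Detector& b) {
                         return a.priority < b.priority;
                     });
    int probes = 0;
    for (const Detector& d : detectors) {
        ++probes;
        if (d.matches(header))
            return {d.name, probes};
    }
    return {"", probes};
}

// Sample registry, deliberately listed in the "wrong" order. The magic
// strings are illustrative only ("EBOK" is invented).
std::vector<Detector> sampleDetectors() {
    return {
        {"ebook", 50,
         [](const std::string& h) { return h.rfind("EBOK", 0) == 0; }},
        {"zip-ooxml", 10,
         [](const std::string& h) { return h.rfind("PK", 0) == 0; }},
    };
}
```

Here a header starting with "PK" is claimed by the higher-priority zip detector after a single probe, so the rarer (and in this thread, more expensive) detector never runs for it.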
ATB,
Michael.
--
michael.meeks at collabora.com <><, Pseudo Engineer, itinerant idiot