You can easily reduce the ICU footprint by tweaking the ICU build options. See <a href="http://userguide.icu-project.org/packaging">http://userguide.icu-project.org/packaging</a><div>The result will look like this: <a href="http://site.icu-project.org/charts/icu4c-footprint">http://site.icu-project.org/charts/icu4c-footprint</a></div>
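<div><br>Before trimming anything, it may help to confirm which ICU API families the binaries actually reference, as the quoted message below does by hand. Here is a minimal sketch in Python, assuming the symbol names have already been collected (for example from <code>nm -D --undefined-only</code> output on Linux). The function name <code>count_icu_prefixes</code> and the sample list are hypothetical; the prefixes are the ones listed in the quoted message.</div>

```python
# Prefixes of the ICU C API families named in the quoted message.
ICU_PREFIXES = ("ucnv_", "ures_", "unorm_", "utrans_", "u_shapeArabic")


def count_icu_prefixes(symbols, prefixes=ICU_PREFIXES):
    """Count how many symbol names start with each given prefix.

    ICU appends a version suffix to exported symbols (such as _4_4),
    so a prefix match is used instead of an exact comparison.
    """
    counts = {prefix: 0 for prefix in prefixes}
    for name in symbols:
        for prefix in prefixes:
            if name.startswith(prefix):
                counts[prefix] += 1
                break  # each symbol is counted at most once
    return counts


# Hypothetical sample input: undefined symbols of one library.
sample = ["ucol_open_4_4", "ubrk_next_4_4", "uscript_getCode_4_4", "malloc"]

for prefix, hits in count_icu_prefixes(sample).items():
    print(f"{prefix}: {hits} reference(s)")
```

<div>A prefix that stays at zero across all shipped libraries is a candidate for removal from the internal ICU build. This check only sees link-time references; it says nothing about symbols looked up at run time.</div>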
<div><br><br><div class="gmail_quote">On Fri, Jan 14, 2011 at 7:27 PM, Michael Meeks <span dir="ltr">&lt;<a href="mailto:michael.meeks@novell.com">michael.meeks@novell.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">Hi there,<br>
<br>
On Fri, 2011-01-07 at 12:22 -0600, Norbert Thiebaud wrote:<br>
> > Which makes me wonder: do we really need everything that is in that<br>
> > beast ? pmap seems to suggest we use 84K out of the 13Mb on Linux:<br>
><br>
> Michael: icudata contains, among other things, all the supported<br>
> utf16&lt;-&gt;other-codepage conversions. If your locale is utf8 or iso8859-1/15<br>
> (which is most likely in your case) then sure, you just need one or two<br>
> of these conversion tables... if any at all (some conversions, like<br>
> utf8&lt;-&gt;utf16, are algorithmic)<br>
<br>
Sure. We have a patch for sal:<br>
<br>
<a href="http://cgit.freedesktop.org/libreoffice/build/tree/patches/dev300/size-sal-textenc.diff" target="_blank">http://cgit.freedesktop.org/libreoffice/build/tree/patches/dev300/size-sal-textenc.diff</a><br>
<br>
sadly still not merged, since it needs re-testing on win32 - that chops<br>
a megabyte of this off of sal (exactly the same text encoding conversion<br>
tables).<br>
<br>
> libicudata also contains stuff about collation and locales...<br>
<br>
Right - but it also seems that some (much?) of this data is not<br>
actually used :-) AFAICS we don't use the charset conversion data at<br>
all, preferring the sal stuff. There are whole fields of API that are<br>
simply not touched from ICU:<br>
<br>
'ucnv_' (char set conversion !?)<br>
'ures_'<br>
'unorm_'<br>
'utrans_'<br>
'u_shapeArabic'<br>
<br>
So - I suspect we could hack some big chunks of code, and data out of<br>
this: the data is the biggest evil size-wise from a distribution<br>
perspective I suspect: 5.5Mb compressed of our win32 download is data we<br>
don't use [ one of the bigger lumps of pointlessness there ].<br>
<br>
> either way libicudata is big, but there is not that much redundancy in<br>
> it. It just covers an insanely large number of code pages (<br>
<br>
sure sure :-) and we don't need that AFAICS, since we don't use the<br>
relevant APIs; and our internal ICU does not have to be a generic useful<br>
resource for abstract programs (particularly on Win32).<br>
<br>
So I added an easy hack here:<br>
<br>
<a href="http://wiki.documentfoundation.org/Development/Easy_Hacks#de-bloat_internal_ICU" target="_blank">http://wiki.documentfoundation.org/Development/Easy_Hacks#de-bloat_internal_ICU</a><br>
<br>
Thanks,<br>
<br>
Michael.<br>
<font color="#888888"><br>
--<br>
<a href="mailto:michael.meeks@novell.com">michael.meeks@novell.com</a> &lt;&gt;&lt;, Pseudo Engineer, itinerant idiot<br>
<br>
<br>
_______________________________________________<br>
LibreOffice mailing list<br>
<a href="mailto:LibreOffice@lists.freedesktop.org">LibreOffice@lists.freedesktop.org</a><br>
<a href="http://lists.freedesktop.org/mailman/listinfo/libreoffice" target="_blank">http://lists.freedesktop.org/mailman/listinfo/libreoffice</a><br>
</font></blockquote></div><br><br clear="all"><br>-- <br>_/|\_ Samphan Raruenrom. Open Source Development Co., Ltd.<br>Tel: +66 38 311816, Fax: +66 38 773128, <a href="http://www.osdev.co.th/">http://www.osdev.co.th/</a><br>
</div>