[Libreoffice-bugs] [Bug 131951] Quadratic time on reading and converting html files with images

bugzilla-daemon at bugs.documentfoundation.org bugzilla-daemon at bugs.documentfoundation.org
Sun Apr 12 14:49:19 UTC 2020


https://bugs.documentfoundation.org/show_bug.cgi?id=131951

--- Comment #8 from Pavel <klev.paul at gmail.com> ---
HTML::ScanText (svtools/source/svhtml/parthhtml) reads HTML token data into a
temporary buffer of up to MAX_LEN (=1024) characters and then appends the
buffer to the result string with +=.
Each append allocates memory and copies both the existing data and the new
data (memcpy).
Because the number of chunks is substantial, almost the same data is copied
again and again, so the total copying grows quadratically with the token
length.
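
To put a number on the repeated copying, here is a minimal model of the cost.
It is not LibreOffice code: the function name bytesCopiedExactFit is made up
for illustration, and it assumes the worst case where every += reallocates to
exactly the new length and copies the old contents plus the new chunk.

```cpp
#include <cstddef>

// Hypothetical model (not LibreOffice code): counts how many characters are
// copied in total when a string of totalLen characters is built by appending
// chunks of at most chunkLen characters (chunkLen must be > 0), assuming each
// append reallocates to the exact new size.
std::size_t bytesCopiedExactFit(std::size_t totalLen, std::size_t chunkLen)
{
    std::size_t copied = 0;
    std::size_t length = 0;
    while (length < totalLen)
    {
        const std::size_t remaining = totalLen - length;
        const std::size_t n = remaining < chunkLen ? remaining : chunkLen;
        copied += length + n; // old contents plus the new chunk
        length += n;
    }
    return copied;
}
```

For a 4096-character token read in 1024-character chunks this gives
1024 + 2048 + 3072 + 4096 = 10240 copied characters. With k chunks the sum is
chunkLen * k * (k + 1) / 2, which is where the quadratic time comes from.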

A possible solution would be to increase the buffer size each time it fills up
(1024, 2048, 4096...).
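
The doubling idea could look roughly like the sketch below. This is only an
illustration, not a patch for ScanText: the class GrowingBuffer and all of its
members are hypothetical, it works on plain char instead of the string types
used in the parser, and the bytesMoved counter exists only to show how little
data gets re-copied.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>

// Hypothetical helper (not part of LibreOffice): a character buffer whose
// capacity doubles (1024, 2048, 4096...) whenever it runs out of room.
class GrowingBuffer
{
public:
    void append(const char* data, std::size_t n)
    {
        if (m_length + n > m_capacity)
        {
            // Start at 1024 and double until the new data fits.
            std::size_t newCapacity = m_capacity ? m_capacity : 1024;
            while (newCapacity < m_length + n)
                newCapacity *= 2;

            std::unique_ptr<char[]> grown(new char[newCapacity]);
            if (m_length != 0)
                std::memcpy(grown.get(), m_data.get(), m_length);
            m_bytesMoved += m_length; // existing data copied on this growth
            m_data = std::move(grown);
            m_capacity = newCapacity;
        }
        if (n != 0)
            std::memcpy(m_data.get() + m_length, data, n);
        m_length += n;
    }

    const char* data() const { return m_data.get(); }
    std::size_t size() const { return m_length; }
    std::size_t capacity() const { return m_capacity; }
    std::size_t bytesMoved() const { return m_bytesMoved; }

private:
    std::unique_ptr<char[]> m_data;
    std::size_t m_length = 0;
    std::size_t m_capacity = 0;
    std::size_t m_bytesMoved = 0;
};
```

Appending eight 1024-character chunks grows the capacity 1024, 2048, 4096,
8192 and re-copies 1024 + 2048 + 4096 = 7168 existing characters in total.
Because the capacity doubles, the re-copied amount stays below twice the final
length, so the overall cost is linear instead of quadratic.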

-- 
You are receiving this mail because:
You are the assignee for the bug.