[Libreoffice-bugs] [Bug 125110] CalcSpreadsheet: issues converting .CSV where there are more than 30K rows of data
bugzilla-daemon at bugs.documentfoundation.org
Mon Aug 16 17:18:48 UTC 2021
https://bugs.documentfoundation.org/show_bug.cgi?id=125110
Eike Rathke <erack at redhat.com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 OS|Windows (All)               |All
           Hardware|x86-64 (AMD64)              |All
--- Comment #13 from Eike Rathke <erack at redhat.com> ---
(In reply to Mike Kaganski from comment #7)
> So I suppose that what should have been done here is:
> 1. Seeing the opening double quote in the beginning of the field, start
> "quote-enclosed field" mode.
> 2. If it encounters something *invalid* for such a mode, it should re-read
> the field again, this time without the "quote-enclosed field" mode (to
> properly re-consume possible field separators that could have been read in
> the first pass as the quoted field content).
>
> This way, this sample would be read properly, without introducing any
> ambiguity.
It would fail in other constellations that now are handled well, like
"abc "def" ghi, jkl"
where
|abc "def" ghi, jkl|
is supposed to be *one* field content because the generator didn't escape
quotes by doubling them. Your approach would result in
|"abc "def" ghi| jkl"|
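To make the difference concrete, here is a minimal Python sketch of the two strategies. It is not LibreOffice's import code; the function names (split_lenient, split_reread, _strict_quoted) are made up for illustration, and the "lenient" rule (a closing quote only counts if a separator or the end of the line follows) is an assumption that merely approximates the behaviour described above. The second input is a hypothetical line, not the sample from this bug.

```python
def split_lenient(line, sep=","):
    """Quoted field ends only at a quote followed by sep or end of line.

    Any other quote inside the field is kept as literal content.
    """
    fields, i, n = [], 0, len(line)
    while i <= n:
        if i < n and line[i] == '"':
            buf, j = [], i + 1
            while j < n:
                if line[j] == '"':
                    if j + 1 < n and line[j + 1] == '"':
                        buf.append('"')  # doubled quote: escaped quote
                        j += 2
                        continue
                    if j + 1 == n or line[j + 1] == sep:
                        break  # real closing quote
                buf.append(line[j])  # ordinary char or stray quote
                j += 1
            fields.append("".join(buf))
            i = j + 2  # skip closing quote and separator
        else:
            j = line.find(sep, i)
            if j == -1:
                j = n
            fields.append(line[i:j])
            i = j + 1
    return fields


def _strict_quoted(line, start, sep):
    """Parse a quoted field strictly; return (value, next_index) or None."""
    buf, j, n = [], start + 1, len(line)
    while j < n:
        if line[j] == '"':
            if j + 1 < n and line[j + 1] == '"':
                buf.append('"')
                j += 2
                continue
            if j + 1 == n or line[j + 1] == sep:
                return "".join(buf), j + 2
            return None  # stray quote: invalid for quoted mode
        buf.append(line[j])
        j += 1
    return None  # no closing quote found


def split_reread(line, sep=","):
    """On invalid quoted content, re-read the field as unquoted."""
    fields, i, n = [], 0, len(line)
    while i <= n:
        if i < n and line[i] == '"':
            parsed = _strict_quoted(line, i, sep)
            if parsed is not None:
                value, i = parsed
                fields.append(value)
                continue
        # Unquoted field, or fallback after a failed quoted parse.
        j = line.find(sep, i)
        if j == -1:
            j = n
        fields.append(line[i:j])
        i = j + 1
    return fields


sample = '"abc "def" ghi, jkl"'
print(split_lenient(sample))  # one field
print(split_reread(sample))   # split at the comma, quotes kept

hypothetical = '"5" pipe,10'
print(split_lenient(hypothetical))  # comma swallowed into the field
print(split_reread(hypothetical))   # two fields
```

With these rules the first input comes out as one field under the lenient parser and as two under the re-read parser, while the hypothetical second input behaves the other way round: the lenient parser swallows the separator and the re-read parser splits on it. Each rule repairs one kind of broken data and breaks the other.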
Whatever we do, it will make things fail differently for other data from
broken generators. You could throw more logic at it, like thinking in quoted
"words" to be ignored, where such inner quotes have to have a space to the left
(opening quote) or to the right (closing quote), but that would fail for data
that simply doesn't follow that assumption. Things get even worse if space is
a field separator.
Take a look at what is done with the field start mode and quote state to fix
known broken data cases, and at bug 48621 for test case sample files and
related issues.
I tend to close this as caused by a too-broken generator, but if you can come
up with some loose magic that doesn't break any of the already handled cases,
then fine.
--
You are receiving this mail because:
You are the assignee for the bug.