[Libreoffice-bugs] [Bug 36313] New: UTF-8 encoding problem when converting in headless mode

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Sat Apr 16 16:05:42 PDT 2011


https://bugs.freedesktop.org/show_bug.cgi?id=36313

           Summary: UTF-8 encoding problem when converting in headless
                    mode
           Product: LibreOffice
           Version: LibO 3.3.2 release
          Platform: Other
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Libreoffice
        AssignedTo: libreoffice-bugs at lists.freedesktop.org
        ReportedBy: vazoukos at gmail.com


Users can convert documents in headless mode. For example, you can
convert a CSV file to an ODS file from the command line.
The problem is that LibreOffice assumes, by default, that the input
file is encoded in ISO-8859-1, and there is no option yet to change
this. Text in documents that use any other encoding is corrupted.

HOW TO REPLICATE:
a. Create the following mytest.csv file in /tmp:

$ cat /tmp/mytest.csv
"First","Second"
"áéŕó","ṫřåiṅ"
$ _

b. Then convert with the following command line:

$ libreoffice -headless -convert-to ods mytest.csv
convert /tmp/mytest.csv -> /tmp/mytest.ods using OpenDocument
Spreadsheet Flat XML
Warning: at xsl:stylesheet on line 2 of
file:///usr/lib/libreoffice/basis3.3/share/xslt/odfflatxml/odfflatxmlexport.xsl:
Running an XSLT 1.0 stylesheet with an XSLT 2.0 processor
$ _

c. Finally inspect the generated mytest.ods:

<table:table-cell office:value-type="string">
<text:p>áéŕó</text:p>
</table:table-cell>
<table:table-cell office:value-type="string">
<text:p>ṫřåiṅ</text:p>

The text shows that the UTF-8 input was read as ISO-8859-1 and then
converted to UTF-8 again, which corrupts every non-ASCII character.
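
The effect described above can be sketched in a few lines of Python.
This is only an illustration of the suspected mechanism, assuming the
importer decodes the UTF-8 bytes as ISO-8859-1 and the exporter then
writes UTF-8; it is not LibreOffice code, and the function names are
hypothetical.

```python
# Minimal sketch of the suspected double encoding (not LibreOffice code).
# Function names are hypothetical and exist only for this illustration.

def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode the bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


def undo_misread(garbled: str) -> str:
    """Reverse the misreading: recover the bytes, decode them as UTF-8."""
    return garbled.encode("iso-8859-1").decode("utf-8")


# First cell of the second CSV row, written with escapes so that the
# exact code points are unambiguous.
original = "\u00e1\u00e9\u0155\u00f3"

garbled = misread_as_latin1(original)
stored = garbled.encode("utf-8")  # what the exported file would contain

print(len(original), len(garbled))      # each character became two
print(stored == original.encode("utf-8"))
print(undo_misread(garbled) == original)
```

Under this assumption each two-byte UTF-8 character turns into two
separate characters, and the damage is reversible only as long as the
garbled text is not edited further.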

WHAT SHOULD HAPPEN:
There should be an option to specify the encoding of the input file,
so that no forced conversion takes place.

RELEVANT LINKS:
http://listarchives.libreoffice.org/www/users/msg03444.html


