<html> <head> <base href="https://bugs.documentfoundation.org/"> </head> <body><span class="vcard"><a class="email" href="mailto:serval2412@yahoo.fr" title="Julien Nabet <serval2412@yahoo.fr>"> <span class="fn">Julien Nabet</span></a> </span> changed <a class="bz_bug_link bz_status_RESOLVED bz_closed" title="RESOLVED INVALID - Data in Visual FoxPro DBF is garbled" href="https://bugs.documentfoundation.org/show_bug.cgi?id=69744">bug 69744</a> <br> <table border="1" cellspacing="0" cellpadding="8"> <tr> <th>What</th> <th>Removed</th> <th>Added</th> </tr> <tr> <td style="text-align:right;">CC</td> <td> </td> <td>serval2412@yahoo.fr </td> </tr></table> <p> <div> <b><a class="bz_bug_link bz_status_RESOLVED bz_closed" title="RESOLVED INVALID - Data in Visual FoxPro DBF is garbled" href="https://bugs.documentfoundation.org/show_bug.cgi?id=69744#c11">Comment # 11</a> on <a class="bz_bug_link bz_status_RESOLVED bz_closed" title="RESOLVED INVALID - Data in Visual FoxPro DBF is garbled" href="https://bugs.documentfoundation.org/show_bug.cgi?id=69744">bug 69744</a> from <span class="vcard"><a class="email" href="mailto:serval2412@yahoo.fr" title="Julien Nabet <serval2412@yahoo.fr>"> <span class="fn">Julien Nabet</span></a> </span></b> <pre>Following recent dBase commits (see <a href="https://cgit.freedesktop.org/libreoffice/core/log/?qt=grep&q=dbase">https://cgit.freedesktop.org/libreoffice/core/log/?qt=grep&q=dbase</a>), the dbf files open with RTL_TEXTENCODING_IBM_866 (Russian MS-DOS code page 866).

hexdump of the file shows this:
0000000 0d30 1809 0001 0000 0148 0051 0000 0000
0000010 0000 0000 0000 0000 0000 0000 6500 0000
0000020 808d 8287 8d80 8588 0000 4300 0001 0000
0000030 0050 0004 0000 0000 0000 0000 0000 0000
0000040 000d 0000 0000 0000 0000 0000 0000 0000
0000050 0000 0000 0000 0000 0000 0000 0000 0000
*
0000140 0000 0000 0000 0000 d020 f1f3 eaf1 e9e8
0000150 f220 eae5 f2f1 2020 2020 2020 2020 2020
0000160 2020 2020 2020 2020 2020 2020 2020 2020
*
0000190 2020 2020 2020 2020 1a20
000019a

Let's read it as little-endian 16-bit words, so the first byte is 30, not 0d.
30 is the version and corresponds here to a Visual FoxPro file (see <a href="http://opengrok.libreoffice.org/xref/core/connectivity/source/inc/dbase/DTable.hxx#40">http://opengrok.libreoffice.org/xref/core/connectivity/source/inc/dbase/DTable.hxx#40</a>).
65 (in the second line) indicates RTL_TEXTENCODING_IBM_866.
The third line gives the field name and its field type; the "50" at the beginning of the fourth line indicates the field length (80 in decimal).

But then lines 7 and 8 (offsets 0000140 and 0000150) give the content of the record but nothing about its encoding.
So I don't know how LO could "guess" the encoding of the content except by testing the value ranges of charsets, e.g.:
d0 in <a href="https://www.ascii-codes.com/cp866.html">https://www.ascii-codes.com/cp866.html</a> gives "Box drawings up double and horizontal single"
d0 in <a href="http://www.iana.org/assignments/charset-reg/PTCP154">http://www.iana.org/assignments/charset-reg/PTCP154</a> gives "CYRILLIC CAPITAL LETTER ER"
But even with this, a user could want some non-Cyrillic characters (box drawings) in the content, and the guess would be wrong.

BTW, I would be interested in original dbf files in different versions (DB2, DB3, DB4... with memo, with sql, ... FoxPro, etc.) and encodings.</pre> </div> </p> <hr> <span>You are receiving this mail because:</span> <ul> <li>You are the assignee for the bug.</li> </ul> </body> </html>