[FriBidi] Invalid UTF-8 for Arabic
Yoann Roman
yroman at altalang.com
Fri Mar 6 14:55:48 PST 2009
Behdad Esfahbod wrote:
>> I'm using the trunk fribidi2 code, compiled with VS2003 on Windows
>> XP, from Python with the Pyfribidi extension, also compiled on VS
>> 2003.
>
> The first step to debug this is to make sure PyFriBidi is not the
> culprit. That is, can you reproduce the bug using C? If yes, please
> send the code here.
I'm no C expert, so I took a slightly different approach to pull
Pyfribidi out of the equation: I compiled fribidi.exe and used a hex
editor to check its output. It looks like the lost final byte may be a
Pyfribidi problem. This test did bring up another bug, though.
Attached is a zip with:
- arabic.input: the Arabic string straight out of Python. This will
  show up correctly in anything with bidi support (e.g., Notepad on
  Windows XP with Arabic support installed). There is no BOM.
- arabic.output: output from running bin\fribidi.exe --nopad
  arabic.input. No Python involved here.
- arabic-correct.png: a correct Word visual representation
- arabic-incorrect.png: what I get using arabic.output
If you open arabic.output in a hex editor, you'll see that bytes 5
through 7 contain the UTF-8 BOM sequence (EF BB BF). It looks like no
characters are missing, though.
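For anyone who would rather not open a hex editor, here is a rough
Python sketch of the same check. The helper names find_utf8_bom and
bom_offsets_in_file are made up for illustration and are not part of
FriBidi or Pyfribidi; the only assumption is that the file is read as
raw bytes.

```python
import codecs


def find_utf8_bom(data):
    """Return every 0-based byte offset where the UTF-8 BOM (EF BB BF) occurs."""
    offsets = []
    start = data.find(codecs.BOM_UTF8)
    while start != -1:
        offsets.append(start)
        # Keep searching after this hit in case the BOM appears more than once.
        start = data.find(codecs.BOM_UTF8, start + 1)
    return offsets


def bom_offsets_in_file(path):
    """Read a file as raw bytes (no decoding) and report BOM offsets."""
    with open(path, "rb") as f:
        return find_utf8_bom(f.read())


# Example use, assuming the attached files are in the current directory:
#   bom_offsets_in_file("arabic.input")
#   bom_offsets_in_file("arabic.output")
```

The offsets are counted from 0, so an offset of 4 corresponds to bytes
5 through 7 when counting from 1. An empty list means the file contains
no BOM sequence at all.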
Is this enough info to track this new issue down?
Thanks,
--
Yoann Roman
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Arabic.zip
Type: application/x-zip-compressed
Size: 11060 bytes
Desc: not available
Url : http://lists.freedesktop.org/archives/fribidi/attachments/20090306/8f0d3e00/attachment.bin