[FriBidi] Invalid UTF-8 for Arabic
Yoann Roman
yroman-fribidi at altalang.com
Mon Mar 9 10:33:59 PDT 2009
> I haven't used fribidi on Windows, but I just ran the fribidi
> executable on Linux on your input. The result differs a lot:
>
> $ hexdump -C arabic.output-unix
> 00000000 d8 9f d8 a7 d9 84 20 d8 a7 d8 b0 d8 a7 d9 85 d9
> 00000010 84 20 d9 88 d8 a3 20 d8 a7 d8 b0 d8 a7 d9 85 d9
> 00000020 84 20 d8 9f d9 85 d9 88 d9 8a 20 d9 84 d9 83 20
> 00000030 d8 b1 d9 88 d8 b7 d9 81 d9 84 d8 a7 20 d9 84 d9
> 00000040 88 d8 a7 d9 86 d8 aa d8 aa 20 d9 84 d9 87
> 0000004e
>
> compared to your output:
>
> $ hexdump -C arabic.output
> 00000000 d8 9f ef bb bb ef bb bf 20 ef ba 8d ef ba ab ef
> 00000010 ba 8e ef bb a4 ef bb 9f 20 ef bb ad ef ba 83 20
> 00000020 ef ba 8d ef ba ab ef ba 8e ef bb a4 ef bb 9f 20
> 00000030 d8 9f ef bb a1 ef bb ae ef bb b3 20 ef bb 9e ef
> 00000040 bb 9b 20 ef ba ad ef bb ae ef bb 84 ef bb 94 ef
> 00000050 bb 9f ef ba 8d 20 ef bb 9d ef bb ad ef ba 8e ef
> 00000060 bb a8 ef ba 98 ef ba 97 20 ef bb 9e ef bb ab
> 0000006f
>
> I'm almost sure the Unix output contains no invalid UTF-8 sequences,
> but the Windows output looks quite wrong.
>
> [snip]
Behdad,
Thanks for the response.
Your output is what I get from fribidi 0.10.9, but that version doesn't
do Arabic joining. Other than the BOM (U+FEFF, the ef bb bf bytes), the
Windows output from 0.19.1 is correct and matches what other bidi
programs not based on FriBidi produce.
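
For anyone who wants to check this without reading hex by eye, here is a minimal sketch in Python 3. It does not use fribidi. It only decodes the bytes from the two hexdumps quoted above as strict UTF-8 and sorts the resulting code points into rough groups. The names classify and summarize are made up for this example, and the grouping (base Arabic block U+0600..U+06FF, Arabic Presentation Forms-B U+FE70..U+FEFE, and U+FEFF on its own) is my own choice for illustration.

```python
from collections import Counter

# Bytes copied from the two hexdumps quoted earlier in this message.
UNIX_HEX = (
    "d8 9f d8 a7 d9 84 20 d8 a7 d8 b0 d8 a7 d9 85 d9 "
    "84 20 d9 88 d8 a3 20 d8 a7 d8 b0 d8 a7 d9 85 d9 "
    "84 20 d8 9f d9 85 d9 88 d9 8a 20 d9 84 d9 83 20 "
    "d8 b1 d9 88 d8 b7 d9 81 d9 84 d8 a7 20 d9 84 d9 "
    "88 d8 a7 d9 86 d8 aa d8 aa 20 d9 84 d9 87"
)
WINDOWS_HEX = (
    "d8 9f ef bb bb ef bb bf 20 ef ba 8d ef ba ab ef "
    "ba 8e ef bb a4 ef bb 9f 20 ef bb ad ef ba 83 20 "
    "ef ba 8d ef ba ab ef ba 8e ef bb a4 ef bb 9f 20 "
    "d8 9f ef bb a1 ef bb ae ef bb b3 20 ef bb 9e ef "
    "bb 9b 20 ef ba ad ef bb ae ef bb 84 ef bb 94 ef "
    "bb 9f ef ba 8d 20 ef bb 9d ef bb ad ef ba 8e ef "
    "bb a8 ef ba 98 ef ba 97 20 ef bb 9e ef bb ab"
)


def classify(cp):
    """Put a code point into one of the rough groups used in this sketch."""
    if cp == 0xFEFF:
        return "U+FEFF"
    if 0xFE70 <= cp <= 0xFEFE:
        return "presentation-form"  # Arabic Presentation Forms-B
    if 0x0600 <= cp <= 0x06FF:
        return "arabic"  # base Arabic block, i.e. unshaped letters
    return "other"


def summarize(hex_dump):
    """Decode as strict UTF-8 (raises on invalid input) and count the groups."""
    text = bytes.fromhex(hex_dump).decode("utf-8", errors="strict")
    return text, Counter(classify(ord(ch)) for ch in text)


if __name__ == "__main__":
    for name, dump in (("unix", UNIX_HEX), ("windows", WINDOWS_HEX)):
        text, counts = summarize(dump)
        print(name, dict(counts))
        print("  U+FEFF at index:", text.find("\ufeff"))
```

Both dumps decode without raising, so neither contains an invalid UTF-8 sequence. The Unix bytes stay entirely in the base Arabic block (no joining applied), while the Windows bytes are mostly presentation forms plus a single U+FEFF, which sits right after U+FEFB near the start.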
--
Yoann Roman