[FriBidi] Invalid UTF-8 for Arabic

Yoann Roman yroman-fribidi at altalang.com
Mon Mar 9 10:33:59 PDT 2009


> I haven't used fribidi on Windows, but I just ran the fribidi
> executable on Linux on your input.  The result differs a lot:
> 
> $ hexdump -C arabic.output-unix
> 00000000  d8 9f d8 a7 d9 84 20 d8  a7 d8 b0 d8 a7 d9 85 d9
> 00000010  84 20 d9 88 d8 a3 20 d8  a7 d8 b0 d8 a7 d9 85 d9
> 00000020  84 20 d8 9f d9 85 d9 88  d9 8a 20 d9 84 d9 83 20
> 00000030  d8 b1 d9 88 d8 b7 d9 81  d9 84 d8 a7 20 d9 84 d9
> 00000040  88 d8 a7 d9 86 d8 aa d8  aa 20 d9 84 d9 87
> 0000004e
> 
> compared to your output:
> 
> $ hexdump -C arabic.output
> 00000000  d8 9f ef bb bb ef bb bf  20 ef ba 8d ef ba ab ef
> 00000010  ba 8e ef bb a4 ef bb 9f  20 ef bb ad ef ba 83 20
> 00000020  ef ba 8d ef ba ab ef ba  8e ef bb a4 ef bb 9f 20
> 00000030  d8 9f ef bb a1 ef bb ae  ef bb b3 20 ef bb 9e ef
> 00000040  bb 9b 20 ef ba ad ef bb  ae ef bb 84 ef bb 94 ef
> 00000050  bb 9f ef ba 8d 20 ef bb  9d ef bb ad ef ba 8e ef
> 00000060  bb a8 ef ba 98 ef ba 97  20 ef bb 9e ef bb ab
> 0000006f
> 
> I'm almost sure the Unix output has no invalid UTF-8 sequences, but the
> Windows output looks wrong.
> 
> [snip]

Behdad,

Thanks for the response.

Your output is what I get from fribidi 0.10.9, but that version doesn't
do Arabic joining. Other than that BOM, the Windows output from 0.19.1
is correct and matches what other bidi programs not based on FriBidi
produce.

-- 
Yoann Roman


