[FriBidi] log2vis() misbehaving with Arabic text?
Behdad Esfahbod
behdad at behdad.org
Thu Oct 23 13:28:30 PDT 2014
On 14-10-23 12:35 PM, Philip Semanchuk wrote:
> On Tue, Oct 21, 2014 at 6:40 PM, Behdad Esfahbod <behdad at behdad.org> wrote:
>
> Hi Philip,
>
> Comments below.
>
> On 14-10-21 11:42 AM, Philip Semanchuk wrote:
> > log2vis() puts the Shadda in a different place than the BAR
> > (Better-Arabic-Reshaper):
> > log2vis: u'\ufe94\ufef4\u0651\ufe91\ufeab\ufe8e\ufe9f'
> > bar: u'\ufe94\u0651\ufef4\ufe91\ufeab\ufe8e\ufe9f'
> >
>
> Which one is correct depends on how you are going to use the results. The
> rule in question is written down here:
>
> http://www.unicode.org/reports/tr9/#L3
>
> If you want, for example, to output this sequence to a non-bidi-aware
> terminal, then the result that FriBidi is creating is correct and the BAR is
> incorrect.
>
> Looking at the BAR code, I'm much more confident in FriBidi being correct than
> in BAR.
>
>
> Hi Behdad,
> Thanks very much for the informative reply. I learn something new every day,
> including the fact that I have a lot to learn.
>
> I feel sure I should have a followup question but I need to experiment some
> more before I can ask it.
>
> I agree with your confidence in FriBidi over BAR. The latter is great for what
> it is, but I’m sure FriBidi sees more use and review.
Thanks Philip,
I also ported FriBidi's Arabic shaping to Python a while ago, if that's more
convenient:
https://github.com/behdad/pyarabicshaping
I think it may still be in use somewhere in a pipeline that delivers Google
Earth images to Arabic-speaking news channels...
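
For reference, the difference between the two outputs quoted above comes down
to the mark reordering described in rule L3. Below is a minimal Python sketch
of that step, not FriBidi's or BAR's actual code. The function name
apply_rule_l3 is hypothetical, and treating Unicode general category "Mn" as
"combining mark" is an assumption made for illustration only.

```python
import unicodedata


def apply_rule_l3(visual):
    """Move each run of nonspacing marks so it follows its base character.

    Hypothetical helper: assumes `visual` is already in visual order, so
    marks on right-to-left bases currently sit before those bases.
    """
    out = []
    pending = []  # marks seen before the base character they belong to
    for ch in visual:
        if unicodedata.category(ch) == "Mn":
            pending.append(ch)
        else:
            out.append(ch)
            # reversed() puts several stacked marks back in logical order
            out.extend(reversed(pending))
            pending = []
    out.extend(pending)  # marks with no following base are kept at the end
    return "".join(out)


# The two strings from the thread: BAR leaves the Shadda (U+0651) before
# its base, log2vis() places it after.
bar = "\ufe94\u0651\ufef4\ufe91\ufeab\ufe8e\ufe9f"
log2vis = "\ufe94\ufef4\u0651\ufe91\ufeab\ufe8e\ufe9f"
assert apply_rule_l3(bar) == log2vis
```

Applied to the BAR string, the sketch produces the log2vis() string, which is
the ordering a renderer needs if it expects marks to follow their base
characters.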
--
behdad
http://behdad.org/