[FriBidi] log2vis() misbehaving with Arabic text?

Philip Semanchuk osvenskan at gmail.com
Tue Oct 28 10:09:31 PDT 2014


On Tue, Oct 28, 2014 at 4:45 AM, Behdad Esfahbod <behdad at behdad.org> wrote:
> On 14-10-27 10:47 AM, Philip Semanchuk wrote:
>> On Mon, Oct 27, 2014 at 1:26 PM, Behdad Esfahbod <behdad at behdad.org> wrote:
>>> On 14-10-27 08:39 AM, Philip Semanchuk wrote:
>>>> I need to play around with it a little, though. For instance, I saw
>>>> one case where the PDF rendered an unprintable character where
>>>> log2vis() had inserted a ZWNBSP (0xfeff) into a string. Technically a
>>>> ZWNBSP should be harmless but...
>>>
>>> Right.  FriBidi inserts U+FEFF when it needs to delete a character slot.  The
>>> FriBidi user should either remove those from the stream or make sure they
>>> render to nothing.  That sounds like a ReportLab bug.
>>
>> Yes, one could also argue that it's my PDF viewer that's at fault.
>
> Not really.  The PDF viewer gets exact instructions about what to show...
> It's the PDF generator that decides.

I took your advice and tested U+200C in a PDF. Both Acrobat Reader and
my default PDF reader (Preview -- I'm on OS X) render it as a vertical
bar. That's a surprise; I thought it would either be invisible or render
as the standard "unprintable character" rectangle.

BTW someone else is having the same problem; here's a PNG:
http://www.princexml.com/forum/post/12999/attachment/strange_vertical_line.png

It's from this conversation:
http://www.princexml.com/forum/topic/2776/solaiman-lipi-font-in-bangla-is-not-being-rendered-properly

>> This is one of the things I need to experiment with.
>>
>> Removing ZWNBSP is easy enough. Is any other postprocessing needed
>> after calling log2vis()?
>
> Well, there are more characters that need to be hidden.  Check
> fribidi_remove_bidi_marks().  By mistake, that function is deprecated, but I
> don't have a replacement for it if I recall correctly.

I had read the documentation for fribidi_remove_bidi_marks() but I
didn't think it removed U+FEFF. Is this correct pseudo-code?

sentence = fribidi_log2vis(sentence)
sentence = fribidi_remove_bidi_marks(sentence)
sentence = sentence.replace(ZWNBSP, '')
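
To make the last step concrete, here is a minimal Python sketch of the
post-processing I have in mind. It assumes the visual string has already
come back from log2vis() through some binding. The helper name
strip_invisibles() and the set of characters it removes are my own
guesses for illustration, not what fribidi_remove_bidi_marks() is
documented to do.

```python
# Hypothetical helper: the character set below is an assumption made for
# illustration, not FriBidi's documented behaviour.
ZWNBSP = "\ufeff"  # ZERO WIDTH NO-BREAK SPACE, left behind in deleted slots

BIDI_MARKS = (
    "\u200e"  # LEFT-TO-RIGHT MARK
    "\u200f"  # RIGHT-TO-LEFT MARK
    "\u202a"  # LEFT-TO-RIGHT EMBEDDING
    "\u202b"  # RIGHT-TO-LEFT EMBEDDING
    "\u202c"  # POP DIRECTIONAL FORMATTING
    "\u202d"  # LEFT-TO-RIGHT OVERRIDE
    "\u202e"  # RIGHT-TO-LEFT OVERRIDE
)

# str.translate() deletes every code point that maps to None.
_STRIP_TABLE = {ord(ch): None for ch in BIDI_MARKS + ZWNBSP}


def strip_invisibles(visual):
    """Return the visual-order string without bidi marks or U+FEFF."""
    return visual.translate(_STRIP_TABLE)
```

U+200C is deliberately not in the table, since it can affect how
neighbouring letters join.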

I find the man pages for the fribidi functions helpful, but I can't
find documentation on how to use them together.

Thanks
Philip

