[FriBidi] Using fribidi to switch Arabic chars

Behdad Esfahbod behdad at behdad.org
Thu Jan 17 12:07:45 PST 2008


Hi Rob,

First, CC'ing the fribidi mailing list.  Please consider subscribing and
continuing discussion there.  To your issues:

Yes, a high-level document/HOWTO is one of the things that has been keeping
me from releasing the new code.  I haven't got around to writing it yet.  In
the meantime, there's this diagram that I sent to the list a few years
ago.  It looks current to me, though I haven't checked the API calls
exactly.  The ideas are the same, however.  ASCII art follows:

<blockquote>

  /* analyze */
  fribidi_get_bidi_types
  fribidi_get_par_embedding_levels

  /* shape */
  fribidi_get_joining_types
  fribidi_join_arabic
  fribidi_shape

  break lines

  /* reorder */
  fribidi_reorder_line

=========================

To draw the pipeline:

  in:paragraph_characters[]  in:par_base_dir
        |                       |
,-------+-----------------------|---------------.
|       |                       |               |
|       v                       |               v
|  fribidi_get_bidi_types()     |   fribidi_get_joining_types()
|       |                       |               |
|  bidi_types[]                 |         joining_types[]
|       |                       |               |
|       |                       |               |
|       +-----------------------|-------+-------|-------.
|       |                       |       |       |       |
|       v                       v       |       |       |
|  fribidi_get_par_embedding_levels()   |       |       |
|       |                       |       |       |       |
|       |    out:resolved_par_base_dir  |       |       |
|       |                               |       |       |
|       v                               |       |       |
|  [embedding_levels]                   |       |       |
|       |                               |       |       |
|       +---------------+-------.       |       |       |
|       |               |       |       |       |       |
|       |               |       v       v       v       |
|       |               |      fribidi_join_arabic()    |
|       |               |               |               |
|       |               |        arabic_props[]         |
|       |               |               |               |
'---.   |   ,---------------------------'               |
    |   |   |           |                               |
    v   v   v           |       ,-----------------------'
  fribidi_shape()       |       |
        |               |       |
  glyph_characters[]    |       |
        |               |       |
        |               |       |
        v               |       |
   break_lines()        |       |
        |               |       |
   line_glyphs[][]      |       |
        |               |       |
        |       ,-------'       |
        |       |               |
        |       |       ,-------'
        |       |       |
      (loop over each line)
        |       |       |
        v       v       v
      fribidi_reorder_line()
        |       |
        |       |
        | out:V_to_L_map[][]
        |
out:visual_line_glyphs[][]

=========================
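
To make the last box of the diagram concrete, here is a minimal sketch of
the idea behind the reorder step: given one line and its embedding levels,
reverse every run of characters at a given level or higher, going from the
highest level down to the lowest odd one (rule L2 of the Unicode
Bidirectional Algorithm).  This is not FriBidi's code, and the real
fribidi_reorder_line() handles more cases than this.  The name
sketch_reorder_line and its parameters are made up for illustration, and
uint32_t / uint8_t are assumed stand-ins for FriBidiChar / FriBidiLevel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of FriBidi.
 * Copies logical[] into visual[] in display order and fills v_to_l[]
 * with the logical index of each visual position.  All arrays hold
 * len entries; levels[] is indexed by logical position.
 */
void sketch_reorder_line(const uint32_t *logical, const uint8_t *levels,
                         size_t len, uint32_t *visual, size_t *v_to_l)
{
    int max_level = 0;
    int min_odd = 256;  /* above any uint8_t level: no odd level seen yet */

    assert(len == 0 || (logical && levels && visual && v_to_l));

    /* Start from logical order and find the range of levels to process. */
    for (size_t i = 0; i < len; i++) {
        visual[i] = logical[i];
        v_to_l[i] = i;
        if (levels[i] > max_level)
            max_level = levels[i];
        if ((levels[i] & 1) && levels[i] < min_odd)
            min_odd = levels[i];
    }

    /* Highest level first, down to the lowest odd level. */
    for (int level = max_level; level >= min_odd; level--) {
        size_t i = 0;
        while (i < len) {
            if (levels[i] < level) {
                i++;
                continue;
            }
            /* Found a maximal run at this level or higher: reverse it. */
            size_t lo = i;
            while (i < len && levels[i] >= level)
                i++;
            size_t hi = i - 1;
            while (lo < hi) {
                uint32_t c = visual[lo];
                size_t m = v_to_l[lo];
                visual[lo] = visual[hi];
                v_to_l[lo] = v_to_l[hi];
                visual[hi] = c;
                v_to_l[hi] = m;
                lo++;
                hi--;
            }
        }
    }
}
```

With levels 1,1,2,2,1 for the characters A,B,c,d,E, the visual order comes
out as E,c,d,B,A and the map as 4,2,3,1,0.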

The whole Arabic part is optional, needless to say.  The good
thing about this very low-level and verbose API is that you have
full control over the data flow and can change data between
steps, if need be.  For example, after getting the bidi types,
you can go on and change the types of newline characters to
make them line breaks, and of two consecutive newline characters
to make them paragraph separators.  I would like to make a few
changes though: I would like to make mirroring verbose too, such
that you call fribidi_get_mirroring on the str and pass the
result to fribidi_shape, but I'm not sure it's a good idea,
since what a hypothetical fribidi_get_mirroring returns
practically replaces the original string.  In other words,
mirroring is really part of the shaping process itself.  I'm not
sure whether a convenience layer on top to hide most of the
details is a good idea.  The important point about this API is
that you never need to loop over the string yourself; all the
looping is done in FriBidi functions.

</blockquote>
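
As an illustration of changing data between steps, the newline example
above might look roughly like this.  It is a sketch only: the enum values
are hypothetical stand-ins for the real bidi type constants in the FriBidi
headers, uint32_t is an assumed stand-in for FriBidiChar, and treating a
run of two or more newlines as paragraph separators is one reading of the
description.  It would run after the bidi types are filled in and before
the embedding levels are computed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real values come from the FriBidi headers. */
enum sketch_type { SKETCH_OTHER, SKETCH_LINE_SEP, SKETCH_PAR_SEP };

/*
 * Hypothetical helper, not part of FriBidi.
 * Rewrites the entries of types[] that correspond to newlines in str[]:
 * a lone newline becomes a line separator, a run of two or more
 * newlines becomes paragraph separators.  Other entries are untouched.
 */
void sketch_retype_newlines(const uint32_t *str, size_t len,
                            enum sketch_type *types)
{
    size_t i = 0;

    assert(len == 0 || (str && types));

    while (i < len) {
        if (str[i] != '\n') {
            i++;
            continue;
        }
        /* Measure the run of consecutive newlines starting here. */
        size_t start = i;
        while (i < len && str[i] == '\n')
            i++;
        enum sketch_type t =
            (i - start >= 2) ? SKETCH_PAR_SEP : SKETCH_LINE_SEP;
        for (size_t j = start; j < i; j++)
            types[j] = t;
    }
}
```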


As for log2vis, it should do all of the above except for line breaking.
That is, it is supposed to do Arabic shaping too.  No idea why it's not.
Many people have told me that this code, dropped in as a replacement
for the 0.10 versions, suddenly made mplayer render Arabic subtitles
correctly, and I have had other similar reports.  As for log2vis
itself, I was thinking about undeprecating it but clearly marking it as a
convenience function that only works for one-line paragraphs, and
possibly adding some convenience functions for multi-line paragraphs too.
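
On the pre-check in the quoted message below (any character above 0xff):
that test is safe, but it also fires for Greek, Cyrillic, CJK and so on.
A narrower test could look for code points in the blocks used by
right-to-left scripts.  A rough sketch, not taken from FriBidi: the
function names are made up, the ranges are a coarse approximation of the
Unicode right-to-left blocks, uint32_t is an assumed stand-in for
FriBidiChar, and explicit directional marks such as U+200F are
deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of FriBidi.
 * Coarse test for blocks whose characters are predominantly
 * right-to-left.  The ranges are approximate.
 */
bool sketch_is_rtl_block(uint32_t c)
{
    return (c >= 0x0590 && c <= 0x08FF)     /* Hebrew, Arabic, Syriac, ... */
        || (c >= 0xFB1D && c <= 0xFDFF)     /* Hebrew/Arabic presentation forms */
        || (c >= 0xFE70 && c <= 0xFEFC)     /* Arabic Presentation Forms-B */
        || (c >= 0x10800 && c <= 0x10FFF)   /* right-to-left scripts beyond the BMP */
        || (c >= 0x1E800 && c <= 0x1EFFF);
}

/* Returns true if any of the len code points falls in such a block. */
bool sketch_may_need_bidi(const uint32_t *str, size_t len)
{
    assert(len == 0 || str != NULL);

    for (size_t i = 0; i < len; i++) {
        if (sketch_is_rtl_block(str[i]))
            return true;
    }
    return false;
}
```

The 0xff check errs on the safe side, so it is still the better choice
when a missed string would be worse than some wasted work.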

Anyway, hope these help.

Regards,

behdad



On Thu, 2008-01-17 at 07:38 -0800, Rob Juergens wrote:
> Behdad,
> 
> I downloaded the fribidi2 package and built it.  Thank you very much.
> 
> I noticed that there is a "doc" directory, with man pages for each
> function, but there is no overall picture of how to use these routines.
> I was using fribidi_log2vis(), which worked great for Hebrew, but it
> only "flipped" the Arabic without doing the proper joining.  The docs
> say that fribidi_log2vis() is now deprecated, but I couldn't figure out
> what I should be using instead.
> 
> So, my question is, what the code should look like to do this?  Assume
> the following:
> 
> 1.	I have a null-terminated array of FriBidiChar "str" of length
> "n".
> 2.	I don't know if the string contains Arabic (or Hebrew) chars or
> not.
> 3.	I have already determined that the string may possibly need to
> be flipped by checking if any of the chars are greater than 0xff.
> 4.	The output string would be suitable for printing, using a
> suitable font (such as Arabic Transparent).
> 
> Ideally, I would like a call such as:
> 
> 	fribidi_boolean CheckAndFlipStr (
> 		const FriBidiChar *str_in,
> 		FriBidiStrIndex len_in,
> 		FriBidiChar **str_out_ptr,
> 		FriBidiStrIndex *len_out_ptr)
> 
> Where the function would return a null-terminated output string that was
> malloced and the pointer to it and its length was returned.  The return
> value would be TRUE if flipping was done and an output string was
> created, and FALSE if flipping was not needed (in which case the output
> string pointer would be set to NULL & the length set to 0).  I would
> assume that fribidi_malloc() would have been used to allocate the output
> string and that I should use fribidi_free() to free it.
> 
> Can you help me here?
> 
> Thanks for the help you have already given me, and for any help you can
> give me here.
> 
> Rob
> 
> -----Original Message-----
> From: Behdad Esfahbod [mailto:behdad.esfahbod at gmail.com] On Behalf Of
> Behdad Esfahbod
> Sent: Tuesday, January 15, 2008 3:36 PM
> To: Rob Juergens
> Subject: RE: Using fribidi to switch Arabic chars
> 
> Hi,
> 
> I made a release today just for you :).
> See:
> http://lists.freedesktop.org/archives/fribidi/2008-January/000512.html
> 
> behdad
> 
> On Tue, 2008-01-15 at 07:11 -0800, Rob Juergens wrote:
> > Hi Behdad,
> > 
> > Thanks for the reply.  I went to that site, but I could not figure out
> > how to download the tree (ideally as a tarball) or even individual
> > files.  The "download" button for a file just displays the file in my
> > browser.
> > 
> > What should I be doing here to get the source tree?
> > Thanks,
> > Rob Juergens
> > 
> > -----Original Message-----
> > From: Behdad Esfahbod [mailto:behdad.esfahbod at gmail.com] On Behalf Of
> > Behdad Esfahbod
> > Sent: Monday, January 14, 2008 9:02 PM
> > To: Rob Juergens
> > Subject: Re: Using fribidi to switch Arabic chars
> > 
> > Hi Rob,
> > 
> > For Arabic joining support in FriBidi, you need to check out and use the
> > fribidi2 module out of fribidi CVS.  There's no released version of that
> > codebase yet.  You can find more information on fribidi.org.
> > 
> > Regards,
> > 
> > behdad
> > 
> > On Mon, 2008-01-14 at 14:14 -0800, Rob Juergens wrote:
> > > Hello Behdad,
> > > 
> > >  
> > > 
> > > I am using the fribidi package (0.10.9), and it seems to work very
> > > well for both Hebrew and Arabic environments.  My problem is that the
> > > switched Arabic doesn't change chars that have to be written
> > > differently, that is, the characters in the switched strings do not
> > > "connect" properly.  Am I missing something that I should be doing?
> > > Currently, I am just using the fribidi_log2vis() call, with the
> > > "position_L_to_V_list", "position_V_to_L_list", and
> > > "embedding_level_list" pointers set to NULL.
> > > 
> > >  
> > > 
> > > Any advice you could give me would be appreciated.
> > > 
> > >  
> > > 
> > > Thanks,
> > > 
> > > Rob Juergens
> > > 
> > >  
> > > 
> > > 
-- 
behdad
http://behdad.org/

"Those who would give up Essential Liberty to purchase a little
 Temporary Safety, deserve neither Liberty nor Safety."
        -- Benjamin Franklin, 1759


