[Fribidi-discuss] Bugfix patches: --debug and remove_marks

Beni Cherniavsky cben at techunix.technion.ac.il
Fri May 23 07:12:03 EST 2003


While debugging a wrapper script [1]_ around fribidi for creating
examples of bidi in reStructureText [2]_ documents (I'll post one soon
;-), I bumped into a series of bugs in fribidi.

The script displayed embedding --levels and used --clean (because UAX9
doesn't define levels of explicit bidi codes, so I wanted to avoid
them in examples).  The levels were obviously wrong, e.g.:

$ fribidi --caprtl --showinput --nopad --levels --ltov --vtol
ABC_>def   => _>defCBA
6 5 4 0 1 2 3
3 4 5 6 2 1 0
1 1 1 2 2 2 2

$ fribidi --caprtl --showinput --nopad --levels --ltov --vtol --clean
ABC_>def   => defCBA
6 4 3 -1 0 1
4 5 6 2 1 0
1 1 2 2 2 2

Turns out the ltov is wrong too.  It seems that the intention in
`fribidi_remove_bidi_marks()` was to map to the original positions
(which makes sense) so I concluded that the ltov and vtol should be of
different lengths:


---> cut here <---

diff -ru -w ../fribidi-cvs/fribidi.c ./fribidi.c
--- ../fribidi-cvs/fribidi.c	2003-05-23 13:53:45.000000000 +0300
+++ ./fribidi.c	2003-05-23 16:12:01.000000000 +0300
@@ -1084,6 +1084,11 @@
 /*======================================================================
  *  fribidi_remove_bidi_marks() removes bidirectional marks, and returns
  *  the new length, updates each of other inputs if not NULL.
+ *
+ *  Note that position_from_this_list might become shorter but refers to
+ *  the orginal positions and position_to_this_list remains of the same
+ *  length but contains -1 values at positions of removed marks.
  *----------------------------------------------------------------------*/
 FRIBIDI_API FriBidiStrIndex
 fribidi_remove_bidi_marks (FriBidiEnv *fribidienv,
@@ -1128,7 +1133,7 @@
       DBG ("  Converting from_this list to to_this\n");
       for (i = 0; i < length; i++)
 	position_to_this_list[i] = -1;
-      for (i = 0; i < length; i++)
+      for (i = 0; i < j; i++)
 	position_to_this_list[position_from_this_list[i]] = i;
       DBG ("  Converting from_this list to to_this, Done\n");
     }

 /*======================================================================
diff -ru -w ../fribidi-cvs/fribidi_main.c ./fribidi_main.c
--- ../fribidi-cvs/fribidi_main.c	2003-05-23 13:53:16.000000000 +0300
+++ ./fribidi_main.c	2003-05-23 15:49:02.000000000 +0300
@@ -421,6 +421,7 @@
 					 visual, ltov, vtol, levels);
 	      if (log2vis)
 		{
+                  int orig_len = len; /* before removing explicit marks */

 		  if (show_input)
 		    printf ("%-*s => ", padding_width, S_);
@@ -513,7 +514,8 @@
 		      FriBidiStrIndex i;

 		      printf (nl_found);
-		      for (i = 0; i < len; i++)
+                      /* logical positions include explicit marks */
+		      for (i = 0; i < orig_len; i++)
 			printf ("%ld ", (long) ltov[i]);
 		      nl_found = "\n";
 		    }

---> cut here <---

This solves the ltov:

$ fribidi --caprtl --showinput --nopad --levels --ltov --vtol
ABC_>def   => _>defCBA
6 5 4 0 1 2 3
3 4 5 6 2 1 0
1 1 1 2 2 2 2
$ fribidi --caprtl --showinput --nopad --levels --ltov --vtol --clean
ABC_>def   => defCBA
5 4 3 -1 0 1 2
4 5 6 2 1 0
1 1 2 2 2 2

But not the levels.  A short investigation shows that the deleted
level is in position 0 - which is the visual position of the LRM,
whereas the levels list is in logical order!  So I'm not sure how to
solve this.

Also, when I started investigating, I first rebuilt with configure
--enable-debug.  This didn't compile!  A debug-only funtion was
missing the fribidienv parameter:


diff -ru -w ../fribidi-cvs/fribidi.c ./fribidi.c
--- ../fribidi-cvs/fribidi.c	2003-05-23 13:53:45.000000000 +0300
+++ ./fribidi.c	2003-05-23 16:12:01.000000000 +0300
@@ -566,7 +566,7 @@
 /* Here, only for test porpuses, we have assumed that a fribidi_string
    ends with a 0 character */
 static void
-print_bidi_string (FriBidiChar *str)
+print_bidi_string (FriBidiEnv *fribidienv, const FriBidiChar *str)
 {
   FriBidiStrIndex i;
   fprintf (stderr, "  Org. types : ");
@@ -751,7 +751,7 @@
   if (fribidi_debug_status (fribidienv))
     {
       print_types_re (type_rl_list);
-      print_bidi_string (str);
+      print_bidi_string (fribidienv, str);
       print_resolved_levels (type_rl_list);
       print_resolved_types (type_rl_list);
     }
@@ -950,7 +950,7 @@
 #ifdef DEBUG
   if (fribidi_debug_status (fribidienv))
     {
-      print_bidi_string (str);
+      print_bidi_string (fribidienv, str);
       print_resolved_levels (type_rl_list);
       print_resolved_types (type_rl_list);
     }


After fixing this and compiling, it still complained when called with
``-d`` that I didn't compile with DEBUG.  I traced this to <config.h>
missing from fribidi_env.c (fix below) but I wonder why it's included
in a fragile way in every file instead of doing this in e.g.
"fribidi_config.h"?


diff -ru -w ../fribidi-cvs/fribidi_env.c ./fribidi_env.c
--- ../fribidi-cvs/fribidi_env.c	2003-05-23 13:53:16.000000000 +0300
+++ ./fribidi_env.c	2003-05-23 14:28:14.000000000 +0300
@@ -45,6 +45,9 @@

 #include <stdlib.h>

+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif /* HAVE_CONFIG_H */
 #include "fribidi_env.h"

 /*======================================================================


Finally, there was this extra semicolon giving a warning:


diff -ru -w ../fribidi-cvs/fribidi_types.c ./fribidi_types.c
--- ../fribidi-cvs/fribidi_types.c	2003-05-23 13:53:16.000000000 +0300
+++ ./fribidi_types.c	2003-05-23 13:58:02.000000000 +0300
@@ -76,7 +76,7 @@
     default:
       return '?';
     }
-};
+}

 #endif


Another problem was with ``make check``.  It blatantly failed because
the output of fribidi --showinput depends on the size of the screen.
``COLUMNS=80 make check`` passed.  This is quite ridiculous.  I
suggest to make --showinput display in on another line, which would be
much more robust.  I also think the tests should include more than the
output, at least also the levels (then these --clean issues would go
unnoticed).  I also think each line could be prefixed by its type and
matching chars properly aligned, somewhat like my script currently
does::

:Logical:  ``THE <LRE>file "SOME name" wasn't<PDF> FOUND.``
:Base dir: R
:Levels:   ``1111     12222223333222222222222     2211111``
:Visual:   ``.DNUOF file "EMOS name" wasn't EHT``

(here you see again that the levels are wrong).  The ltov/vtol would
frequently not fit into a single char; perhaps base 36 should be
employed...  BTW, `char_from_level_array` allows for only 16 levels,
is this an artifact of lower limits in early TR9 versions?  I also saw
a refernce to a maximum of 15 in
<http://www.w3.org/TR/REC-CSS2/visuren.html#direction>.

-- 
Beni Cherniavsky <cben at users.sf.net>

.. [1] I still need to polish it but it already works and is extremely
       convenient (I bound C-c b in emacs to filtering the region
       through it in-place).  Current version at:
       http://www.technion.ac.il/~cben/python/rstbidi.py

.. [2] http://docutils.sourceforge.net/rst.html




More information about the FriBidi mailing list