[Fribidi-discuss] Bugfix patches: --debug and remove_marks
Beni Cherniavsky
cben at techunix.technion.ac.il
Fri May 23 07:12:03 EST 2003
While debugging a wrapper script [1]_ around fribidi for creating
examples of bidi in reStructureText [2]_ documents (I'll post one soon
;-), I bumped into a series of bugs in fribidi.
The script displayed embedding --levels and used --clean (because UAX9
doesn't define levels of explicit bidi codes, so I wanted to avoid
them in examples). The levels were obviously wrong, e.g.:
$ fribidi --caprtl --showinput --nopad --levels --ltov --vtol
ABC_>def => _>defCBA
6 5 4 0 1 2 3
3 4 5 6 2 1 0
1 1 1 2 2 2 2
$ fribidi --caprtl --showinput --nopad --levels --ltov --vtol --clean
ABC_>def => defCBA
6 4 3 -1 0 1
4 5 6 2 1 0
1 1 2 2 2 2
Turns out the ltov is wrong too. It seems that the intention in
`fribidi_remove_bidi_marks()` was to map to the original positions
(which makes sense) so I concluded that the ltov and vtol should be of
different lengths:
---> cut here <---
diff -ru -w ../fribidi-cvs/fribidi.c ./fribidi.c
--- ../fribidi-cvs/fribidi.c 2003-05-23 13:53:45.000000000 +0300
+++ ./fribidi.c 2003-05-23 16:12:01.000000000 +0300
@@ -1084,6 +1084,11 @@
/*======================================================================
* fribidi_remove_bidi_marks() removes bidirectional marks, and returns
* the new length, updates each of other inputs if not NULL.
+ *
+ * Note that position_from_this_list might become shorter but refers to
+ * the orginal positions and position_to_this_list remains of the same
+ * length but contains -1 values at positions of removed marks.
*----------------------------------------------------------------------*/
FRIBIDI_API FriBidiStrIndex
fribidi_remove_bidi_marks (FriBidiEnv *fribidienv,
@@ -1128,7 +1133,7 @@
DBG (" Converting from_this list to to_this\n");
for (i = 0; i < length; i++)
position_to_this_list[i] = -1;
- for (i = 0; i < length; i++)
+ for (i = 0; i < j; i++)
position_to_this_list[position_from_this_list[i]] = i;
DBG (" Converting from_this list to to_this, Done\n");
}
/*======================================================================
diff -ru -w ../fribidi-cvs/fribidi_main.c ./fribidi_main.c
--- ../fribidi-cvs/fribidi_main.c 2003-05-23 13:53:16.000000000 +0300
+++ ./fribidi_main.c 2003-05-23 15:49:02.000000000 +0300
@@ -421,6 +421,7 @@
visual, ltov, vtol, levels);
if (log2vis)
{
+ int orig_len = len; /* before removing explicit marks */
if (show_input)
printf ("%-*s => ", padding_width, S_);
@@ -513,7 +514,8 @@
FriBidiStrIndex i;
printf (nl_found);
- for (i = 0; i < len; i++)
+ /* logical positions include explicit marks */
+ for (i = 0; i < orig_len; i++)
printf ("%ld ", (long) ltov[i]);
nl_found = "\n";
}
---> cut here <---
This solves the ltov:
$ fribidi --caprtl --showinput --nopad --levels --ltov --vtol
ABC_>def => _>defCBA
6 5 4 0 1 2 3
3 4 5 6 2 1 0
1 1 1 2 2 2 2
$ fribidi --caprtl --showinput --nopad --levels --ltov --vtol --clean
ABC_>def => defCBA
5 4 3 -1 0 1 2
4 5 6 2 1 0
1 1 2 2 2 2
But not the levels. A short investigation shows that the deleted
level is in position 0 - which is the visual position of the LRM,
whereas the levels list is in logical order! So I'm not sure how to
solve this.
Also, when I started investigating, I first rebuilt with configure
--enable-debug. This didn't compile! A debug-only funtion was
missing the fribidienv parameter:
diff -ru -w ../fribidi-cvs/fribidi.c ./fribidi.c
--- ../fribidi-cvs/fribidi.c 2003-05-23 13:53:45.000000000 +0300
+++ ./fribidi.c 2003-05-23 16:12:01.000000000 +0300
@@ -566,7 +566,7 @@
/* Here, only for test porpuses, we have assumed that a fribidi_string
ends with a 0 character */
static void
-print_bidi_string (FriBidiChar *str)
+print_bidi_string (FriBidiEnv *fribidienv, const FriBidiChar *str)
{
FriBidiStrIndex i;
fprintf (stderr, " Org. types : ");
@@ -751,7 +751,7 @@
if (fribidi_debug_status (fribidienv))
{
print_types_re (type_rl_list);
- print_bidi_string (str);
+ print_bidi_string (fribidienv, str);
print_resolved_levels (type_rl_list);
print_resolved_types (type_rl_list);
}
@@ -950,7 +950,7 @@
#ifdef DEBUG
if (fribidi_debug_status (fribidienv))
{
- print_bidi_string (str);
+ print_bidi_string (fribidienv, str);
print_resolved_levels (type_rl_list);
print_resolved_types (type_rl_list);
}
After fixing this and compiling, it still complained when called with
``-d`` that I didn't compile with DEBUG. I traced this to <config.h>
missing from fribidi_env.c (fix below) but I wonder why it's included
in a fragile way in every file instead of doing this in e.g.
"fribidi_config.h"?
diff -ru -w ../fribidi-cvs/fribidi_env.c ./fribidi_env.c
--- ../fribidi-cvs/fribidi_env.c 2003-05-23 13:53:16.000000000 +0300
+++ ./fribidi_env.c 2003-05-23 14:28:14.000000000 +0300
@@ -45,6 +45,9 @@
#include <stdlib.h>
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif /* HAVE_CONFIG_H */
#include "fribidi_env.h"
/*======================================================================
Finally, there was this extra semicolon giving a warning:
diff -ru -w ../fribidi-cvs/fribidi_types.c ./fribidi_types.c
--- ../fribidi-cvs/fribidi_types.c 2003-05-23 13:53:16.000000000 +0300
+++ ./fribidi_types.c 2003-05-23 13:58:02.000000000 +0300
@@ -76,7 +76,7 @@
default:
return '?';
}
-};
+}
#endif
Another problem was with ``make check``. It blatantly failed because
the output of fribidi --showinput depends on the size of the screen.
``COLUMNS=80 make check`` passed. This is quite ridiculous. I
suggest to make --showinput display in on another line, which would be
much more robust. I also think the tests should include more than the
output, at least also the levels (then these --clean issues would go
unnoticed). I also think each line could be prefixed by its type and
matching chars properly aligned, somewhat like my script currently
does::
:Logical: ``THE <LRE>file "SOME name" wasn't<PDF> FOUND.``
:Base dir: R
:Levels: ``1111 12222223333222222222222 2211111``
:Visual: ``.DNUOF file "EMOS name" wasn't EHT``
(here you see again that the levels are wrong). The ltov/vtol would
frequently not fit into a single char; perhaps base 36 should be
employed... BTW, `char_from_level_array` allows for only 16 levels,
is this an artifact of lower limits in early TR9 versions? I also saw
a refernce to a maximum of 15 in
<http://www.w3.org/TR/REC-CSS2/visuren.html#direction>.
--
Beni Cherniavsky <cben at users.sf.net>
.. [1] I still need to polish it but it already works and is extremely
convenient (I bound C-c b in emacs to filtering the region
through it in-place). Current version at:
http://www.technion.ac.il/~cben/python/rstbidi.py
.. [2] http://docutils.sourceforge.net/rst.html
More information about the FriBidi
mailing list