[FriBidi-commit] fribidi/gen.tab/unidata ArabicShaping.txt, 1.1, 1.2 BidiMirroring.txt, 1.1.1.1, 1.2 ReadMe.txt, 1.1.1.1, 1.2 UnicodeData.txt, 1.1, 1.2

Behdad Esfahbod behdad at freedesktop.org
Tue Jun 7 00:31:15 PDT 2005


Update of /cvs/fribidi/fribidi/gen.tab/unidata
In directory gabe:/tmp/cvs-serv30029

Modified Files:
	ArabicShaping.txt BidiMirroring.txt ReadMe.txt UnicodeData.txt 
Log Message:
Unicode 4.1 character database update.


Index: ArabicShaping.txt
===================================================================
RCS file: /cvs/fribidi/fribidi/gen.tab/unidata/ArabicShaping.txt,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -d -r1.1 -r1.2
--- ArabicShaping.txt	31 May 2004 10:43:44 -0000	1.1
+++ ArabicShaping.txt	7 Jun 2005 07:31:13 -0000	1.2
@@ -1,8 +1,12 @@
-# ArabicShaping-4.0.1.txt
+# ArabicShaping-4.1.0.txt
+# Date: 2005-03-17, 15:21:00 PST [KW]
 #
 # This file is a normative contributory data file in the
 # Unicode Character Database.
 #
+# Copyright (c) 1991-2005 Unicode, Inc.
+# For terms of use, see http://www.unicode.org/terms_of_use.html
+#
 # This file defines the shaping classes for Arabic and Syriac
 # positional shaping, repeating in machine readable form the
 # information printed in Tables 8-3, 8-7, 8-8, 8-11, 8-12, and
@@ -17,21 +21,42 @@
 #   form, of an Arabic or Syriac character.
 # Field 1: gives a short schematic name for that character,
 #   abbreviated from the normative Unicode character name.
-# Field 2: defines the joining type
-#   R right-joining,
-#   L left-joining,
-#   D dual-joining,
-#   C join-causing
-#   U non-joining
-#   T transparent
+# Field 2: defines the joining type (property name: Joining_Type)
+#   R Right_Joining
+#   L Left_Joining
+#   D Dual_Joining
+#   C Join_Causing
+#   U Non_Joining
+#   T Transparent
 #       See the Arabic block description for more information on these types.
-# Field 3: defines the joining group.
+# Field 3: defines the joining group (property name: Joining_Group)
 #
+# The values of the joining group are based schematically on character
+# names. Where a schematic character name consists of two or more parts separated
+# by spaces, the formal Joining_Group property value, as specified in
+# PropertyValueAliases.txt, consists of the same name parts joined by
+# underscores. Hence, the entry:
+#
+#   0629; TEH MARBUTA; R; TEH MARBUTA
+#
+# corresponds to [Joining_Group = Teh_Marbuta].
+#
+# Note: For historical reasons, the property value [Joining_Group = Hamza_On_Heh_Goal]
+#   is anachronistically named. It used to apply to both of the following characters
+#   in earlier versions of the standard:
+#
+#   U+06C2 ARABIC LETTER HEH GOAL WITH HAMZA ABOVE
+#   U+06C3 ARABIC LETTER TEH MARBUTA GOAL
+#
+#   However, it currently applies only to U+06C3, and *not* to U+06C2.
+#   To avoid destabilizing existing Joining_Group property aliases, the
+#   value Hamza_On_Heh_Goal has not been changed, despite the fact that it
+#   no longer applies to Hamza On Heh Goal, but only to Teh Marbuta Goal.
 #
 # Note: Code points that are not explicitly listed in this file are
-# either of type T or U:
+# either of joining type T or U:
 #
-# - Those that not explicitly listed that are of General Category Mn or Cf
+# - Those that not explicitly listed that are of General Category Mn, Me, or Cf
 #   have joining type T.
 # - All others not explicitly listed have type U.
 #
@@ -46,11 +71,12 @@
 
 # Arabic characters
 
-0600; ARABIC NUMBER SIGN; U; <no shaping>
-0601; ARABIC SIGN SANAH; U; <no shaping>
-0602; ARABIC FOOTNOTE MARKER; U; <no shaping>
-0603; ARABIC SIGN SAFHA; U; <no shaping>
-0621; HAMZA; U; <no shaping>
+0600; ARABIC NUMBER SIGN; U; No_Joining_Group
+0601; ARABIC SIGN SANAH; U; No_Joining_Group
+0602; ARABIC FOOTNOTE MARKER; U; No_Joining_Group
+0603; ARABIC SIGN SAFHA; U; No_Joining_Group
+060B; AFGHANI SIGN; U; No_Joining_Group
+0621; HAMZA; U; No_Joining_Group
 0622; MADDA ON ALEF; R; ALEF
 0623; HAMZA ON ALEF; R; ALEF
 0624; HAMZA ON WAW; R; WAW
@@ -76,7 +102,7 @@
 0638; ZAH; D; TAH
 0639; AIN; D; AIN
 063A; GHAIN; D; AIN
-0640; TATWEEL; C; <no shaping>
+0640; TATWEEL; C; No_Joining_Group
 0641; FEH; D; FEH
 0642; QAF; D; QAF
 0643; KAF; D; KAF
@@ -92,7 +118,7 @@
 0671; HAMZAT WASL ON ALEF; R; ALEF
 0672; WAVY HAMZA ON ALEF; R; ALEF
 0673; WAVY HAMZA UNDER ALEF; R; ALEF
-0674; HIGH HAMZA; U; <no shaping>
+0674; HIGH HAMZA; U; No_Joining_Group
 0675; HIGH HAMZA ALEF; R; ALEF
 0676; HIGH HAMZA WAW; R; WAW
 0677; HIGH HAMZA WAW WITH DAMMA; R; WAW
@@ -145,7 +171,7 @@
 06A6; FEH WITH 4 DOTS ABOVE; D; FEH
 06A7; QAF WITH DOT ABOVE; D; QAF
 06A8; QAF WITH 3 DOTS ABOVE; D; QAF
-06A9; OPEN KAF; D; GAF
+06A9; KEHEH; D; GAF
 06AA; SWASH KAF; D; SWASH KAF
 06AB; KAF WITH RING; D; GAF
 06AC; KAF WITH DOT ABOVE; D; KAF
@@ -170,7 +196,7 @@
 06BF; HAH WITH MIDDLE 3 DOTS DOWNWARD AND DOT ABOVE; D; HAH
 06C0; HAMZA ON HEH; R; TEH MARBUTA
 06C1; HEH GOAL; D; HEH GOAL
-06C2; HAMZA ON HEH GOAL; R; HAMZA ON HEH GOAL
+06C2; HAMZA ON HEH GOAL; D; HEH GOAL
 06C3; TEH MARBUTA GOAL; R; HAMZA ON HEH GOAL
 06C4; WAW WITH RING; R; WAW
 06C5; WAW WITH BAR; R; WAW
@@ -189,7 +215,7 @@
 06D2; YEH BARREE; R; YEH BARREE
 06D3; HAMZA ON YEH BARREE; R; YEH BARREE
 06D5; AE; R; TEH MARBUTA
-06DD; ARABIC END OF AYAH; U; <no shaping>
+06DD; ARABIC END OF AYAH; U; No_Joining_Group
 06EE; DAL WITH INVERTED V; R; DAL
 06EF; REH WITH INVERTED V; R; REH
 06FA; SEEN WITH DOT BELOW AND 3 DOTS ABOVE; D; SEEN
@@ -234,7 +260,40 @@
 074E; SOGDIAN KHAPH; D; KHAPH
 074F; SOGDIAN FE; D; FE
 
+# Arabic supplement characters
+
+0750; BEH WITH 3 DOTS HORIZONTALLY BELOW; D; BEH
+0751; BEH WITH DOT BELOW AND 3 DOTS ABOVE; D; BEH
+0752; BEH WITH 3 DOTS POINTING UPWARDS BELOW; D; BEH
+0753; BEH WITH 3 DOTS POINTING UPWARDS BELOW AND 2 DOTS ABOVE; D; BEH
+0754; BEH WITH 2 DOTS BELOW AND DOT ABOVE; D; BEH
+0755; BEH WITH INVERTED SMALL V BELOW; D; BEH
+0756; BEH WITH SMALL V; D; BEH
+0757; HAH WITH 2 DOTS ABOVE; D; HAH
+0758; HAH WITH 3 DOTS POINTING UPWARDS BELOW; D; HAH
+0759; DAL WITH 2 DOTS VERTICALLY BELOW AND SMALL TAH; R; DAL
+075A; DAL WITH INVERTED SMALL V BELOW; R; DAL
+075B; REH WITH STROKE; R; REH
+075C; SEEN WITH 4 DOTS ABOVE; D; SEEN
+075D; AIN WITH 2 DOTS ABOVE; D; AIN
+075E; AIN WITH 3 DOTS POINTING DOWNWARDS ABOVE; D; AIN
+075F; AIN WITH 2 DOTS VERTICALLY ABOVE; D; AIN
+0760; FEH WITH 2 DOTS BELOW; D; FEH
+0761; FEH WITH 3 DOTS POINTING UPWARDS BELOW; D; FEH
+0762; KEHEH WITH DOT ABOVE; D; GAF
+0763; KEHEH WITH 3 DOTS ABOVE; D; GAF
+0764; KEHEH WITH 3 DOTS POINTING UPWARDS BELOW; D; GAF
+0765; MEEM WITH DOT ABOVE; D; MEEM
+0766; MEEM WITH DOT BELOW; D; MEEM
+0767; NOON WITH 2 DOTS BELOW; D; NOON
+0768; NOON WITH SMALL TAH; D; NOON
+0769; NOON WITH SMALL V; D; NOON
+076A; LAM WITH BAR; D; LAM
+076B; REH WITH 2 DOTS VERTICALLY ABOVE; R; REH
+076C; REH WITH HAMZA ABOVE; R; REH
+076D; SEEN WITH 2 DOTS VERTICALLY ABOVE; D; SEEN
+
 # Other
 
-200D; ZERO WIDTH JOINER; C; <no shaping>
-200C; ZERO WIDTH NON-JOINER; U; <no shaping>
+200D; ZERO WIDTH JOINER; C; No_Joining_Group
+200C; ZERO WIDTH NON-JOINER; U; No_Joining_Group
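
The updated ArabicShaping.txt header describes three things: the four-field record layout, the mapping from schematic joining group names to formal Joining_Group values, and the default joining type for code points that are not listed. A minimal Python sketch of those rules follows. The function names are hypothetical and are not part of FriBidi, and the fallback uses Python's bundled unicodedata module, which reflects whatever Unicode version the interpreter ships rather than 4.1.

```python
import unicodedata


def parse_arabic_shaping(lines):
    """Return {code point: (schematic name, joining type, joining group)}."""
    table = {}
    for line in lines:
        # Drop comments and blank lines; data lines carry four ';'-separated fields.
        data = line.split("#", 1)[0].strip()
        if not data:
            continue
        code, name, jtype, jgroup = (field.strip() for field in data.split(";"))
        table[int(code, 16)] = (name, jtype, jgroup)
    return table


def joining_group_alias(group):
    """Turn a schematic group name such as 'TEH MARBUTA' into 'Teh_Marbuta'.

    Values that are already in formal form (for example 'No_Joining_Group')
    are returned unchanged.
    """
    if not group.isupper():
        return group
    return "_".join(part.capitalize() for part in group.split())


def joining_type(cp, table):
    """Joining type of a code point, applying the header's default rule."""
    if cp in table:
        return table[cp][1]
    # Unlisted code points: Mn, Me and Cf are transparent (T), all others U.
    # Assumption: the interpreter's General Category data stands in for 4.1.
    category = unicodedata.category(chr(cp))
    return "T" if category in ("Mn", "Me", "Cf") else "U"
```

Feeding it the two sample records "0629; TEH MARBUTA; R; TEH MARBUTA" and "0640; TATWEEL; C; No_Joining_Group" gives joining types R and C for the listed code points, while an unlisted combining mark such as U+064B falls back to T.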

Index: BidiMirroring.txt
===================================================================
RCS file: /cvs/fribidi/fribidi/gen.tab/unidata/BidiMirroring.txt,v
retrieving revision 1.1.1.1
retrieving revision 1.2
diff -u -d -r1.1.1.1 -r1.2
--- BidiMirroring.txt	25 Apr 2004 18:47:57 -0000	1.1.1.1
+++ BidiMirroring.txt	7 Jun 2005 07:31:13 -0000	1.2
@@ -1,10 +1,18 @@
-# BidiMirroring-4.0.0.txt
+# BidiMirroring-4.1.0.txt
+# Date: 2005-03-17, 15:21:00 PST [KW]
+#
+# Bidi_Mirroring_Glyph Property
 # 
-# This file is an informative supplement to the UnicodeData file. It
-# lists characters that have the mirrored property
+# This file is an informative contributory data file in the
+# Unicode Character Database.
+#
+# Copyright (c) 1991-2005 Unicode, Inc.
+# For terms of use, see http://www.unicode.org/terms_of_use.html
+#
+# This data file lists characters that have the mirrored property
 # where there is another Unicode character that typically has a glyph
 # that is the mirror image of the original character's glyph.
-# The repertoire covered by the file is Unicode 4.0.0.
+# The repertoire covered by the file is Unicode 4.1.0.
 # 
 # The file contains a list of lines with mappings from one code point
 # to another one for character-based mirroring.
@@ -24,7 +32,7 @@
 # at http://www.unicode.org/unicode/reports/tr9/
 # 
 # This file was originally created by Markus Scherer.
-# Extended for Unicode 3.2 and 4.0 by Ken Whistler.
+# Extended for Unicode 3.2, 4.0, and 4.1 by Ken Whistler.
 # 
 # ############################################################
 
@@ -180,6 +188,10 @@
 2773; 2772 # LIGHT RIGHT TORTOISE SHELL BRACKET
 2774; 2775 # MEDIUM LEFT CURLY BRACKET ORNAMENT
 2775; 2774 # MEDIUM RIGHT CURLY BRACKET ORNAMENT
+27C3; 27C4 # OPEN SUBSET
+27C4; 27C3 # OPEN SUPERSET
+27C5; 27C6 # LEFT S-SHAPED BAG DELIMITER
+27C6; 27C5 # RIGHT S-SHAPED BAG DELIMITER
 27D5; 27D6 # LEFT OUTER JOIN
 27D6; 27D5 # RIGHT OUTER JOIN
 27DD; 27DE # LONG RIGHT TACK
@@ -238,7 +250,7 @@
 29FD; 29FC # RIGHT-POINTING CURVED ANGLE BRACKET
 2A2B; 2A2C # MINUS SIGN WITH FALLING DOTS
 2A2C; 2A2B # MINUS SIGN WITH RISING DOTS
-2A2D; 2A2C # PLUS SIGN IN LEFT HALF CIRCLE
+2A2D; 2A2E # PLUS SIGN IN LEFT HALF CIRCLE
 2A2E; 2A2D # PLUS SIGN IN RIGHT HALF CIRCLE
 2A34; 2A35 # MULTIPLICATION SIGN IN LEFT HALF CIRCLE
 2A35; 2A34 # MULTIPLICATION SIGN IN RIGHT HALF CIRCLE
@@ -316,6 +328,16 @@
 2AF8; 2AF7 # TRIPLE NESTED GREATER-THAN
 2AF9; 2AFA # DOUBLE-LINE SLANTED LESS-THAN OR EQUAL TO
 2AFA; 2AF9 # DOUBLE-LINE SLANTED GREATER-THAN OR EQUAL TO
+2E02; 2E03 # LEFT SUBSTITUTION BRACKET
+2E03; 2E02 # RIGHT SUBSTITUTION BRACKET
+2E04; 2E05 # LEFT DOTTED SUBSTITUTION BRACKET
+2E05; 2E04 # RIGHT DOTTED SUBSTITUTION BRACKET
+2E09; 2E0A # LEFT TRANSPOSITION BRACKET
+2E0A; 2E09 # RIGHT TRANSPOSITION BRACKET
+2E0C; 2E0D # LEFT RAISED OMISSION BRACKET
+2E0D; 2E0C # RIGHT RAISED OMISSION BRACKET
+2E1C; 2E1D # LEFT LOW PARAPHRASE BRACKET
+2E1D; 2E1C # RIGHT LOW PARAPHRASE BRACKET
 3008; 3009 # LEFT ANGLE BRACKET
 3009; 3008 # RIGHT ANGLE BRACKET
 300A; 300B # LEFT DOUBLE ANGLE BRACKET
@@ -347,7 +369,9 @@
 FF62; FF63 # [BEST FIT] HALFWIDTH LEFT CORNER BRACKET
 FF63; FF62 # [BEST FIT] HALFWIDTH RIGHT CORNER BRACKET
 
-# The following characters have no appropriate mirroring character
+# The following characters have no appropriate mirroring character.
+# For these characters it is up to the rendering system
+#   to provide mirrored glyphs.
 
 # 2140; DOUBLE-STRUCK N-ARY SUMMATION
 # 2201; COMPLEMENT
@@ -410,6 +434,7 @@
 # 22FF; Z NOTATION BAG MEMBERSHIP
 # 2320; TOP HALF INTEGRAL
 # 2321; BOTTOM HALF INTEGRAL
+# 27C0; THREE DIMENSIONAL ANGLE
 # 27D3; LOWER RIGHT CORNER WITH DOT
 # 27D4; UPPER LEFT CORNER WITH DOT
 # 27DC; LEFT MULTIMAP
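
The BidiMirroring.txt hunks above include one correction (2A2D now maps to 2A2E instead of 2A2C) next to the new pairs. The sketch below, in Python with hypothetical function names, parses the "code; mirror # name" lines and reports entries whose mapping is not mirrored back. It assumes every mapping in the file is meant to be symmetric, which holds for the entries shown here but is not stated by the file itself.

```python
def parse_bidi_mirroring(lines):
    """Return {code point: mirrored code point} from BidiMirroring.txt lines."""
    pairs = {}
    for line in lines:
        # Everything after '#' is a comment; this also skips the commented-out
        # "no appropriate mirroring character" entries.
        data = line.split("#", 1)[0].strip()
        if not data:
            continue
        source, mirror = (int(field.strip(), 16) for field in data.split(";"))
        pairs[source] = mirror
    return pairs


def asymmetric_entries(pairs):
    """Code points whose mirror does not map back to them."""
    return sorted(cp for cp, mirror in pairs.items() if pairs.get(mirror) != cp)
```

Run against the 4.0.0 lines for 2A2B through 2A2E, the check flags 2A2D and 2A2E; with the corrected 4.1.0 line it reports nothing.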

Index: ReadMe.txt
===================================================================
RCS file: /cvs/fribidi/fribidi/gen.tab/unidata/ReadMe.txt,v
retrieving revision 1.1.1.1
retrieving revision 1.2
diff -u -d -r1.1.1.1 -r1.2
--- ReadMe.txt	25 Apr 2004 18:47:57 -0000	1.1.1.1
+++ ReadMe.txt	7 Jun 2005 07:31:13 -0000	1.2
@@ -1,40 +1,27 @@
-2004 March 30
-
-This directory contains the Unicode Character Database
-data files.
-
-Currently, the Unicode Character Database files are at
-the version level:
-
-   Unicode Standard, Version 4.0.1
-
-For information about the standard itself, see:
-
-http://www.unicode.org/versions/Unicode4.0.1/
+2005 March 30
 
-Detailed documentation of the files constituting the
-Unicode Character Database (contributory data files for
-the standard itself) can be found in UCD.html.
+Welcome to the Unicode Character Database
 
-Unihan.txt is a very large file. A zipped version is
-also provided for downloading convenience: Unihan.zip.
+This directory contains contributory data files
+for the Unicode Character Database of the Unicode Standard. 
 
-The current Unicode 4.0.1 version of Unihan.txt is also available in
-two compressed formats in the Unicode 4.0.1 update directory. See:
-http://www.unicode.org/Public/4.0-Update1/
-or
-ftp://ftp.unicode.org/Public/4.0-Update1/
+Copyright (c) 1991-2005 Unicode, Inc.
+For terms of use, see http://www.unicode.org/terms_of_use.html
 
-Unihan-4.0.1.zip for Windows. (Use winzip)
-Unihan-4.0.1.txt.gz  for Unix.    (Use gzip or gunzip)
+For an overview of how to access a specific version of 
+the Unicode Character Database (UCD) and other information, see:
 
-Note that the files are zipped in
-exactly the same format they have on the server (with Unix
-line endings). From a browser, right-clicking on 
-Unihan.zip will allow automatic download and unzip on a
-Windows system with winzip installed.
+http://www.unicode.org/ucd/
 
+If you accessed this file via the URL:
 
+http://www.unicode.org/Public/UNIDATA/ReadMe.txt
 
+then you are looking at the most current version of the UCD. 
+Otherwise the version number of the UCD is part of the path name. 
 
+The file UCD.html in this directory, as well as any file 
+headers, where present, also identify the version of the UCD.
 
+=== added ===
+Unicode Standard, Version 4.1.0

Index: UnicodeData.txt
===================================================================
RCS file: /cvs/fribidi/fribidi/gen.tab/unidata/UnicodeData.txt,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -d -r1.1 -r1.2
--- UnicodeData.txt	31 May 2004 10:43:44 -0000	1.1
+++ UnicodeData.txt	7 Jun 2005 07:31:13 -0000	1.2
@@ -408,7 +408,7 @@
 0197;LATIN CAPITAL LETTER I WITH STROKE;Lu;0;L;;;;;N;LATIN CAPITAL LETTER BARRED I;;;0268;
 0198;LATIN CAPITAL LETTER K WITH HOOK;Lu;0;L;;;;;N;LATIN CAPITAL LETTER K HOOK;;;0199;
 0199;LATIN SMALL LETTER K WITH HOOK;Ll;0;L;;;;;N;LATIN SMALL LETTER K HOOK;;0198;;0198
-019A;LATIN SMALL LETTER L WITH BAR;Ll;0;L;;;;;N;LATIN SMALL LETTER BARRED L;;;;
+019A;LATIN SMALL LETTER L WITH BAR;Ll;0;L;;;;;N;LATIN SMALL LETTER BARRED L;;023D;;023D
 019B;LATIN SMALL LETTER LAMBDA WITH STROKE;Ll;0;L;;;;;N;LATIN SMALL LETTER BARRED LAMBDA;;;;
 019C;LATIN CAPITAL LETTER TURNED M;Lu;0;L;;;;;N;;;;026F;
 019D;LATIN CAPITAL LETTER N WITH LEFT HOOK;Lu;0;L;;;;;N;LATIN CAPITAL LETTER N HOOK;;;0272;
@@ -565,6 +565,17 @@
 0234;LATIN SMALL LETTER L WITH CURL;Ll;0;L;;;;;N;;;;;
 0235;LATIN SMALL LETTER N WITH CURL;Ll;0;L;;;;;N;;;;;
[...1829 lines suppressed...]
+1D23E;GREEK INSTRUMENTAL NOTATION SYMBOL-51;So;0;ON;;;;;N;;;;;
+1D23F;GREEK INSTRUMENTAL NOTATION SYMBOL-52;So;0;ON;;;;;N;;;;;
+1D240;GREEK INSTRUMENTAL NOTATION SYMBOL-53;So;0;ON;;;;;N;;;;;
+1D241;GREEK INSTRUMENTAL NOTATION SYMBOL-54;So;0;ON;;;;;N;;;;;
+1D242;COMBINING GREEK MUSICAL TRISEME;Mn;230;NSM;;;;;N;;;;;
+1D243;COMBINING GREEK MUSICAL TETRASEME;Mn;230;NSM;;;;;N;;;;;
+1D244;COMBINING GREEK MUSICAL PENTASEME;Mn;230;NSM;;;;;N;;;;;
+1D245;GREEK MUSICAL LEIMMA;So;0;ON;;;;;N;;;;;
 1D300;MONOGRAM FOR EARTH;So;0;ON;;;;;N;;;;;
 1D301;DIGRAM FOR HEAVENLY EARTH;So;0;ON;;;;;N;;;;;
 1D302;DIGRAM FOR HUMAN EARTH;So;0;ON;;;;;N;;;;;
@@ -13873,6 +15122,8 @@
 1D6A1;MATHEMATICAL MONOSPACE SMALL X;Ll;0;L;<font> 0078;;;;N;;;;;
 1D6A2;MATHEMATICAL MONOSPACE SMALL Y;Ll;0;L;<font> 0079;;;;N;;;;;
 1D6A3;MATHEMATICAL MONOSPACE SMALL Z;Ll;0;L;<font> 007A;;;;N;;;;;
+1D6A4;MATHEMATICAL ITALIC SMALL DOTLESS I;Ll;0;L;<font> 0131;;;;N;;;;;
+1D6A5;MATHEMATICAL ITALIC SMALL DOTLESS J;Ll;0;L;<font> 0237;;;;N;;;;;
 1D6A8;MATHEMATICAL BOLD CAPITAL ALPHA;Lu;0;L;<font> 0391;;;;N;;;;;
 1D6A9;MATHEMATICAL BOLD CAPITAL BETA;Lu;0;L;<font> 0392;;;;N;;;;;
 1D6AA;MATHEMATICAL BOLD CAPITAL GAMMA;Lu;0;L;<font> 0393;;;;N;;;;;
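
For readers checking the UnicodeData.txt changes by hand, each record is a single line of 15 semicolon-separated fields. The Python sketch below pulls out a few of them: general category, combining class, bidi class, the mirrored flag and the simple uppercase mapping. The record type and function name are hypothetical, and nothing here reflects how FriBidi's own table generators read the file.

```python
from typing import NamedTuple, Optional


class UcdRecord(NamedTuple):
    code: int
    name: str
    general_category: str
    combining_class: int
    bidi_class: str
    mirrored: bool
    uppercase: Optional[int]


def parse_unicode_data_line(line):
    """Parse one UnicodeData.txt record into a UcdRecord."""
    fields = line.rstrip("\n").split(";")
    if len(fields) != 15:
        raise ValueError("expected 15 fields, got %d" % len(fields))
    return UcdRecord(
        code=int(fields[0], 16),
        name=fields[1],
        general_category=fields[2],
        combining_class=int(fields[3]),
        bidi_class=fields[4],
        mirrored=(fields[9] == "Y"),
        # Field 12 is the simple uppercase mapping; it is empty when absent.
        uppercase=int(fields[12], 16) if fields[12] else None,
    )
```

Applied to the old and new 019A lines above, the only difference in these fields is the uppercase mapping, which goes from empty to 023D.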


