[utf-8] Aspell and UTF-8 Update
Noah Levitt
nlevitt@columbia.edu
Sun, 22 Feb 2004 19:49:39 -0500
On Sun, Feb 22, 2004 at 5:19:15 -0500, Kevin Atkinson wrote:
>
> Which languages, with a phonetic like alphabet, do you think that Aspell
> 0.51 will NOT be able to handle due to one of the following?
>
> 1) More than 220 distinct characters?
I'm not sure exactly what a “phonetic like alphabet” is, or why
Aspell should be limited to such scripts, but I assume you
mean to exclude Han and Korean. In that case, it appears
that the only languages you won’t be able to support are
those that use the Ethiopic script.
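
As a quick check of the 220-character question, here is a minimal Python sketch that compares some of the per-language counts from the P.S. of this message against the limit. The counts are copied from that list; the function name over_limit and the choice of sample languages are purely illustrative, and the CJK entries are left out on the assumption stated above.

```python
# Character counts copied from the P.S. of this message
# (fontconfig fc-lang/*.orth). Only a sample of the larger
# non-CJK entries is included here.
COUNTS = {
    "am": 244, "gez": 194, "ti_er": 240, "ti_et": 262, "tig": 200,
    "vi": 174, "az": 145, "ur": 133, "iu": 131,
}

LIMIT = 220  # the limit named in the quoted question


def over_limit(counts, limit=LIMIT):
    """Return the language codes whose count exceeds the limit, sorted."""
    return sorted(lang for lang, n in counts.items() if n > limit)


print(over_limit(COUNTS))  # ['am', 'ti_er', 'ti_et']
```

By these figures am, ti_er and ti_et exceed 220, while gez (194) and tig (200) stay under it.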
Noah
P.S. Out of curiosity, I did some counting on
fontconfig/fc-lang/*.orth. These numbers are decent
estimates of the number of word characters (i.e.
excluding punctuation, digits, &c.) used in each
language.
aa.orth: 60
ab.orth: 82
af.orth: 67
am.orth: 244
ar.orth: 124
ast.orth: 70
ava.orth: 67
ay.orth: 58
az_ir.orth: 121
az.orth: 145
bam.orth: 58
ba.orth: 81
be.orth: 67
bg.orth: 56
bho.orth: 65
bh.orth: 65
bin.orth: 76
bi.orth: 56
bn.orth: 77
bo.orth: 88
br.orth: 62
bs.orth: 60
bua.orth: 70
ca.orth: 72
ce.orth: 67
chm.orth: 76
ch.orth: 56
chr.orth: 84
co.orth: 82
cs.orth: 80
cu.orth: 94
cv.orth: 74
cy.orth: 76
da.orth: 68
de.orth: 57
dz.orth: 88
el.orth: 66
en.orth: 68
eo.orth: 56
es.orth: 64
et.orth: 62
eu.orth: 54
fa.orth: 120
fi.orth: 60
fj.orth: 50
fo.orth: 66
fr.orth: 82
ful.orth: 59
fur.orth: 62
fy.orth: 73
ga.orth: 78
gd.orth: 68
gez.orth: 194
gl.orth: 64
gn.orth: 68
gu.orth: 67
gv.orth: 52
ha.orth: 57
haw.orth: 56
he.orth: 26
hi.orth: 65
ho.orth: 50
hr.orth: 55
hu.orth: 66
hy.orth: 75
ia.orth: 50
ibo.orth: 56
id.orth: 52
ie.orth: 50
ik.orth: 68
io.orth: 50
is.orth: 68
it.orth: 66
iu.orth: 131
ja.orth: 6538
kaa.orth: 78
ka.orth: 40
ki.orth: 54
kk.orth: 70
kl.orth: 77
km.orth: 69
kn.orth: 68
kok.orth: 65
ko.orth: 16153
ks.orth: 65
ku_ir.orth: 26
kum.orth: 66
ku.orth: 60
kv.orth: 70
kw.orth: 56
ky.orth: 70
la.orth: 61
lb.orth: 73
lez.orth: 67
lo.orth: 53
lt.orth: 59
lv.orth: 63
mg.orth: 54
mh.orth: 60
mi.orth: 56
mk.orth: 38
ml.orth: 68
mn.orth: 125
mo.orth: 123
mr.orth: 65
mt.orth: 66
my.orth: 44
nb.orth: 68
ne.orth: 65
nl.orth: 80
nn.orth: 68
no.orth: 68
ny.orth: 51
oc.orth: 68
om.orth: 50
or.orth: 65
os.orth: 66
pl.orth: 60
ps_af.orth: 44
ps_pk.orth: 44
pt.orth: 80
rm.orth: 64
ro.orth: 58
ru.orth: 67
sah.orth: 76
sa.orth: 65
sco.orth: 53
sel.orth: 66
se.orth: 58
sh.orth: 75
si.orth: 70
sk.orth: 75
sl.orth: 60
sma.orth: 58
smj.orth: 58
smn.orth: 61
sm.orth: 51
sms.orth: 69
so.orth: 50
sq.orth: 54
sr.orth: 75
sv.orth: 66
sw.orth: 50
syr.orth: 43
ta.orth: 36
te.orth: 68
tg.orth: 78
th.orth: 85
ti_er.orth: 240
ti_et.orth: 262
tig.orth: 200
tk.orth: 74
tl.orth: 15
tn.orth: 54
to.orth: 51
tr.orth: 68
ts.orth: 50
tt.orth: 76
tw.orth: 71
tyv.orth: 70
ug.orth: 124
uk.orth: 71
ur.orth: 133
uz.orth: 68
ven.orth: 55
vi.orth: 174
vo.orth: 48
vot.orth: 58
wa.orth: 68
wen.orth: 63
wo.orth: 63
xh.orth: 50
yap.orth: 56
yi.orth: 26
yo.orth: 103
zh_cn.orth: 6765
zh_hk.orth: 2213
zh_mo.orth: 13063
zh_sg.orth: 6765
zh_tw.orth: 13063
zu.orth: 50
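
For anyone who wants to reproduce this kind of count, the following is a rough sketch in Python, not the method actually used for the numbers above. It assumes each .orth line is either a single hexadecimal code point ("00E9") or a range ("0041-005A"), and that "#" starts a comment. It skips "include" lines rather than following them, so results may differ from the figures listed. The function name count_codepoints and the sample text are hypothetical.

```python
import re

# Assumed entry shape: "0041" or "0041-005A" (hexadecimal code points).
ENTRY = re.compile(r"^([0-9A-Fa-f]+)(?:\s*-\s*([0-9A-Fa-f]+))?$")


def count_codepoints(text):
    """Count the distinct code points listed in .orth-style text."""
    seen = set()
    for raw in text.splitlines():
        # Drop anything after "#" (assumed comment marker).
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and "include" directives (not followed here).
        if not line or line.startswith("include"):
            continue
        m = ENTRY.match(line)
        if m is None:
            continue  # ignore lines this sketch does not understand
        first = int(m.group(1), 16)
        last = int(m.group(2), 16) if m.group(2) else first
        seen.update(range(first, last + 1))
    return len(seen)


# Hypothetical sample, not a real fc-lang file.
SAMPLE = """\
# sample orthography
0041-005A
0061-007A
00E9   # e with acute
"""

print(count_codepoints(SAMPLE))  # 26 + 26 + 1 = 53
```

Collecting the code points in a set means overlapping ranges are counted only once.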