[Fontconfig] Supporting Unicode variation selectors

Behdad Esfahbod behdad at behdad.org
Wed Jun 17 19:13:00 PDT 2015


Hi everyone,

Currently fontconfig does not support Unicode variation selectors.  Let's fix that.

Unicode defines 256 generic variation selectors, in the following ranges:

U+FE00 VARIATION SELECTOR-1..U+FE0F VARIATION SELECTOR-16
U+E0100 VARIATION SELECTOR-17..U+E01EF VARIATION SELECTOR-256
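
As a rough sketch of those two ranges, mapping a code point to its selector
number could look like the following.  The helper name vs_number is made up
for illustration and is not part of fontconfig.

```c
#include <stdint.h>

/* Hypothetical helper (not fontconfig API): return the variation selector
 * number 1..256 for a code point, or 0 if it is not a variation selector. */
static unsigned int
vs_number (uint32_t cp)
{
    if (cp >= 0xFE00 && cp <= 0xFE0F)
        return cp - 0xFE00 + 1;       /* VARIATION SELECTOR-1..16 */
    if (cp >= 0xE0100 && cp <= 0xE01EF)
        return cp - 0xE0100 + 17;     /* VARIATION SELECTOR-17..256 */
    return 0;
}
```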

OpenType encodes those in cmap subtable format 14.  Fonts as such can encode
pairs of characters in the cmap instead of one.  The second of the pair is
supposed to be a variation selector, though nothing in the table format
enforces that.

To support these in fontconfig, two changes are needed:

  - Extend FcCharSet so that it can carry variation sequences as well.

  - Add a variant of FcFreeTypeCharIndex that takes a variation selector (a la
FT_Face_GetCharVariantIndex).

Adding the latter is rather trivial.  For the former, it would be easiest if
we encode the variation selector and the base Unicode character in one 32-bit
integer, and make sure FcCharSet handles that efficiently (this probably is
currently not the case).

Ideally, we'd want to encode the sequence U followed by VSx (where VSx is
VARIATION SELECTOR-x) as (U + (x << 24)).  This would use the high byte of the
32-bit unsigned integer for the variation selector number, with zero meaning
"no variation selector".  The only problem is: there are 256, not 255,
variation selectors, so they do not all fit in one byte next to that zero.  I
submitted a proposal to Unicode to commit to not using the last one, but that
was not accepted.  Currently up to ~240 are used.
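
To make the problem concrete, here is a minimal sketch of the shift-by-24
packing.  The name pack24 is hypothetical, and it assumes the selector is
passed as its number (1..256) with 0 reserved for a bare character.

```c
#include <stdint.h>

/* Hypothetical packing (not fontconfig API): base character in the low
 * 24 bits, variation selector number in the high byte. */
static uint32_t
pack24 (uint32_t u, unsigned int x)
{
    /* For x == 256 the shifted value does not fit in 32 bits and wraps
     * to 0, so the result is indistinguishable from the bare character. */
    return u | ((uint32_t) x << 24);
}
```

With this, pack24 (0x8FBB, 1) gives 0x01008FBB, which is easy to read in hex,
but pack24 (0x8FBB, 256) collapses to 0x00008FBB.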

Failing that most beautiful scheme, we can use a different shift.  21 would be
the next most natural, given that Unicode code points fit in 21 bits.  It would
just be much harder to read the hex form of a 32-bit number in the FcCharSet
verbose output and know what it means.
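
A sketch of the shift-by-21 alternative, again with made-up names (pack21,
unpack21_base, unpack21_vs) and the same assumption that the selector is
passed as its number 1..256, with 0 meaning no selector:

```c
#include <stdint.h>

#define VS_SHIFT   21
#define BASE_MASK  0x1FFFFFu   /* low 21 bits hold the base code point */

/* Hypothetical packing (not fontconfig API): the selector number sits
 * directly above the 21 bits needed for any Unicode code point. */
static uint32_t
pack21 (uint32_t u, unsigned int x)
{
    return (u & BASE_MASK) | ((uint32_t) x << VS_SHIFT);
}

/* Recover the base code point from a packed value. */
static uint32_t
unpack21_base (uint32_t v)
{
    return v & BASE_MASK;
}

/* Recover the selector number (0 if none) from a packed value. */
static unsigned int
unpack21_vs (uint32_t v)
{
    return v >> VS_SHIFT;
}
```

All 256 selectors fit this way, since 256 << 21 is still below 2^32, but
pack21 (0x8FBB, 1) comes out as 0x00208FBB, where the selector number is no
longer visible as a separate hex digit pair.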

So, what do people think?  Let's make this happen.

Thanks,
-- 
behdad
http://behdad.org/


