[Fontconfig] Supporting Unicode variation selectors

Wed Jun 17 20:15:38 PDT 2015

Hi,

I gather the current discussion is focused on the variation
selectors themselves, and that you are not attempting to make
fontconfig handle IVS-specific information. For example,
a question such as "does this font support the IVSs defined
for Adobe-Japan1 or not?" would be a future task,
separate from the current discussion.
Am I understanding correctly?

Regards,
mpsuzuki

Behdad Esfahbod wrote:
> Hi everyone,
> 
> Currently fontconfig does not support Unicode variation selectors.  Let's fix that.
> 
> Unicode defines 256 generic variation selectors, in the following ranges:
> 
> U+FE00 VARIATION SELECTOR-1..U+FE0F VARIATION SELECTOR-16
> U+E0100 VARIATION SELECTOR-17..U+E01EF VARIATION SELECTOR-256
> 
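The two ranges above map onto selector numbers 1..256 with simple arithmetic. A minimal sketch follows; the helper names vs_number and vs_codepoint are hypothetical and are not part of fontconfig or FreeType.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a fontconfig API): returns N for
 * VARIATION SELECTOR-N (1..256), or 0 if cp is not a variation selector. */
static unsigned vs_number(uint32_t cp)
{
    if (cp >= 0xFE00 && cp <= 0xFE0F)
        return (unsigned)(cp - 0xFE00) + 1;    /* VS1..VS16 */
    if (cp >= 0xE0100 && cp <= 0xE01EF)
        return (unsigned)(cp - 0xE0100) + 17;  /* VS17..VS256 */
    return 0;
}

/* Hypothetical inverse: selector number (1..256) back to its code point. */
static uint32_t vs_codepoint(unsigned n)
{
    assert(n >= 1 && n <= 256);
    return n <= 16 ? 0xFE00u + (n - 1) : 0xE0100u + (n - 17);
}
```

Returning 0 for "not a selector" is a choice made for this sketch; it leaves 257 distinct states (no selector plus 256 selectors) for any packed representation to carry.
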
> OpenType encodes those in cmap subtable format 14.  Fonts as such can encode
> pairs of characters in the cmap instead of one.  The second of the pair is
> supposed to be a variation selector, though nothing in the table format
> enforces that.
> 
> To support these in fontconfig, two changes are needed:
> 
>   - Extend FcCharSet to be able to carry variation sequences as well.
> 
>   - Add an FcFreeTypeCharIndex variant that takes a variation selector (à la
> FT_Face_GetCharVariantIndex).
> 
> Adding the latter is rather trivial.  For the former, it would be easiest if
> we encode the variation selector and the base Unicode character in one 32-bit
> integer, and make sure FcCharSet handles that efficiently (this probably is
> currently not the case).
> 
> Ideally, we'd want to encode the sequence U followed by VSx (where VSx is
> VARIATION SELECTOR-x) as (U + (x << 24)).  This would use the high byte of the
> 32-bit unsigned integer for the variation selector number.  The only problem
> is: there are 256, not 255, variation selectors, and a high byte of zero
> already means a bare character with no selector.  I submitted a proposal to
> Unicode to commit to not using the last one, but it was not accepted.
> Currently up to ~240 are used.
> 
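To make the 256-versus-255 problem concrete, here is a minimal sketch of the high-byte packing. The function name pack24 and the use of U+8FBB as an arbitrary base character are assumptions made for illustration; nothing here is existing fontconfig code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packing for illustration only: the high byte holds the
 * selector number, and a high byte of 0 means "no selector". */
static uint32_t pack24(uint32_t base, unsigned vs)
{
    assert(base <= 0x10FFFF);            /* base must be a Unicode code point */
    /* For vs == 256 the shifted value is 2^32, which wraps to 0 in 32 bits. */
    return base | ((uint32_t)vs << 24);
}
```

With this packing, pack24(0x8FBB, 1) is 0x01008FBB and pack24(0x8FBB, 255) is 0xFF008FBB, both easy to read in hex. pack24(0x8FBB, 256) wraps around to 0x00008FBB, which is indistinguishable from the bare character.
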
> Failing that most beautiful scheme, we can use a different shift.  21 would be
> the next most natural, given that Unicode code points fit in 21 bits.  It would
> just be much harder to read the hex of a 32-bit number in the FcCharSet verbose
> output and know what it means.
> 
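A shift of 21 leaves 11 bits for the selector number, so all 256 selectors plus "no selector" fit. A minimal sketch, assuming hypothetical names (VS_SHIFT, BASE_MASK, pack21, unpack_base, unpack_vs) that do not exist in fontconfig:

```c
#include <assert.h>
#include <stdint.h>

#define VS_SHIFT  21
#define BASE_MASK ((UINT32_C(1) << VS_SHIFT) - 1)   /* 0x1FFFFF */

/* Hypothetical packing: low 21 bits = base code point,
 * upper 11 bits = selector number (0 = no selector, 1..256 = VS1..VS256). */
static uint32_t pack21(uint32_t base, unsigned vs)
{
    assert(base <= 0x10FFFF);
    assert(vs <= 256);
    return base | ((uint32_t)vs << VS_SHIFT);
}

/* Recover the base code point from a packed key. */
static uint32_t unpack_base(uint32_t key)
{
    return key & BASE_MASK;
}

/* Recover the selector number (0 if the key is a bare character). */
static unsigned unpack_vs(uint32_t key)
{
    return (unsigned)(key >> VS_SHIFT);
}
```

The readability cost shows up right away: pack21(0x8FBB, 17) is 0x02208FBB, where the selector number 17 (0x11) no longer lines up with a hex digit boundary.
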
> So, what do people think?  Let's make this happen.
> 
> Thanks,