[Fontconfig] Supporting Unicode variation selectors

suzuki toshiya mpsuzuki at hiroshima-u.ac.jp
Wed Jun 17 20:28:09 PDT 2015


Good to hear about BCP 47, thanks.
SC34/WG2 once submitted a proposal to SC29/WG11
to add a nameID for storing the IVS collection name in OpenType,
but it was not welcomed (not rejected, but postponed
because more discussion is needed).

Regards,
mpsuzuki

Behdad Esfahbod wrote:
> On 15-06-17 08:15 PM, suzuki toshiya wrote:
>> Hi,
>>
>> I guess the current discussion is focused on the variation
>> selectors themselves, and you are not attempting to make
>> fontconfig handle IVS-specific info. For example, the
>> question "does this font support the IVSs defined for
>> Adobe-Japan1 or not?" might be a future task, separate
>> from the current discussion.
>> Am I understanding correctly?
> 
> Well, I didn't think of that.  But if there's a BCP 47 tag for that, we can
> add an orth file for it, sure.
> 
> 
>> Regards,
>> mpsuzuki
>>
>> Behdad Esfahbod wrote:
>>> Hi everyone,
>>>
>>> Currently fontconfig does not support Unicode variation selectors.  Let's fix that.
>>>
>>> Unicode defines 256 generic variation selectors, in the following ranges:
>>>
>>> U+FE00 VARIATION SELECTOR-1..U+FE0F VARIATION SELECTOR-16
>>> U+E0100 VARIATION SELECTOR-17..U+E01EF VARIATION SELECTOR-256
>>>
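
A minimal sketch of the two ranges quoted above, mapping a code point to its
selector number x (as in VARIATION SELECTOR-x) and back.  The function names
vs_number and vs_codepoint are hypothetical and are not part of fontconfig.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns x (1..256) if cp is VARIATION SELECTOR-x,
 * or 0 if cp is not a variation selector. */
unsigned vs_number(uint32_t cp)
{
    if (cp >= 0xFE00 && cp <= 0xFE0F)
        return (unsigned)(cp - 0xFE00) + 1;    /* VS1..VS16 */
    if (cp >= 0xE0100 && cp <= 0xE01EF)
        return (unsigned)(cp - 0xE0100) + 17;  /* VS17..VS256 */
    return 0;
}

/* Hypothetical inverse: code point of VARIATION SELECTOR-x, x in 1..256. */
uint32_t vs_codepoint(unsigned x)
{
    assert(x >= 1 && x <= 256);
    return x <= 16 ? 0xFE00u + (x - 1) : 0xE0100u + (x - 17);
}
```

The first range contributes 16 selectors and the second 240, which gives the
256 mentioned in the message.
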
>>> OpenType encodes those in cmap subtable format 14.  Such fonts can encode
>>> pairs of characters in the cmap instead of single characters.  The second of
>>> the pair is supposed to be a variation selector, though nothing in the table
>>> format enforces that.
>>>
>>> To support these in fontconfig, two changes are needed:
>>>
>>>   - Extend FcCharSet to be able to carry variation sequences as well.
>>>
>>>   - Add an FcFreeTypeCharIndex variant that takes a variation selector (à la
>>> FT_Face_GetCharVariantIndex).
>>>
>>> Adding the latter is rather trivial.  For the former, it would be easiest if
>>> we encode the variation selector and the base Unicode character in one 32-bit
>>> integer, and make sure FcCharSet handles that efficiently (this probably is
>>> currently not the case).
>>>
>>> Ideally, we'd want to encode the sequence U followed by VSx (where VSx is
>>> VARIATION SELECTOR-x) as (U + (x << 24)).  This uses the high byte of the
>>> 32-bit unsigned integer for the variation selector number.  The only problem
>>> is: there are 256, not 255, variation selectors, and the value 0 is needed
>>> for "no selector".  I submitted a proposal to Unicode to commit to not using
>>> the last one, but that was not accepted.  Currently up to ~240 are used.
>>>
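
The high-byte scheme described above can be sketched as follows.  This is only
an illustration under stated assumptions: pack24, base24 and sel24 are
hypothetical names, and selector number 0 is assumed to mean "no selector".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: pack base character u and selector number x into one
 * 32-bit value, with x in the high byte (0 = no selector).
 * Returns 1 on success, 0 if x does not fit in 8 bits. */
int pack24(uint32_t u, unsigned x, uint32_t *out)
{
    assert(u <= 0x10FFFF);      /* Unicode code points only */
    if (x > 0xFF)
        return 0;               /* VARIATION SELECTOR-256 needs a 9th bit */
    *out = u | ((uint32_t)x << 24);
    return 1;
}

/* Recover the two halves of a packed value. */
uint32_t base24(uint32_t packed) { return packed & 0x00FFFFFFu; }
unsigned sel24(uint32_t packed)  { return (unsigned)(packed >> 24); }
```

With this layout a base of U+8FBB and selector number 17 (0x11) packs to
0x11008FBB, so the selector can be read directly from the first two hex
digits.  Selector number 256 is the one value that cannot be represented.
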
>>> Failing that most beautiful scheme, we can use a different shift.  21 would be
>>> the next most natural, given that Unicode numbers fit in 21 bits.  It would
>>> just be much harder to read the hex form of a 32-bit number in the FcCharSet
>>> verbose output and know what it means.
>>>
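
A sketch of the shift-by-21 alternative, again with hypothetical names
(pack21, base21, sel21) and assuming 0 means "no selector".  The low 21 bits
hold the base character and the remaining 11 bits hold the selector number,
so all of 1..256 fit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: pack base character u (21 bits) and selector number x
 * (0 = no selector, 1..256 = VS1..VS256) into one 32-bit value. */
uint32_t pack21(uint32_t u, unsigned x)
{
    assert(u <= 0x1FFFFF);      /* base must fit in 21 bits */
    assert(x <= 256);           /* 256 << 21 still fits in 32 bits */
    return u | ((uint32_t)x << 21);
}

/* Recover the two halves of a packed value. */
uint32_t base21(uint32_t packed) { return packed & 0x1FFFFFu; }
unsigned sel21(uint32_t packed)  { return (unsigned)(packed >> 21); }
```

The readability cost is visible in a small example: base U+8FBB with selector
number 17 packs to 0x02208FBB here, where neither the base nor the selector
lines up with a hex digit boundary, whereas a shift of 24 would give
0x11008FBB.
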
>>> So, what do people think?  Let's make this happen.
>>>
>>> Thanks,
> 

