[Fontconfig] Can we use base 16, and not 85, for ASCII charset representations?

W. Trevor King wking at tremily.us
Sat Sep 28 13:37:58 PDT 2013


On Wed, Sep 25, 2013 at 11:25:16AM +0900, Akira TAGOH wrote:
> On Wed, Sep 25, 2013 at 1:22 AM, W. Trevor King <wking at tremily.us> wrote:
> > So you think the:
> >
> >   <page>: <mask>
> >
> > syntax is too verbose in hex?  I can't think of a good alternative off
> 
> Yes. it depends on how many glyphs a font has though, several tens of
> thousand hex code as the output is too much.

I think we are miscommunicating here ;).  I was proposing input/output
be in the

  <base-code-point>: <mask>
    for example:
  00002100: 366af26f 03e040d5 fff88000 ffffffff 0fff001f 30004140 03fff803 00000000

syntax [1], not in Behdad's

  {<code-point>, …}
    for example:
  {00002100, 00002101, 00002102, …}

syntax.  I think that supporting '{<code-point>, …}' as an alternative
input format is also a good idea, but it might be too verbose for
output (taking over 28 times as many characters for dense pages [2]).
On the other hand, the '{<code-point>, …}' syntax is a lot easier to
parse programmatically.  I'd like to use 'charset' for '{<code-point>,
…}', and *not* display it by default.  Instead, I'd add a new key
('map') that uses the '<page>: <mask>' format, and use that as the
default output.
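
A minimal sketch of how the two syntaxes relate, in Python.  The
function names are hypothetical, and the bit order inside each mask
(least-significant bit first) is an assumption inferred from the sample
data in this message, not taken from the Fontconfig source.

```python
PAGE_SIZE = 256                          # code points per page
WORD_BITS = 32                           # bits per mask word
WORDS_PER_PAGE = PAGE_SIZE // WORD_BITS  # 8 masks per page


def code_points_to_pages(code_points):
    """Group code points into {base_code_point: [8 mask words]}."""
    pages = {}
    for cp in code_points:
        base = cp & ~0xFF    # clear the low 8 bits to get the page base
        offset = cp & 0xFF   # position of the code point within the page
        words = pages.setdefault(base, [0] * WORDS_PER_PAGE)
        # Assumption: bit 0 of each word is the lowest code point it covers.
        words[offset // WORD_BITS] |= 1 << (offset % WORD_BITS)
    return pages


def format_map(pages):
    """Render pages as '<base-code-point>: <mask> ...', one page per line."""
    lines = []
    for base in sorted(pages):
        masks = " ".join(f"{word:08x}" for word in pages[base])
        lines.append(f"{base:08x}: {masks}")
    return "\n".join(lines)


def format_charset(code_points):
    """Render code points as '{<code-point>,...}'."""
    return "{" + ",".join(f"{cp:08x}" for cp in sorted(code_points)) + "}"


if __name__ == "__main__":
    # Printable ASCII, 0x20 through 0x7e.
    print(format_map(code_points_to_pages(range(0x20, 0x7F))))
    # 00000000: 00000000 ffffffff ffffffff 7fffffff 00000000 00000000 00000000 00000000

    # Length comparison for one fully populated page.
    dense = range(0x2100, 0x2200)
    ratio = len(format_charset(dense)) / len(format_map(code_points_to_pages(dense)))
    print(f"charset is {ratio:.1f}x longer than map for a dense page")
```

For a fully populated page the charset string comes to 2305 characters
against 81 for the map line, which matches the "over 28 times" estimate.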

Of the two formats, I think the '{<code-point>, …}' format is more
readable in 'fc-list [pattern] {element ...}' mode, despite the longer
lines you're going to get.  It seems like the element output is
restricted to a single line per match, and I'd rather have all 22447
greppable characters in:

  :charset={00000020,00000021,00000022,00000023,00000024,00000025,00000026,00000027,00000028,00000029,0000002a,0000002b,0000002c,0000002d,0000002e,0000002f,00000030,00000031,00000032,00000033,00000034,00000035,00000036,00000037,00000038,00000039,0000003a,0000003b,0000003c,0000003d,0000003e,0000003f,00000040,00000041,00000042,00000043,00000044,00000045,00000046,00000047,00000048,00000049,0000004a,0000004b,0000004c,0000004d,0000004e,0000004f,00000050,00000051,00000052,…redacted for sanity…,0000fb05,0000fb06,0000fffd}}

than the more compact but complicated 2299 characters in [3]:

  :map=00000000 [ 00000000 ffffffff ffffffff 7fffffff 00000000 ffffffff ffffffff ffffffff ]|00000100 [ ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ]|00000200 [ ffffffff ffffffff ffff000a ffffffff ffffffff ffffffff ffffffff 0000701f ]|00000300 [ ffffffff ffffffff ffff7fff 7c30ffff ffffd7f0 fffffffb ffff7fff eb7f0003 ]|00000400 [ ffffffff ffffffff ffffffff f0ffffff fffff008 ffffffff ffff1f9f 03ffffff ]|00000500 [ 00000000 00000000 00000000 00000000 00000000 ffff0000 ffff004f 001f07ff ]|00001d00 [ 00200000 00000000 0dcde798 00000000 10000000 08000001 00000000 00000000 ]|00001e00 [ ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ]|00001f00 [ 3f3fffff ffffffff aaff3f3f 3fffffff ffffffff ffdfffff efcfffdf 7fdcffff ]|00002000 [ ffff0fff 7eff80ff 00008f94 fff30000 1fff7fff 0002999c 00000000 00000000 ]|00002100 [ 366af26f 03e040d5 fff88000 ffffffff 0fff001f 30004140 03fff803 00000000 ]|00002200 [ ffffbfff 10400ff8 02000322 0003cc37 01e0003c 0000007c 00000020 0000c000 ]|00002300 [ 0001000d 00000ec3 00000000 20000000 00000001 00000000 00080000 00000000 ]|00002400 [ 00000000 00000008 00000000 ffffffff 000000ff ffc00000 ffffffff ffffffff ]|00002500 [ 00000000 00000000 00000000 00000000 00000000 30cc0003 00ffcec3 00000040 ]|00002600 [ 4a000020 fe008080 000fffff 00001e69 01200000 0000203c 00000000 00000000 ]|00002700 [ 00000000 00000000 00000000 ffc00080 00000000 00000000 00000004 000000c0 ]|00002c00 [ 00000000 00000000 00000000 00f01fff 00000000 00000000 00000000 00000000 ]|00002e00 [ 0180073c 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ]|0000a700 [ 00000000 00000003 00000000 00000000 00000000 00000000 00000000 00000000 ]|0000e000 [ 8fff8fff 120073ff ffffff07 ffff2fff ff87ffff f07fffff 0639d47f 0a7fff00 ]|0000e100 [ ff0000f3 00010703 7fbffe00 0000003f 00000000 00000000 00000000 00000000 ]|0000e300 [ 00000000 00000000 ff000000 0000001f 00000000 00000000 00000000 00000000 ]|0000e400 [ 00000000 00000001 00000000 00000000 00000000 00000000 00000000 00000000 ]|0000f600 [ 00000000 01000000 00000000 00000000 00000000 47ffc000 00000000 00000000 ]|0000fb00 [ 0000007f 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ]|0000ff00 [ 00000000 00000000 00000000 00000000 00000000 00000000 00000000 20000000 ]

After all, I don't see a use case for storing all of this output
(which is verbose either way) in anything other than a temporary file
or the buffer of a pager or script.  And in that case, who cares about
a few megs?

Thoughts?
Trevor

[1]: The 'map' syntax is currently used in 'fc-list -v', except that
  the implementation truncates the base code point.
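[2]: A dense page lists 256 code points in '{<code-point>, …}' form but
  only 9 fields (the base code point plus 8 masks) in '<page>: <mask>'
  form: 256 / 9 = 28.44…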
[3]: I tweaked the map format a bit to try and preserve *some* visual
  clarity while avoiding colons and newlines.
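
For a script that wants code points back out of the tweaked map format
in [3], a parser could be sketched roughly as below.  The function
names and the regular expression are hypothetical, and the bit order
(least-significant bit first within each mask) is again an assumption
read off the sample data above rather than a documented guarantee.

```python
import re

# One entry looks like: '0000fb00 [ 0000007f 00000000 ... 00000000 ]'
ENTRY_RE = re.compile(r"([0-9a-f]{8}) \[ ((?:[0-9a-f]{8} ){8})\]")


def parse_map(text):
    """Parse ':map=<base> [ <mask> x8 ]|<base> [ ... ]' into {base: [masks]}."""
    # Collapse runs of whitespace, including any line wraps added in transit.
    text = " ".join(text.split())
    if text.startswith(":map="):
        text = text[len(":map="):]
    pages = {}
    for entry in text.split("|"):
        match = ENTRY_RE.fullmatch(entry.strip())
        if match is None:
            raise ValueError(f"malformed map entry: {entry!r}")
        base = int(match.group(1), 16)
        pages[base] = [int(word, 16) for word in match.group(2).split()]
    return pages


def pages_to_code_points(pages):
    """Expand {base: [masks]} into a sorted list of code points."""
    code_points = []
    for base in sorted(pages):
        for index, word in enumerate(pages[base]):
            for bit in range(32):
                # Assumption: bit 0 is the lowest code point in the word.
                if (word >> bit) & 1:
                    code_points.append(base + index * 32 + bit)
    return code_points


if __name__ == "__main__":
    sample = (
        ":map=0000fb00 [ 0000007f 00000000 00000000 00000000 "
        "00000000 00000000 00000000 00000000 ]|"
        "0000ff00 [ 00000000 00000000 00000000 00000000 "
        "00000000 00000000 00000000 20000000 ]"
    )
    print(",".join(f"{cp:08x}" for cp in pages_to_code_points(parse_map(sample))))
```

Run on the last two pages of the sample, this yields 0000fb00 through
0000fb06 followed by 0000fffd, which agrees with the tail of the
'charset' output shown earlier.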

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy