[HarfBuzz] New Indic standard?

Jonathan Kew jonathan at jfkew.plus.com
Fri Aug 21 08:36:51 PDT 2009


On 21 Aug 2009, at 16:18, Ed Trager wrote:

> Hi, Behdad and everyone,
>
>> Something as simple as:
>>
>> INPUT: U+1234,U+5678 # some comment
>> FONT: Some Font Name 24
>> OUTPUT: <1,0,0>,<5,10,30>
>>
>> Where the output tuples are glyph id, X and Y.
>> Something like that can be processed into whatever format we end up
>> adopting later.
>
> I have an interest in helping out with aggregating / compiling /
> constructing an Indic scripts test data set along these lines ...
>
> ... whether I really have time to do it is another matter :-)
>
> But at this juncture, I'll just ask the questions first.  Compiling
> the INPUT data is straightforward and I have no questions about that
> -- although it still may take some time to achieve a suitable degree
> of comprehensiveness.
>
> However obtaining the OUTPUT data will require processing through some
> program.  A fairly simple command-line utility linked to a shaping
> engine is, in theory, all that is required.

Only if you have a shaping engine that *defines* the standard, which I  
don't believe we do.
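
Whichever engine ends up producing the OUTPUT lines, the record format  
quoted above is at least easy to handle mechanically. As a rough sketch  
only (the format is just a proposal at this point, and the names  
Glyph, ShapingTest and parse_record are made up for illustration), a  
parser in Python could look like this:

```python
import re
from typing import List, NamedTuple


class Glyph(NamedTuple):
    """One output tuple: glyph id, X and Y, as in the proposed format."""
    glyph_id: int
    x: int
    y: int


class ShapingTest(NamedTuple):
    """One INPUT/FONT/OUTPUT record (hypothetical container type)."""
    codepoints: List[int]
    font: str
    expected: List[Glyph]


def parse_record(text: str) -> ShapingTest:
    """Parse a single three-line record of the proposed test format."""
    fields = {}
    for line in text.strip().splitlines():
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        fields[key.strip().upper()] = value.strip()

    # Drop a trailing "# comment" from the INPUT line before parsing.
    raw_input = fields["INPUT"].split("#", 1)[0]
    codepoints = [int(cp, 16)
                  for cp in re.findall(r"U\+([0-9A-Fa-f]+)", raw_input)]

    # Each output tuple is written as <glyph id,X,Y>.
    expected = [Glyph(*map(int, match))
                for match in re.findall(r"<(-?\d+),(-?\d+),(-?\d+)>",
                                        fields["OUTPUT"])]
    return ShapingTest(codepoints, fields["FONT"], expected)
```

The FONT value is kept as an opaque string here, since the proposal  
does not say how the name and size are to be separated.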

>
> But my first question is: What shaping engine do we consider as the
> "Gold Standard" for correct processing for Indic scripts?  In other
> words, if I or someone else sits down to write such a utility program,
> should said program use Graphite, ATSUI / AAT, or Uniscribe as the
> "Gold Standard" shaping engine?

No. :)

First, it seems as though you're mixing a couple of issues here. Our  
primary concern at the moment (I think) is to be able to verify the  
processing of OpenType Indic fonts, which means checking that the  
correct shaping logic is being applied and that the font tables are  
being interpreted correctly. You can't test that by comparing with  
Graphite or AAT results, because they use entirely different tables,  
and there is no guarantee that the OpenType, Graphite, and AAT tables  
(assuming all are available) actually express the exact same behavior.

So (in this context) you have to consider OpenType engines only.  
Uniscribe is obviously the de facto "standard", but given that its  
behavior changes from release to release, that exhaustive  
documentation of precisely what it is supposed to do is not available,  
and that there have certainly been times, at least, when it has not  
completely conformed to the published OpenType specs, I don't think it  
should be taken as a "Gold Standard" to which everything else should  
conform. It is useful as *a* benchmark for comparison, but when a  
deviation is found, it is essential to go back to the spec and try to  
determine the correct result, not simply assume "Uniscribe is right,  
by definition".
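
To illustrate the distinction, here is a minimal sketch in Python of a  
comparison step that treats the benchmark as something to diff  
against, not as an oracle. It assumes each engine's output has already  
been reduced to a list of (glyph id, X, Y) tuples as in the proposed  
test format; the function names are hypothetical and no real engine  
API is being called:

```python
from itertools import zip_longest


def find_deviations(benchmark, candidate):
    """Return (index, benchmark_glyph, candidate_glyph) for each mismatch.

    Both arguments are lists of (glyph_id, x, y) tuples. A missing glyph
    on either side is reported as None.
    """
    return [(i, b, c)
            for i, (b, c) in enumerate(zip_longest(benchmark, candidate))
            if b != c]


def report(benchmark, candidate):
    """Describe each deviation without declaring either side correct."""
    lines = []
    for i, b, c in find_deviations(benchmark, candidate):
        lines.append(
            f"glyph {i}: benchmark={b} candidate={c} -> check against spec")
    return lines
```

Note that nothing here marks a deviation as a failure of the candidate  
engine. Each one is only a prompt to go and read the spec.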

There will be times, no doubt, when the spec is unclear; in that case,  
it needs to be discussed on the OpenType list (and/or other  
appropriate venues) and clarified. Obviously, where the spec is vague  
but can be interpreted as allowing Uniscribe's behavior, that is  
likely to be accepted as the "right" result.

For the (relatively rare) situation where a font provides tables for  
multiple layout technologies, it would indeed be useful to have tools  
based on all of OpenType, Graphite, and AAT, and the ability to  
compare the results. This would primarily be useful to the font  
developer, though, rather than to shaping engine implementers.

>
> If the answer is "Uniscribe", then must one use the latest version of
> Uniscribe in Vista or Windows 7? Would Windows 7 be better just
> because Vista as an OS is such a dog?
>
> My personal bias, for several reasons, would be to just use Graphite.
> Would anyone recommend or object to the idea of writing such a utility
> using Graphite?
>
> And of course if such a utility, or something close to it, already
> exists, then where can I get the code?

I believe there are some tools in the Graphite project already that  
could be used to do this, though I don't remember the details. But as  
I've tried to argue above, that's not really a useful "standard" for  
testing an OpenType shaping implementation. You'd end up primarily  
testing your ability to write Graphite code that has the exact same  
meaning as OpenType tables + shaping rules, rather than testing the  
shaping implementation itself.

JK