Adding Languages to Writer's Character, Font Menu

Richard Wordingham richard.wordingham at ntlworld.com
Tue Jun 30 14:18:46 PDT 2015


On Tue, 30 Jun 2015 17:48:05 +0200
Eike Rathke <erack at redhat.com> wrote:

> On Monday, 2015-06-29 20:40:46 +0200, Khaled Hosny wrote:

> > We already handle this at the text shaping level in VCL for
> > platforms where HarfBuzz is used.
 
> I think we talk about two different things here.

Yes.  Khaled and I are focused on handling text, whether fundamentally
present or generated by field codes and the like.  What you are talking
about makes most sense when there is no relevant user-input text.

> My view is from
> correct language tag attribution that we need anyway, for document
> storage

I don't understand that one.

> and spell-checkers

Seems to work for 'unsupported' nod-TH.  Tai Tham script is encountered
and identified as complex (as demonstrated by the choice of font), so the
language is nod-TH and the text is corrected using the nod-TH spelling
dictionaries.

(Mind you, they're only populated as nod-Lana-TH.  The fun starts when
we want to distinguish what might be called nod-Thai-TH-etymological,
nod-Thai-TH-Chiangmai and nod-Thai-TH-Chiangrai.)

> and locale dependent representation.

Presumably for generated text.  Yes, here language and country will in
general be inadequate.

> When
> I mention "language tag" I'm always talking about BCP 47 language
> tags. You, and possibly Richard, have the runtime view and what could
> be automatically detected. So, even if detected automatically we'll
> have to assign a language tag that for the non-default script of a
> language includes the ISO 15924 script code.

> <snip> arbitrary "Western"/CTL/CJK classification <snip>

> The correct route to go is probably to
> assign known scripts to these classes, whether detected automatically
> or not,

Which is already being done, though conceivably going directly from
character to class.

> and distribute language tags according to their (implied or
> not) script over those classes.

I'm not sure I follow you here.  A supported language tag will have
corresponding strings for automatically generated text, and these
strings will generally imply the font.  The only exception I can think
of is common script text, where perhaps script information will be
required to select the styling.  This just requires a default script
for each supported language code (i.e. minimal BCP 47 tag), though we
could get away with a default script class.
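
To make that concrete, here is a minimal sketch in Python of the lookup
I have in mind.  The table contents and the names DEFAULT_SCRIPT,
SCRIPT_CLASS, script_of and script_class_of are made up for
illustration; they are not LibreOffice's actual data or API.  An
explicit ISO 15924 script subtag wins, otherwise the language's default
script is used, and the script is then mapped to one of the three
classes.

```python
# Hypothetical tables for illustration only; not LibreOffice's real data.
DEFAULT_SCRIPT = {   # minimal BCP 47 tag -> assumed default script
    "en": "Latn",
    "th": "Thai",
    "nod": "Lana",
    "ja": "Jpan",
}
SCRIPT_CLASS = {     # ISO 15924 script code -> script class
    "Latn": "Western",
    "Thai": "CTL",
    "Lana": "CTL",
    "Jpan": "CJK",
}


def script_of(tag):
    """Return the explicit script subtag of a tag, else the default script."""
    subtags = tag.split("-")
    for sub in subtags[1:]:
        if len(sub) == 1:
            # A singleton starts an extension or private-use sequence.
            break
        if len(sub) == 4 and sub.isalpha():
            # Four letters after the language subtag is a script subtag.
            return sub.title()
    return DEFAULT_SCRIPT.get(subtags[0].lower())


def script_class_of(tag):
    """Return "Western", "CTL" or "CJK", or None if the tag is unknown."""
    return SCRIPT_CLASS.get(script_of(tag))
```

With these tables, nod-TH falls back to the assumed default script Lana,
while nod-Thai-TH takes its script from the tag itself; both end up in
the CTL class.  An unknown language simply yields no class.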

Richard.

