Tagging text as being in arbitrary complex-script languages

Richard Wordingham richard.wordingham at ntlworld.com
Fri Apr 19 02:32:34 UTC 2019


On Thu, 18 Apr 2019 20:40:01 +0100
Richard Wordingham <richard.wordingham at ntlworld.com> wrote:

> On Thu, 18 Apr 2019 12:25:11 +0200
> Eike Rathke <erack at redhat.com> wrote:

> > Though with sa-Latn
> > I doubt there's a use case, so I wouldn't call that "correct" in
> > common sense.  
> 
> So how do you suggest we tag Sanskrit in Latin script?

In answer to what was intended to be a rhetorical question, I suppose
und-Latn-t-sa-m0-iast and und-Latn-t-sa-m0-iso would work for the
normative forms. I've successfully loaded a mocked-up extension for the
former (as explicitly using a Western script), though I don't much like
the consequent tagging <style:text-properties ... fo:language="und"> in
the document's content.xml. That's a problem with the 't' extension.
Transliteration may change the language of place names in isolation,
but it doesn't really change the language of paragraphs of text.
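
To make the structure of such a tag concrete, here is a minimal sketch in
Python that splits a tag like und-Latn-t-sa-m0-iast into its parts. The
helper name split_t_tag is hypothetical, this is not how LibreOffice
parses tags, and it assumes a well-formed tag with no variants or
private-use subtags before the 't' singleton. It only illustrates that
the primary language subtag is "und" while "sa" survives solely as the
source language inside the extension.

```python
def split_t_tag(tag):
    """Split a BCP 47 tag that may carry a 't' (transformed content) extension.

    Hypothetical helper for illustration only; assumes a well-formed tag.
    """
    subtags = tag.split("-")
    lowered = [s.lower() for s in subtags]

    # The 't' singleton can never be the primary language, so search from index 1.
    t_at = lowered.index("t", 1) if "t" in lowered[1:] else len(subtags)
    base = subtags[:t_at]
    ext = lowered[t_at + 1:]

    language = base[0].lower()
    # A script subtag is exactly four letters, conventionally title-cased.
    script = next(
        (s.title() for s in base[1:] if len(s) == 4 and s.isalpha()), None
    )

    source = []   # subtags of the source language, e.g. ["sa"]
    fields = {}   # field key -> value subtags, e.g. {"m0": ["iast"]}
    key = None
    for sub in ext:
        if len(sub) == 1:
            break  # another singleton starts a different extension
        if len(sub) == 2 and sub[0].isalpha() and sub[1].isdigit():
            key = sub  # field keys are a letter followed by a digit
            fields[key] = []
        elif key is None:
            source.append(sub)
        else:
            fields[key].append(sub)

    return {
        "language": language,
        "script": script,
        "source": "-".join(source) or None,
        "fields": {k: "-".join(v) for k, v in fields.items()},
    }


if __name__ == "__main__":
    for example in ("und-Latn-t-sa-m0-iast", "und-Latn-t-sa-m0-iso", "sa-Latn"):
        print(example, split_t_tag(example))
```

For the two 't' tags the "language" entry comes out as "und", whereas for
sa-Latn it is "sa", which is the difference that shows up in the
fo:language attribute.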

Richard.

