Adding Extension for Experimental Thai Spelling

Michael Stahl mstahl at redhat.com
Mon Feb 13 02:13:22 PST 2012


On 11/02/12 17:23, Richard Wordingham wrote:
> As I understand it, the lack of a usable Thai spell-checker for
> LibreOffice (unlike, say, a Khmer spell-checker) is due to the Thai
> break iterator.  (I had expected Thai and Khmer to face similar
> problems, for neither has a visible word separator and syllable
> boundaries are often unclear in both.)  Tagging Thai script text as
> Khmer does not work (at least, not in Version 3.4.5); the word
> boundaries are still determined by the Thai break iterator.
> 
> Is it possible to create an experimental alternative to the Thai
> break iterator that can be shared with other people as a LibreOffice
> extension? I would be prepared to routinely use U+200B ZERO WIDTH SPACE
> (ZWSP) to separate words in the Thai script, but I suspect Thais would
> not.  Also, I can see my first useful version fouling up the
> rendering of pre-existing text.  I can't work out how to create a break
> iterator as an *extension*. Could someone please advise me how, e.g. by
> pointing to the documentation or an example.  I can find documentation
> for *publishing* an extension, but that does not address *creating* an
> extension.

hi Richard,

while i don't know anything about break iterators, since OOo 3.0.1 there
is a new grammar checking API, which AFAIK operates on a whole paragraph
at a time; perhaps that API would make implementing a spelling checker
for such languages easier (if LO cannot determine the word boundaries
then the checker can always do it on its own).

http://wiki.services.openoffice.org/wiki/Grammar_Checking
http://www.openoffice.org/lingucomponent/grammar.html
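
to illustrate the "checker can always do it on its own" idea, here is a
minimal sketch, assuming the checker is simply handed the paragraph as a
string. it does not use the real grammar checking API; the names
segment() and misspellings(), the toy lexicon and the greedy
longest-match strategy are all hypothetical, chosen only for
illustration.

```python
# Hypothetical sketch, NOT the LibreOffice/OOo API: the checker receives a
# whole paragraph and finds word boundaries itself by greedy longest match
# against a lexicon (assumed to be a non-empty set of known words).

def segment(paragraph, lexicon):
    """Split unseparated text into (start, end, known) spans."""
    max_len = max(map(len, lexicon))
    spans = []
    i = 0
    while i < len(paragraph):
        # Try the longest candidate starting at i first.
        for j in range(min(len(paragraph), i + max_len), i, -1):
            if paragraph[i:j] in lexicon:
                spans.append((i, j, True))
                i = j
                break
        else:
            # No known word starts here: extend the previous unknown span
            # if it ends at i, otherwise start a new one.
            if spans and not spans[-1][2] and spans[-1][1] == i:
                spans[-1] = (spans[-1][0], i + 1, False)
            else:
                spans.append((i, i + 1, False))
            i += 1
    return spans


def misspellings(paragraph, lexicon):
    """Return the substrings that no lexicon word accounts for."""
    return [paragraph[s:e] for s, e, known in segment(paragraph, lexicon)
            if not known]
```

for example, misspellings("ภาษาxxไทย", {"ภาษา", "ไทย"}) returns ["xx"].
greedy matching is only the simplest possible choice; a real Thai
segmenter would have to deal with ambiguous boundaries, which this
sketch ignores.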

regards,
 michael


