Adding Extension for Experimental Thai Spelling

sungkhum sungkhum at gmail.com
Thu Jul 12 09:50:08 PDT 2012


Thanks for your reply, Caolán.
I have submitted a bug and assigned it to you. I really appreciate you
being willing to look into this!
Here's the bug URL:
https://www.libreoffice.org/bugzilla/show_bug.cgi?id=52020
Please let me know if there is anything else I can provide. I have a little
working knowledge of ICU: I helped implement the break iterator for Khmer by
providing the dictionary and tests, but I am not a programmer by trade.
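
To make the expected behaviour concrete, here is a rough Python sketch of
what "the application splits the text, then hands each word to the spelling
checker" could look like. It is only an illustration: the greedy
longest-match segmenter is a toy stand-in for a dictionary-based word
breaker (it is not ICU's algorithm), and the two-entry word list, the
function names, and the sample phrase are my own assumptions, not taken
from LibreOffice or from the bug report.

```python
# Toy stand-in for a dictionary-based word breaker (NOT ICU's algorithm).
FIRST = "ភាសា"    # "language"
SECOND = "ខ្មែរ"   # "Khmer"
WORDS = {FIRST, SECOND}  # hypothetical two-entry dictionary


def segment(text, words=WORDS):
    """Greedy longest-match split; unknown characters become 1-char tokens."""
    max_len = max(len(w) for w in words)
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in words:
                break
        else:
            piece = text[i]  # no dictionary word starts at this position
        tokens.append(piece)
        i += len(piece)
    return tokens


def misspelled(text, words=WORDS):
    """Return the tokens a per-word spelling check would flag."""
    return [t for t in segment(text, words) if t not in words]


# Two words written without a zero-width space between them.
print(segment(FIRST + SECOND))
print(misspelled(FIRST + SECOND))
```

Without the segmentation step, the whole unspaced run would reach the
spelling checker as a single "word", which matches what I described seeing
in LibreOffice 3.6.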

> There was something similar done in the past IIRC to
> pass around soft-page-break information so that export filters could
> know where the layout last put the page breaks. I forget the details of
> that though.

This would be a very useful feature for Cambodians (and, I would assume, for
Thai users as well, although more programs already support word breaking for
Thai). Would it be best to do this with an extension rather than in
LibreOffice core?
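
As a rough sketch of the "real" breaks idea, the step I have in mind is
simply: take the break offsets that a word breaker reports and write a
zero-width space (U+200B) into the text at each one before saving. The
Python below assumes the offsets are already available as plain string
indices; the function name and the way the offsets are obtained are
hypothetical, and this says nothing about how the LibreOffice export
filters would actually get that information.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE


def insert_zwsp(text, boundaries):
    """Insert a ZWSP at each interior boundary offset (string indices).

    Offsets at the very start or end of the text are ignored, and so are
    offsets next to an existing ZWSP, so already-marked text is not doubled.
    """
    interior = sorted(
        b for b in set(boundaries)
        if 0 < b < len(text) and text[b - 1] != ZWSP and text[b] != ZWSP
    )
    parts, prev = [], 0
    for b in interior:
        parts.append(text[prev:b])
        prev = b
    parts.append(text[prev:])
    return ZWSP.join(parts)


# Hypothetical offsets for a four-character string with one break in the middle.
marked = insert_zwsp("abcd", [0, 2, 4])
print(marked.encode("unicode_escape").decode("ascii"))
```

Other programs that do not ship a dictionary for the script would then at
least be able to break lines at the inserted characters.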

Thanks again for your time,
Nathan


On Thu, Jul 12, 2012 at 11:10 PM, Caolán McNamara [via Document Foundation
Mail Archive] <ml-node+s969070n3995127h32 at n3.nabble.com> wrote:

> On Sun, 2012-07-08 at 08:08 -0700, sungkhum wrote:
> > I have two questions: is there a way to have the LibreOffice spelling
> > checker (Hunspell) also recognize word-breaks using the ICU break
> > iterator for Khmer so that Cambodians no longer have to add zero-width
> > spaces manually (as it seems to work for Thai now?)? Currently, lines
> > without zero-width spaces are seen as one long word to the spelling
> > checker in LibreOffice 3.6. But since the line-breaking is working, it
> > would seem breaking words for the spelling checker should also be able
> > to work. Should I submit a bug? How should I proceed?
>
> Sounds like a bug really. I mean, hunspell itself generally doesn't do
> the parsing of text into words, the app gives each word to hunspell. And
> we're *supposed* to be using the ICU breakiterator to split words. I
> suspect it's a similar bug to this original one.
>
> So... sure, file a bug, assign it to me ([hidden email]) and paste a
> short two-word example text into the bug and indicate where the word
> break should be and I'll add a regression test for it and see if it's a
> trivial fix for Khmer too now that we're using the latest-and-greatest
> ICU.
>
> > Also, since many other programs do not incorporate ICU's code, is there
> > a way to make the line breaks "real" when a document is saved in another
> > format (such as .doc)? And by "real" I mean that a zero-width space
> > is actually added to the text where a line-break should be.
>
> That should at least be theoretically possible, albeit a bit tricky
> seeing as the layout code is the bit that knows the width of the page
> and does the line breaking, while the export filters don't get to know
> that information. There was something similar done in the past IIRC to
> pass around soft-page-break information so that export filters could
> know where the layout last put the page breaks. I forget the details of
> that though.
>
> C.

