[Libreoffice-bugs] [Bug 113298] RTL: Automatic language detection based on keyboard layout

bugzilla-daemon at bugs.documentfoundation.org
Mon Oct 23 18:37:33 UTC 2017


https://bugs.documentfoundation.org/show_bug.cgi?id=113298

Yousuf Philips (jay) <philipz85 at hotmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kaplanlior at gmail.com

--- Comment #4 from Yousuf Philips (jay) <philipz85 at hotmail.com> ---
(In reply to Caolán McNamara from comment #3)
> I imagine using libexttextcat would just introduce a pile of "my language
> was guessed wrong" bugs. Especially for short sequences of text which won't
> be long enough for the statistical efforts of libexttextcat to guess it
> right.

Have you seen this library? https://github.com/CLD2Owners/cld2

> Unicode char range folds this bunch of languages
> https://en.wikipedia.org/wiki/Arabic_script#Languages_currently_written_with_the_Arabic_alphabet
> to Arabic, while Hebrew script munges Yiddish and Hebrew together, which is
> maybe acceptable loss and probably happens on Windows already.

For Arabic-script languages, LO only lists Persian, Uyghur, Punjabi and Urdu
under CTL, and most of these languages have Unicode characters that are
unique to them.

https://en.wikipedia.org/wiki/Persian_language#Additions
https://en.wikipedia.org/wiki/Urdu_alphabet#Differences_from_Persian_alphabet
https://en.wikipedia.org/wiki/Shahmukhi_alphabet
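
A rough sketch of what character-based guessing could look like, in Python
rather than LO code. The function name and the hint tables are hypothetical
and only illustrative; the code points would need to be checked against the
pages linked above. The Persian additions are also used in Urdu and Uyghur,
and the Urdu letters are shared with Shahmukhi, so the more distinctive sets
are checked first and plain Arabic is the fallback.

```python
# Hypothetical hint tables: (language tag, characters suggesting that language).
# Order matters: the most distinctive sets come first, because later sets
# contain letters that earlier languages also use.
HINTS = [
    ("pa", {"\u0768"}),                                          # noon with small tah
    ("ug", {"\u06C6", "\u06C8", "\u06CB", "\u06D0"}),            # Uyghur vowels and ve
    ("ur", {"\u0679", "\u0688", "\u0691", "\u06BA", "\u06D2"}),  # tteh, ddal, rreh, noon ghunna, yeh barree
    ("fa", {"\u067E", "\u0686", "\u0698", "\u06AF"}),            # peh, tcheh, jeh, gaf
]


def guess_arabic_script_language(text, default="ar"):
    """Return a language tag guessed from hint characters in `text`.

    The caller is assumed to have already decided that `text` is in the
    Arabic script; if no hint character is found, `default` is returned.
    """
    chars = set(text)
    for lang, hints in HINTS:
        if chars & hints:
            return lang
    return default
```

A single typed character is enough to change the guess, which suits short
input, but it also means text with none of these letters stays at the
default.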

@Lior: what is your take on Hebrew detection?

> There are some hints in bug 108151 about some available fields in the gtk
> integration with the IBUS IM that might be of some use to pick an acceptable
> value to set for the language.

Guessing based on locale is definitely helpful to some degree if the user
lives in a country where a particular language is widely used.
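
A minimal sketch of a locale-based fallback, again in Python and not LO
code. It assumes the script of the typed text is already known (the
"Arab"/"Hebr" codes, both tables and the function name are hypothetical) and
uses only the language part of a POSIX-style locale name such as
"fa_IR.UTF-8"; mapping the country part to a language would need a separate
table.

```python
# Hypothetical tables: languages assumed to be written in each script,
# and the language to fall back to when the locale gives no hint.
SCRIPT_LANGS = {
    "Arab": {"ar", "fa", "ur", "ug", "pa"},
    "Hebr": {"he", "yi"},
}
SCRIPT_DEFAULT = {"Arab": "ar", "Hebr": "he"}


def language_from_locale(locale_name, script):
    """Pick a language for `script` using the language part of `locale_name`.

    Returns the locale's language if it is listed for the script, otherwise
    the script's default, or None for a script not in the tables.
    """
    # Strip encoding (".UTF-8") and modifier ("@euro"), then take the language.
    base = locale_name.split(".")[0].split("@")[0].replace("-", "_")
    lang = base.split("_")[0].lower()
    if lang in SCRIPT_LANGS.get(script, set()):
        return lang
    return SCRIPT_DEFAULT.get(script)
```

For example, a user with an fa_IR locale typing Arabic-script text would get
Persian, while a user with an en_US locale would get the Arabic default.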
