[Libreoffice-bugs] [Bug 131487] Words whose characters span multiple languages should not undergo spell checking

bugzilla-daemon at bugs.documentfoundation.org
Thu Jun 25 14:42:45 UTC 2020


https://bugs.documentfoundation.org/show_bug.cgi?id=131487

--- Comment #7 from Mihkel Tõnnov <mihhkel at gmail.com> ---
(In reply to Mihkel Tõnnov from comment #6)

Ugh, I messed up examples in my first paragraph while moving things around
there. It should read like this:

In Estonian, foreign words should be written in italics, and if there's a case
ending, it has to be separated by an apostrophe, e.g. "<i>status
quo</i>'ni". Case endings by themselves are not valid words, so treating the
apostrophe as a word separator definitely wouldn't help; the apostrophe is also
used for other purposes, such as indicating the omission of character(s) from a
word (e.g. to imitate everyday speech).

Somewhat similarly, hyphens can be used to separate the foreign part and the
native part in compound words, e.g. "<i>flamenco</i>-tantsija",
"tele-<i>show</i>". (Note that a hyphen separates complete words, while an
apostrophe is used with case endings.)

I'm not sure it would be beneficial to completely ignore such words in
spellcheck, as any misspellings in them would then pass unflagged. At least
some foreign words are rather prone to misspellings: take the Italian coffee
terms that the rest of the world often struggles to write correctly :)

Could the request here maybe be re-purposed and implemented as follows?

1) If a word (as currently detected by LibO) contains characters in multiple
languages, check if there is some punctuation mark separating the different
language parts.

2a) If there's an apostrophe (' or ’ - but probably not ‘): check which
language the apostrophe belongs to, and as far as spellcheck is concerned,
separate the "word" at the border of languages, keeping the apostrophe together
with the preceding or following characters, as appropriate.

2b) If there's a hyphen: ignore the hyphen and spellcheck the word parts as if
separated by space.

2c) If there's no separating character, underline the whole "word" as
misspelled.
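
To make steps 1 to 2c concrete, here is a rough Python sketch. It is only an
illustration under my own assumptions, not LibreOffice code: the function name
split_for_spellcheck, the (text, language) run representation and the language
tags are all hypothetical, and only the plain ASCII hyphen is treated as a
hyphen. The apostrophe simply stays in whichever run (language) it was typed in.

```python
from typing import List, Optional, Tuple

Run = Tuple[str, str]  # (text, language tag); hypothetical representation

APOSTROPHES = ("'", "\u2019")  # ' and ’ (but not ‘)
HYPHENS = ("-",)               # assumption: only the ASCII hyphen


def split_for_spellcheck(runs: List[Run]) -> Optional[List[Run]]:
    """Return the pieces to spell-check, or None to flag the whole word (2c)."""
    # Merge adjacent runs that share a language, dropping empty runs.
    merged: List[Run] = []
    for text, lang in runs:
        if not text:
            continue
        if merged and merged[-1][1] == lang:
            merged[-1] = (merged[-1][0] + text, lang)
        else:
            merged.append((text, lang))

    # Step 1: a single-language word needs no special handling.
    if len(merged) <= 1:
        return merged

    pieces = [list(run) for run in merged]
    for i in range(len(merged) - 1):
        left, right = merged[i][0], merged[i + 1][0]
        if left.endswith(APOSTROPHES) or right.startswith(APOSTROPHES):
            # 2a: split at the language border; the apostrophe stays
            # with the run it already belongs to.
            continue
        if left.endswith(HYPHENS):
            # 2b: drop the hyphen and check the parts separately.
            pieces[i][0] = pieces[i][0][:-1]
        elif right.startswith(HYPHENS):
            pieces[i + 1][0] = pieces[i + 1][0][1:]
        else:
            # 2c: no separating character between the languages.
            return None
    return [(text, lang) for text, lang in pieces if text]
```

For example, split_for_spellcheck([("flamenco", "es"), ("-tantsija", "et")])
yields the two parts "flamenco" and "tantsija" with their own languages, while
a word that switches language with no separator yields None. What the checker
then does with a bare piece like "'ni" is left open here.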

@Sergio: you said that "quell'" alone is not an Italian word - how is it
currently handled by spellcheck, if used before an Italian word? Would 2a as
described above work for Italian?

I'm also not sure whether 2c would be OK for all languages - does anyone have
counterexamples?
