[Libreoffice-bugs] [Bug 91192] AutoCorrect: Writer not recognizing a URL's trailing carat, hash mark, question mark, backslash, or pipe
bugzilla-daemon at bugs.documentfoundation.org
Thu Feb 4 17:52:08 UTC 2021
https://bugs.documentfoundation.org/show_bug.cgi?id=91192
Stephan Bergmann <sbergman at redhat.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |sbergman at redhat.com
--- Comment #17 from Stephan Bergmann <sbergman at redhat.com> ---
The code that guesses which part of a larger text shall be auto-detected as a
URI is URIHelper::FindFirstURLInText (svl/source/misc/urihelper.cxx, containing
detailed documentation). Of necessity, it needs to apply some heuristics, and,
also of necessity, the algorithm's outcome will not necessarily match any given
user's exact expectations. That said:
(In reply to sdc.blanco from comment #12)
> Asking for UXEval: Two questions.
>
> 1. Is it considered a "bug" that a potential URL that ends with # (or ?) does
> not include the # (or ?) in the URL recognition?
>
> (but, as noted, no problem if text follows # or ? )
Especially with "?" (as with, e.g., "," and "."), the heuristics
conservatively avoid including trailing punctuation, on the assumption that
it was not meant to be part of the URI.
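
To illustrate the idea of conservatively dropping trailing punctuation, here is
a minimal Python sketch. It is not the logic of URIHelper::FindFirstURLInText;
the function name and the set of punctuation characters are assumptions made
up for this example.

```python
# Hypothetical illustration only; not LibreOffice's actual heuristics.
# Assumed set of characters treated as sentence punctuation when trailing.
TRAILING_PUNCTUATION = "?,."


def trim_trailing_punctuation(candidate: str) -> str:
    """Drop assumed punctuation characters from the end of a URL candidate.

    Characters in the middle of the candidate are left untouched, so a "?"
    followed by more text (e.g. a query string) is kept.
    """
    return candidate.rstrip(TRAILING_PUNCTUATION)


if __name__ == "__main__":
    # A trailing "?" is assumed to be sentence punctuation and is dropped.
    print(trim_trailing_punctuation("http://example.com/page?"))
    # A "?" followed by text is kept, since only the end is trimmed.
    print(trim_trailing_punctuation("http://example.com/page?x=1"))
```

This matches the observation in the quoted question that there is no problem
when text follows the "?".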
> 2. Is it a problem that the three characters: ^ | \ are not recognized as
> part of a URL (and URL recognition stops with these characters)?
>
> Relevant to note that these three characters are considered "unsafe" and
> should have percent-encoding ( https://www.ietf.org/rfc/rfc1738.txt )
That's not a "should" but a "must". None of those three characters can appear
in a URI as-is; they always need to be percent-encoded. In general, the
heuristics do not assume that a character that cannot appear in a URI would form
part of a to-be-detected URI.
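
As a small illustration of what percent-encoding those three characters looks
like, here is a sketch using Python's standard urllib.parse.quote. The helper
name and the sample input are assumptions for this example and have nothing to
do with the LibreOffice code.

```python
from urllib.parse import quote


def encode_segment(segment: str) -> str:
    """Percent-encode a single URI segment (hypothetical helper).

    safe="" means no extra characters are exempted, so "^", "|" and "\\"
    (and also "/") are all replaced by their %XX escapes.
    """
    return quote(segment, safe="")


if __name__ == "__main__":
    # "^" is 0x5E, "|" is 0x7C, "\" is 0x5C in ASCII.
    print(encode_segment("a^b|c\\d"))  # a%5Eb%7Cc%5Cd
```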
--
You are receiving this mail because:
You are the assignee for the bug.