[Libreoffice-bugs] [Bug 140708] New: The REGEX function accepts all (ismx) but one (w) flags and only directly in the regular expression and does not allow all matches to be found at once
bugzilla-daemon at bugs.documentfoundation.org
Sun Feb 28 08:11:30 UTC 2021
https://bugs.documentfoundation.org/show_bug.cgi?id=140708
Bug ID: 140708
Summary: The REGEX function accepts all (ismx) but one (w)
flags and only directly in the regular expression and
does not allow all matches to be found at once
Product: LibreOffice
Version: 7.0.4.2 release
Hardware: All
OS: All
Status: UNCONFIRMED
Severity: enhancement
Priority: medium
Component: Calc
Assignee: libreoffice-bugs at lists.freedesktop.org
Reporter: eeigor at inbox.ru
Description:
Current syntax:
REGEX(Text;Expression[;[Replacement][;Flags|Occurrence]])
Flag settings: "g" only (means "Global")
Desired syntax:
REGEX(Text;Expression[;[Replacement][;Flags][;Occurrence]])
Flag settings: "g" + "ismxw"
Flag Settings - Description
i - Ignore case (case insensitive)
s - Make . match newline too (single-line, dot all)
m - Make ^ and $ match at the start and end of each line (multi-line)
x - Allow whitespace and comments in the regex (extended)
w - Make {\w, \W, \b, \B} follow Unicode rules
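
To make the desired signature concrete, here is a minimal sketch in Python. It is only a model of the requested behavior, not LibreOffice code: the function name regex, its keyword parameters, and the flag table are hypothetical, and Python's re module stands in for the ICU engine. Python's re has no equivalent of "w", so the sketch rejects that flag, and replacement strings use Python's \1 syntax rather than $1.

```python
import re

# Map the single-letter flags from the report onto Python's re flags.
# Assumption: Python's re stands in for ICU here; it has no "w" flag.
_FLAG_MAP = {
    "i": re.IGNORECASE,
    "s": re.DOTALL,
    "m": re.MULTILINE,
    "x": re.VERBOSE,
}


def regex(text, expression, replacement=None, flags="", occurrence=None):
    """Hypothetical model of
    REGEX(Text;Expression[;[Replacement][;Flags][;Occurrence]])."""
    re_flags = 0
    for letter in flags:
        if letter == "g":
            continue  # "g" selects all occurrences; it is not a pattern flag
        if letter not in _FLAG_MAP:
            raise ValueError(f"unsupported flag in this sketch: {letter!r}")
        re_flags |= _FLAG_MAP[letter]
    pattern = re.compile(expression, re_flags)
    want_all = "g" in flags
    n = occurrence or 1  # default to the first occurrence

    if replacement is None:
        matches = [m.group(0) for m in pattern.finditer(text)]
        if want_all:
            return matches  # all occurrences at once, as requested
        return matches[n - 1] if 0 < n <= len(matches) else None

    if want_all:
        return pattern.sub(replacement, text)

    # Replace only the n-th occurrence and leave the others untouched.
    seen = 0

    def replace_nth(m):
        nonlocal seen
        seen += 1
        return m.expand(replacement) if seen == n else m.group(0)

    return pattern.sub(replace_nth, text)


print(regex("Foo foo FOO", "foo", flags="gi"))            # ['Foo', 'foo', 'FOO']
print(regex("Foo foo FOO", "foo", flags="i", occurrence=2))  # foo
```

With Flags and Occurrence as separate parameters, "i" and an occurrence number can be combined in one call, which the current "Flags|Occurrence" slot does not allow.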
Steps to Reproduce:
See "Actual Results".
Actual Results:
1. Only a single occurrence is extracted: either the first one or the one given
by Occurrence. If the Replacement parameter is not specified, the "g" flag is
ignored.
2. The flags i, s, m, and x all work if you insert them directly into the
regular expression, as "(?ismx)…" or "(?ismx:…)", when the corresponding option
is enabled. The one exception is "w".
3. Flag "w". E.g.:
=REGEX("The quick (""brown"") fox can’t jump 32.3 feet, right?";"(?w)\b\w+\b";;5)
returns "jump", not "can’t". Why?
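
For comparison, here is a small Python sketch of the same sentence. It does not reproduce Calc or ICU behavior: Python's re has no "(?w)" flag, and the APPROX_WORD pattern is an illustrative assumption that only roughly approximates the word rules of UAX #29 (it lets one apostrophe or full stop sit between word characters). It shows what the plain \b\w+\b pattern yields as the fifth match and what the expected fifth word would be.

```python
import re

# The sentence from the report; \u2019 is the typographic apostrophe in "can’t".
TEXT = 'The quick ("brown") fox can\u2019t jump 32.3 feet, right?'

# Plain \b\w+\b: the apostrophe and the decimal point both end a word.
plain = re.findall(r"\b\w+\b", TEXT)

# Rough approximation of the expected words: allow a single apostrophe or
# full stop between two word characters (NOT the full UAX #29 algorithm).
APPROX_WORD = r"\w+(?:['\u2019.]\w+)*"
approx = re.findall(APPROX_WORD, TEXT)

print(plain[4])   # can
print(approx[4])  # can’t
```

In this sketch the plain pattern splits "can’t" into "can" and "t" and "32.3" into "32" and "3", while the approximate pattern keeps both as single words.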
Expected Results:
1. When the "g" flag is set, all occurrences should be returned, even if no
replacement is given. "Flags" and "Occurrence" should be separate parameters.
2. The Flags parameter should accept "g" together with "ismxw".
3. With the "w" flag, word boundaries in the example above should be recognized
according to the specification
(https://www.unicode.org/reports/tr29/tr29-33.html#Word_Boundaries).
Reproducible: Always
User Profile Reset: No
Additional Info:
The purpose of the "w" flag remains unclear. For example, words containing
accented characters are already recognized with the "w" flag disabled (?-w),
while the words in the example above are not recognized at all.