[Libreoffice-bugs] [Bug 140708] The REGEX function accepts all (ismx) but one (w) flags and only directly in the regular expression and does not allow all matches to be found at once

bugzilla-daemon at bugs.documentfoundation.org bugzilla-daemon at bugs.documentfoundation.org
Mon Apr 12 13:31:47 UTC 2021


https://bugs.documentfoundation.org/show_bug.cgi?id=140708

Eike Rathke <erack at redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |erack at redhat.com

--- Comment #3 from Eike Rathke <erack at redhat.com> ---
You are confusing the Flags parameter with pattern option flags. The ixsmw
option flags can always be given in the pattern (as you did with (?-w) in
your example), and multiple options can appear at different places, so it does
not make sense to repeat them as function-wide flags. The Flags parameter
currently implements only the "g" (global) argument, as known from sed, for
replacements. Maybe Flags should be renamed to avoid this confusion. It was
never meant for passing pattern option flags, nor for having "g" act on
extraction.
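
To illustrate the distinction, here is a minimal sketch in Python. It is not
how LibreOffice implements REGEX: Python's re module stands in for ICU, and
regex_replace is a hypothetical helper whose flags argument only mimics the
"g" behaviour described above. Option flags live inside the pattern itself
(Python has no equivalent of the ICU w option, so only i is shown).

```python
import re

def regex_replace(text, pattern, replacement, flags=""):
    """Hypothetical stand-in for REGEX(Text; Expression; Replacement; Flags).

    Only "g" is honoured in flags, mirroring the comment above. Pattern
    options such as case-insensitivity are written inside the pattern.
    """
    # For re.sub, count=0 means "replace every match"; count=1 means
    # "replace only the first match".
    count = 0 if "g" in flags else 1
    return re.sub(pattern, replacement, text, count=count)

first_only = regex_replace("barbaz", "a", "X")        # no "g": first match only
all_matches = regex_replace("barbaz", "a", "X", "g")  # "g": every match
# Case-insensitivity is a pattern option, given inline as a scoped group.
inline_option = regex_replace("BARbaz", "(?i:a)", "X", "g")
```

The point of the sketch is that the "g" argument changes how many replacements
are made, while options such as i change how the pattern itself matches, so
the two belong in different places.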

I'd find it questionable for REGEX("string";".";;"g") to extract every single
character of "string", or for the result of REGEX("barbaz";"a";;"g") to be "aa".
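
As a rough sketch of why such results look odd, the following Python snippet
contrasts extracting the first match with a hypothetical "extract all and
concatenate" behaviour. Both helper names are made up for illustration, and
Python's re module is used in place of ICU; REGEX itself does not behave like
extract_all_joined.

```python
import re

def extract_first(text, pattern):
    """Return the first match, or None if the pattern does not match."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

def extract_all_joined(text, pattern):
    """Hypothetical 'g on extraction': concatenate every match."""
    # With no capture groups, re.findall returns the full matches.
    return "".join(re.findall(pattern, text))

first_char = extract_first("string", ".")
every_char = extract_all_joined("string", ".")  # just reproduces the input
joined_a = extract_all_joined("barbaz", "a")
```

Concatenating all matches into one string loses the boundaries between them,
which is one reason a global flag maps poorly onto a function returning a
single value.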

For your question about word boundaries I can only refer you to ICU and its
documentation at http://userguide.icu-project.org/strings/regexp or the newer
https://unicode-org.github.io/icu/userguide/strings/regexp.html
If anything is unclear, please ask them.
The remark "accents have shifted when pasting text" indicates that you used
combining accents instead of precomposed single-character Unicode letters (and
indeed that's what one gets when copying the sample string from the comment).
That may be related and might explain why in your example the second
occurrence of a word is the one letter. Again, to be sure, I'd suggest you ask
in an ICU or Unicode forum or mailing list.
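
The difference between combining accents and precomposed letters can be
checked outside of LibreOffice. The sketch below uses Python's standard
unicodedata module; the sample character é is an assumption chosen for
illustration, not the string from the bug report, and has_combining_marks is a
hypothetical helper. It only shows how the two forms differ, not how ICU
treats them at word boundaries.

```python
import unicodedata

precomposed = "\u00e9"   # é as one code point (LATIN SMALL LETTER E WITH ACUTE)
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

def has_combining_marks(text):
    """Return True if any code point in text is a combining character."""
    # unicodedata.combining() returns a non-zero class for combining marks.
    return any(unicodedata.combining(ch) != 0 for ch in text)

# Both render the same, but they are different code point sequences.
same_string = precomposed == decomposed
lengths = (len(precomposed), len(decomposed))

# NFC normalization composes the base letter and the accent where possible.
composed_again = unicodedata.normalize("NFC", decomposed)
```

Normalizing pasted text to NFC before testing a pattern is one way to rule out
this difference as the cause of an unexpected match.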
