[Libreoffice-bugs] [Bug 113977] Implement REGEXEXTRACT function

bugzilla-daemon at bugs.documentfoundation.org
Sat Nov 25 20:05:56 UTC 2017


https://bugs.documentfoundation.org/show_bug.cgi?id=113977

--- Comment #4 from Dan Dascalescu <ddascalescu+freedesktop at gmail.com> ---
> At first sight it looks like a proper enhancement, one function that
> with regex can extract the most wonderful parts out of a text string.

That's what I was going for, thank you.

> However I see some aspects that reduces its usefulness:
> -regular expressions are not everybody's cup of tea.
> This makes the function only useful to a very small group of users;

A lot of users are familiar with wildcards, e.g. *.jpg. Many regexp tasks are
solvable with just "foo.*bar". Anyway, by the logic "a function only serves
some users therefore we shouldn't have it", none of the trigonometry functions
should be in LibreCalc. For example, I've never used "arctangent" in my life,
or "BAHTTEXT". I wish I had numbers to show that far more Google Docs users use
REGEXEXTRACT than BAHTTEXT.

> -Calc has a setting (in preferences) in which wild cards or regular 
> expressions can be selected to be used in function arguments. This request is 
> limited to regular expressions only and thereby somewhat conflicts with the 
> wild card/regex setting;

It "conflicts" only as much as the regex pattern for existing functions that
use regexps, like SEARCH, conflicts. We can solve this problem the same way we
solved the SEARCH problem.

> -As the function is neither in ODFF nor in Excel, it will be incompatible
>  with other applications where LibreOffice strives for optimal 
> interoperability with other applications;

Do we limit formatting because it's not compatible with CSV?

1. If a document author wants to provide backwards compatibility with
applications that don't support a given function, they won't use that function. 

2. What other applications exactly are we talking about? Of those that lack
REGEXEXTRACT, only Excel has significant market share; Google Sheets already
supports it.

3. Innovation requires breaking compatibility. By providing a new function, we
enable users to force the developers of their applications to step up and
provide the same function. This very process is at work now as I'm advocating
for supporting a function from Google Sheets...

4. ...which is reason #4 why we should implement that function - so Google
Sheets documents that use it could be opened in LibreCalc.


> -Calculations (including those on text strings) tend to look very complex 
> when the calculation goes further than just one simple calculation. This can
> be avoided by using more than one step

I prefer not to litter my spreadsheet with cells that contain intermediate
values. In my example with web service-based currency conversion (BTW a very
common task that we don't have a good solution for - see bug 113974), the
string "EUR:1.2,JPY:0.02,AUD:0.9,..." would require two intermediate cells per
currency. That would trade formula complexity (usually invisible and
forgotten about once you've figured out the formula) for the visible complexity
of the intermediate cells (which, yes, could be hidden as well). Anyway, I'd
love to have one clean function that can extract exactly what I need into the
desired cell.
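
As a rough illustration of the single-step extraction described above, here is
a minimal Python sketch. The function name extract_rate and the sample string
are hypothetical, made up for this example; it only shows the kind of work one
regex-based function could do in place of several intermediate cells, and says
nothing about how Calc or Sheets would implement it.

```python
import re

def extract_rate(rates, currency):
    """Return the rate for `currency` from a "CODE:rate,CODE:rate" string, or None."""
    # Anchor the currency code at the start of the string or right after a comma,
    # then capture the number that follows the colon.
    match = re.search(r"(?:^|,)" + re.escape(currency) + r":([0-9.]+)", rates)
    return float(match.group(1)) if match else None

sample = "EUR:1.2,JPY:0.02,AUD:0.9"  # hypothetical sample data
print(extract_rate(sample, "JPY"))
```

One pattern with one capture group replaces the chain of MID and SEARCH calls
that would otherwise be needed for each currency.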

> Creating new functions that combine other functions will only be useful if 
> there is a common need for them.

I think Google Sheets shows there's a need for REGEXEXTRACT...

> In programming it is the same: with some standard functions an almost 
> infinite amount of functionalities can be programmed; only functionality that 
> is widely used is put into a separate function.

... and so do all modern programming languages:

"match" in JavaScript
"search" in Python
"matcher" in Java
etc.
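
For instance, Python's re.search is enough to sketch what a REGEXEXTRACT-style
function might do. The helper name regex_extract is hypothetical, and the
semantics are an assumption for illustration only: return the first capture
group if the pattern has one, otherwise the whole first match, and None when
nothing matches.

```python
import re

def regex_extract(text, pattern):
    """Return the first match of `pattern` in `text`, or None if there is none."""
    match = re.search(pattern, text)
    if match is None:
        return None
    # Assumed semantics: prefer the first capture group when the pattern defines one.
    if match.re.groups >= 1:
        return match.group(1)
    return match.group(0)

print(regex_extract("foo123bar", r"\d+"))
print(regex_extract("EUR:1.2", r"EUR:([0-9.]+)"))
```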

> -A macro can provide the same functionality and be exactly suited to the 
> user's need.

I would say a lot more users are comfortable using functions already provided
by Calc/Excel/Sheets, than creating macros.


> The enhancement request is clear, so I set the status to new.
> @Dan : should you wish to do this yourself, LibreOffice has a mentor for 
> those new to developing for LibreOffice. I can help you with some code 
> pointers as well.

Thank you. I hope to have time to tackle this next year, though for now I've
just been using a long formula, so my motivation has decreased somewhat.

> IMHO A7 should be A1.

Correct :) Which proves how brittle that sort of complex MID+SEARCH expression
is.
