[Bug 165931] Regular expressions must be able to match non-break line endings
bugzilla-daemon at bugs.documentfoundation.org
Sat Mar 29 09:59:23 UTC 2025
https://bugs.documentfoundation.org/show_bug.cgi?id=165931
--- Comment #14 from Eyal Rozenberg <eyalroz1 at gmx.com> ---
(In reply to László Németh from comment #12)
> I just realized that the layout regex with an
> extended syntax seems highly useful for my upcoming typography developments:
> adding 4 different regex layout boundary marks for 1) line end, 2) column
> end 3) page end and 4) spread end, and a 5th mark for the hyphenation, e.g.
> (maybe used by ICU regex) \L, \C, \P, \S and \H. Their usage in the regex
> pattern in Find&Replace enables the layout text export automatically instead
> of the recent one (document model).
That's a pleasing compromise between what we have now and the behemoth project
of actual document-structural search. It's also generally in line with the
approach of MSO.
An important point to note is the distinction between breaks-of-things and
ends-of-things. I focused on paragraphs, but if you're looking at pages or
columns - sometimes you flow into the next column or page, sometimes you
manually break.
Also, in regular expressions, there are marks for the beginning and the end of
the unit of content you're allowed to search, which right now is a paragraph;
but there isn't a generic boundary mark for those (unlike words, where we do
have word boundaries).
I liked your examples... although there is a nitpick w.r.t. the use of \n; see
bug 108256; for now it matches only line breaks, not ends-of-paragraphs or
paragraph breaks.
Another idea in the same vein as your examples: Search for mid-paragraph lines
with too few characters before the non-break line end, i.e. too much
justification space:
(^|\L).{,40}\L
(using your syntax for a line break.)
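
To play with this idea outside LibreOffice, here is a rough Python sketch. It
is only an emulation under stated assumptions: \L is a proposed mark, not
something any engine supports today, so a private-use character (LINE_END)
stands in for it, and textwrap fakes the layout. The names layout_text and
short_lines are made up for illustration.

```python
import re
import textwrap

# Hypothetical stand-in for the proposed \L mark (non-break line end).
LINE_END = "\uE000"


def layout_text(paragraph: str, width: int) -> str:
    """Fake a 'layout text export': wrap one paragraph and mark each
    soft line end with LINE_END. The last line gets no mark, because it
    ends with the paragraph rather than with a layout line end."""
    return LINE_END.join(textwrap.wrap(paragraph, width=width))


def short_lines(laid_out: str, max_chars: int = 40) -> list[str]:
    """Return mid-paragraph lines with at most max_chars characters.

    Same spirit as (^|\\L).{,40}\\L, with two adjustments:
    - the trailing mark is a lookahead, so two adjacent short lines are
      both reported instead of the first match consuming the mark that
      starts the second;
    - a negated class replaces '.', so a match cannot run across a mark.
    """
    pattern = re.compile(
        rf"(?:^|{LINE_END})([^{LINE_END}]{{0,{max_chars}}})(?={LINE_END})"
    )
    return [m.group(1) for m in pattern.finditer(laid_out)]


if __name__ == "__main__":
    text = (
        "A paragraph with an "
        "extraordinarilylongunbreakabletokenthatforcesanearlybreak "
        "right in the middle of otherwise ordinary running text."
    )
    for line in short_lines(layout_text(text, width=60), max_chars=40):
        print(repr(line))
```

The lookahead is the one design choice worth noting: if the closing mark is
consumed, as in the pattern above, every second line in a run of short lines
would be skipped.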
> It's not clear yet, this is the best way to solve my problems, but I like
> the freedom it would give us to analyse and adjust typography.
>
> "The best fixes are the ones you get for free by fixing something else :-)"
Reminds me of the adage about the best programs being the ones you write while
you're supposed to be working on something else :-)
All that being said, I suggest that before you tighten all the bolts on an
implementation here, and with respect to Mike's opinion, someone mention this
at the ESC and/or the weekly design meeting, so that people can get mad and
sound alarms if they want to...
And - if you start working on this, please also have a look at the other bugs
in a similar vein blocking the meta-bug, which might get automatically
resolved by your approach.