[Libreoffice-ux-advise] [Bug 156507] Ability to remove non-printing/"atypical" characters in a stretch of text

bugzilla-daemon at bugs.documentfoundation.org
Fri Aug 25 02:32:35 UTC 2023


https://bugs.documentfoundation.org/show_bug.cgi?id=156507

V Stuart Foote <vsfoote at libreoffice.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.documentfoundation.org/show_bug.cgi?id=32249

--- Comment #12 from V Stuart Foote <vsfoote at libreoffice.org> ---
(In reply to Eyal Rozenberg from comment #11)
> (In reply to ⁨خالد حسني⁩ from comment #10)
> 
> They understand enough to know they're getting some junk in addition to the
> text they want; keyboard traversal is weird, and the "junk" may be affecting
> rendering behavior.
> 
> Just because it would be easier if users simply did not do this does not
> mean that they don't (or that they shouldn't). You could argue that there is
> no reasonable way this could work; perhaps, but I very much doubt it. As long
> as we have not demonstrated that there can be no reasonable choice of
> characters to filter (say, a language-specific or locale-specific choice),
> we should entertain the possibility that one does exist. And assuming it
> exists, I believe we should offer it, to make this kind of work easier for
> users.

With <Ctrl>+F10 exposing non-printing characters (NPC), an <Alt>+X toggle shows
the Unicode code point for the specific NPC at the text cursor, and a second
<Alt>+X toggles it back. Once the code point is known, it is trivial to find and
delete (or edit) that character via the Find & Replace dialog. This is not
dynamic (code points have to be removed from the text one after another), but it
is already functional.
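
For illustration only, the same "identify the code point, then remove it"
workflow can be sketched outside LibreOffice in a few lines of Python. This is
not how Writer does it; the function names and the choice of Unicode category
"Cf" (format characters) as the definition of "non-printing" are assumptions
made for the example.

```python
import unicodedata

# Assumption for this sketch: "non-printing" means Unicode general category
# Cf (format characters such as zero-width space, soft hyphen, bidi marks).
NPC_CATEGORIES = {"Cf"}


def list_npc(text):
    """Return sorted 'U+XXXX' labels for the non-printing characters in text."""
    found = {ch for ch in text if unicodedata.category(ch) in NPC_CATEGORIES}
    return [f"U+{ord(ch):04X}" for ch in sorted(found)]


def strip_npc(text, codepoints=None):
    """Remove non-printing characters from text.

    If codepoints (a set of ints) is given, only those characters are removed,
    which mirrors deleting one known code point at a time. Otherwise every
    character in NPC_CATEGORIES is removed.
    """
    if codepoints is not None:
        return "".join(ch for ch in text if ord(ch) not in codepoints)
    return "".join(
        ch for ch in text if unicodedata.category(ch) not in NPC_CATEGORIES
    )


sample = "foo\u200bbar\u00adbaz"      # zero-width space and soft hyphen
print(list_npc(sample))               # which code points are present
print(strip_npc(sample, {0x200B}))    # remove only the zero-width space
print(strip_npc(sample))              # remove every Cf character
```

Passing an explicit set of code points keeps the caller in control of what is
removed, which matters because some format characters (U+200C ZERO WIDTH
NON-JOINER, for example) are meaningful in certain scripts and stripping them
changes how the text renders.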

Otherwise, I don't see a need to provide a new dialog as a core capability: it
is very much a corner case, and the dev effort is not justified.

More on point would be dev work to complete the residual NPC toggle exposure
from bug 58434.

And if the use case is only parsing PDF, the real ask is a new
paragraph-oriented "reflow" implementation for Writer, replacing Justin's text
box 'combine' based PDF import done for bug 32249.

Implementing a new PDF parser/reflow would also be the opportunity to
selectively clean up any NPC or malformed text runs from the PDF source.
Separately, completing the per-word /ActualText support of bug 117428 would
improve the quality of our PDF exports (optionally, as it has a real
performance and size cost).

Either dupe this to bug 32249 or, more simply, it becomes => WF

-- 
You are receiving this mail because:
You are on the CC list for the bug.

