[Libreoffice-bugs] [Bug 132493] Offer means to handle import of PDF containing both raster image pages and OCR text

bugzilla-daemon at bugs.documentfoundation.org bugzilla-daemon at bugs.documentfoundation.org
Tue Apr 28 22:26:06 UTC 2020


https://bugs.documentfoundation.org/show_bug.cgi?id=132493

V Stuart Foote <vstuart.foote at utsa.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Blocks|                            |99746
                 CC|                            |thb at libreoffice.org,
                   |                            |vmiklos at collabora.com,
                   |                            |vstuart.foote at utsa.edu
             Status|UNCONFIRMED                 |NEW
            Summary|error on opening PDF        |Offer means to handle
                   |                            |import of PDF containing
                   |                            |both raster image pages and
                   |                            |OCR text
           Severity|normal                      |enhancement

--- Comment #1 from V Stuart Foote <vstuart.foote at utsa.edu> ---
The PDF opens fine. The issue is that it was prepared with OCR of the page
images, so each page carries both the raster image and the OCR text runs.

You can remove the OCR by opening the PDF in your viewer of choice and then
printing it back to PDF. Only the page images will be output, with none of
the OCR text runs.

Alternatively, if you prefer or need the OCR results, you can keep them by
using LibreOffice Draw. It is a manual process: on each page of the imported
PDF, select the source page's image and delete it, leaving the OCR text runs
behind.

Still, it would be convenient if the PDF import filter offered an option to
strip out either the image or the OCR text when both are present.
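
As a rough illustration of what such an option would have to do (a sketch
only, not how the LibreOffice import filter works): in a decoded page content
stream, text runs sit between the BT and ET operators, and page images are
painted with the Do operator. The function names and the sample stream below
are hypothetical. The sketch assumes an uncompressed stream, ignores that Do
also paints form XObjects, and uses a naive regex where a real filter would
tokenize the stream. OCR text is commonly written in invisible render mode
(3 Tr), but this sketch simply drops every text object.

```python
import re

# Text objects in a PDF content stream are bracketed by the BT and ET operators.
TEXT_OBJECT = re.compile(rb"\bBT\b.*?\bET\b", re.DOTALL)
# XObjects (page images among them) are painted with "/Name Do".
XOBJECT_PAINT = re.compile(rb"/[^\s/\[\]()<>{}%]+\s+Do\b")


def strip_ocr_text(stream: bytes) -> bytes:
    """Hypothetical helper: drop every BT...ET text object, keep the image."""
    return TEXT_OBJECT.sub(b"", stream)


def strip_page_images(stream: bytes) -> bytes:
    """Hypothetical helper: drop every XObject paint, keep the text runs."""
    return XOBJECT_PAINT.sub(b"", stream)


# Made-up stream for one scanned page: a full-page image plus invisible text.
sample = (
    b"q 612 0 0 792 0 0 cm /Im0 Do Q\n"
    b"BT 3 Tr /F1 10 Tf 72 700 Td (Scanned words) Tj ET\n"
)

image_only = strip_ocr_text(sample)    # image paint kept, text object gone
text_only = strip_page_images(sample)  # text object kept, image paint gone
```

Either result would then be written back in place of the original content
stream, which matches the two manual workarounds described above.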


Referenced Bugs:

https://bugs.documentfoundation.org/show_bug.cgi?id=99746
[Bug 99746] [META] PDF import filter in Draw