[Poppler-bugs] [Bug 97399] New: No word splitting for pdfs produced by Chrome

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Thu Aug 18 17:34:18 UTC 2016


https://bugs.freedesktop.org/show_bug.cgi?id=97399

            Bug ID: 97399
           Summary: No word splitting for pdfs produced by Chrome
           Product: poppler
           Version: unspecified
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: utils
          Assignee: poppler-bugs at lists.freedesktop.org
          Reporter: buktop999 at gmail.com

Created attachment 125884
  --> https://bugs.freedesktop.org/attachment.cgi?id=125884&action=edit
pdf produced by Chrome

When running "pdftotext -bbox" on PDFs produced by Chrome's page print, the
sentences are not split into words. In pdftotext's output, 0xA0 characters
appear between words instead of spaces (0x20). That might be the reason the
sentences are not being split.
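
To illustrate the suspected cause, here is a minimal sketch in Python. It is
not poppler's code. It assumes the 0xA0 symbol is U+00A0 (NO-BREAK SPACE), and
the helper names split_on_ascii_space and normalize_nbsp are made up for this
example. It shows that a splitter which only treats 0x20 as a word boundary
returns the whole sentence as one word, while replacing 0xA0 with 0x20 first
restores the split.

```python
# Assumption: the 0xA0 symbol in the report is U+00A0 NO-BREAK SPACE.
NBSP = "\u00a0"


def split_on_ascii_space(text):
    """Split only on 0x20 and drop empty pieces (hypothetical helper)."""
    return [word for word in text.split(" ") if word]


def normalize_nbsp(text):
    """Replace every U+00A0 with a plain space (hypothetical helper)."""
    return text.replace(NBSP, " ")


# A sentence whose words are separated by 0xA0 instead of 0x20.
line = "No" + NBSP + "word" + NBSP + "splitting"

# Without normalization the sentence stays in one piece.
print(len(split_on_ascii_space(line)))

# After normalization the words are separated as expected.
print(split_on_ascii_space(normalize_nbsp(line)))
```

Applying the same replacement to text extracted from pdftotext's output may
serve as a workaround until the splitting itself is fixed.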

-- 
You are receiving this mail because:
You are the assignee for the bug.