[Poppler-bugs] [Bug 92449] New: pdftohtml ignores png format option and extracts inverted jpg images
bugzilla-daemon at freedesktop.org
Tue Oct 13 09:07:25 PDT 2015
https://bugs.freedesktop.org/show_bug.cgi?id=92449
Bug ID: 92449
Summary: pdftohtml ignores png format option and extracts inverted jpg images
Product: poppler
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: major
Priority: medium
Component: pdftohtml
Assignee: poppler-bugs at lists.freedesktop.org
Reporter: kislicynda at gmail.com
Hi all,
I use pdftohtml 0.37.0 on Ubuntu.
When I run
pdftohtml -xml -fmt png
some images are extracted as .jpg (all with inverted colors) and some
as .png (all with normal colors).
When I run
pdfimages -all test.pdf test
I get the same result (inverted .jpg and normal .png images).
But when I run
pdfimages -png test.pdf test
I get only .png images, and all of them have normal colors.
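For reference, the two-command workaround described above could be scripted roughly as follows. This is only a sketch: the helper names `build_commands` and `run_workaround` and the `-img` suffix for the image prefix are made up for illustration, and it assumes both tools are on PATH. It does not map the PNG files back to the image entries in the XML.

```python
import shutil
import subprocess


def build_commands(pdf: str, prefix: str) -> list[list[str]]:
    """Return the two command lines used as a workaround (hypothetical helper)."""
    return [
        # XML output; the images it writes may still be .jpg despite -fmt png
        ["pdftohtml", "-xml", "-fmt", "png", pdf, prefix],
        # Separate pass that writes every image as PNG
        ["pdfimages", "-png", pdf, prefix + "-img"],
    ]


def run_workaround(pdf: str, prefix: str) -> None:
    """Run both commands in order, stopping at the first failure."""
    for cmd in build_commands(pdf, prefix):
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH")
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_workaround("test.pdf", "test")
```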
Questions:
1. Is it possible to convert a PDF to HTML/XML with the pdftohtml utility and
export all images as .png, or at least get non-inverted .jpg images?
Right now I have to run two different commands on the same PDF page to get a
correct result. It seems that the `-fmt` option doesn't work.
2. If `pdfimages -all test.pdf test` extracts the first image as .jpg and the
second as .png, does that mean the first image is actually stored in JPG
format in the PDF? And the same for the second image?
3. Is it OK that an image exported via `pdftohtml -xml` has one size
(width and height) in the file but another in the generated XML? For example,
the file has width=145, height=145, but the XML has width=105, height=105.
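The size mismatch in question 3 can be listed with a small check like the one below. It is a sketch under assumptions: it assumes the generated XML describes each image with an `<image>` element carrying `width`, `height` and `src` attributes, and it only reads PNG files. The function names `png_size` and `declared_sizes` are hypothetical.

```python
import struct
import xml.etree.ElementTree as ET

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) in pixels from the IHDR chunk of a PNG."""
    # Bytes 0-7: signature, 8-11: chunk length, 12-15: chunk type "IHDR",
    # 16-23: width and height as big-endian 32-bit integers.
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    return struct.unpack(">II", data[16:24])


def declared_sizes(xml_text: str) -> dict[str, tuple[int, int]]:
    """Map each image src to the (width, height) declared in the XML.

    Assumes <image width=".." height=".." src=".."/> elements.
    """
    root = ET.fromstring(xml_text)
    return {
        img.get("src"): (int(img.get("width")), int(img.get("height")))
        for img in root.iter("image")
    }
```

Comparing `png_size(open(src, "rb").read())` with the entry for the same `src` in `declared_sizes(...)` shows every image whose file size differs from the size written in the XML.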
PS: I can attach the PDF file if needed.
Thanks in advance,