<html>
<head>
<base href="https://bugs.freedesktop.org/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - pdftohtml ignore png format option and extract inverted jpg images"
href="https://bugs.freedesktop.org/show_bug.cgi?id=92449">92449</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>pdftohtml ignore png format option and extract inverted jpg images
</td>
</tr>
<tr>
<th>Product</th>
<td>poppler
</td>
</tr>
<tr>
<th>Version</th>
<td>unspecified
</td>
</tr>
<tr>
<th>Hardware</th>
<td>x86-64 (AMD64)
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux (All)
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>major
</td>
</tr>
<tr>
<th>Priority</th>
<td>medium
</td>
</tr>
<tr>
<th>Component</th>
<td>pdftohtml
</td>
</tr>
<tr>
<th>Assignee</th>
<td>poppler-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>kislicynda@gmail.com
</td>
</tr></table>
<p>
<div>
<pre>Hi all,
I use pdftohtml 0.37.0 on Ubuntu.
When I run
pdftohtml -xml -fmt png
some images are extracted as .jpg (all with inverted colors) and some
as .png (all with normal colors).
When I run
pdfimages -all test.pdf test
I get the same result (inverted .jpg and normal .png images).
But when I run
pdfimages -png test.pdf test
I get only .png images, and all of them have normal colors.
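The three runs above can be scripted roughly as follows. This is only a
minimal sketch: it assumes poppler-utils is installed and the input is
test.pdf, and the out_* directory names and the count_ext helper are made up
for this example, not part of poppler.

```shell
#!/bin/sh
# Reproduction sketch. Assumptions: poppler-utils is installed, the input is
# test.pdf (or the first argument), and the out_* directory names and the
# count_ext helper are made up for this example.
pdf="${1:-test.pdf}"

# Hypothetical helper: count files directly inside directory $1 ending in .$2
count_ext() {
    find "$1" -maxdepth 1 -type f -name "*.$2" | wc -l | tr -d ' '
}

if [ -f "$pdf" ]; then
    if [ -n "$(command -v pdftohtml)" ]; then
        mkdir -p out_html out_all out_png
        pdftohtml -xml -fmt png "$pdf" out_html/test   # expectation: PNG only
        pdfimages -all "$pdf" out_all/test             # images in native formats
        pdfimages -png "$pdf" out_png/test             # everything written as PNG
        for d in out_html out_all out_png; do
            echo "$d: $(count_ext "$d" jpg) jpg, $(count_ext "$d" png) png"
        done
    fi
fi
```

If `-fmt png` worked as I expect, the out_html line would report 0 jpg.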
Questions:
1. Is it possible to convert a PDF to HTML/XML with pdftohtml and have all
images exported as .png, or at least to get non-inverted .jpg images?
Right now I have to run two different commands on the same PDF page to get a
correct result. It seems that the `-fmt` option doesn't work.
2. If `pdfimages -all test.pdf test` extracts the first image as .jpg and the
second as .png, does that mean the first image is actually stored in JPG
format in the PDF? And likewise for the second image?
3. Is it OK that an image exported via `pdftohtml -xml` has one size
(width and height) as a file but a different size in the generated XML? For
example, the file is 145x145, but the XML gives width=105, height=105.
PS: I can attach the PDF file if needed.
Thanks in advance,</pre>
</div>
</p>
</body>
</html>