<html>
    <head>
      <base href="https://bugs.freedesktop.org/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - pdftohtml ignore png format option and extract inverted jpg images"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=92449">92449</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>pdftohtml ignore png format option and extract inverted jpg images
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>poppler
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>x86-64 (AMD64)
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux (All)
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>major
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>medium
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>pdftohtml
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>poppler-bugs@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>kislicynda@gmail.com
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Hi all,

I use pdftohtml 0.37.0 on Ubuntu.

When I run
pdftohtml -xml -fmt png
some images are extracted as .jpg (all with inverted colors) and some
as .png (all with normal colors).

When I run
pdfimages -all test.pdf test
I get the same result (inverted .jpg and normal .png).

But when I run
pdfimages -png test.pdf test
I get only .png images, and all of them have normal colors.
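
For anyone reproducing this, the three invocations above can be collected in a
small script. This is only a sketch: the helper name build_commands and the
output prefixes are hypothetical, and pdftohtml is given test.pdf as its input
file, which the command line quoted above does not show.

```python
def build_commands(pdf, prefix):
    """Return the three command lines from this report as argument lists."""
    return [
        # Per the report: writes a mix of .jpg and .png despite -fmt png.
        ["pdftohtml", "-xml", "-fmt", "png", pdf, prefix + "-html"],
        # Per the report: writes the same mix of .jpg and .png files.
        ["pdfimages", "-all", pdf, prefix + "-all"],
        # Per the report: writes only .png files, all with normal colors.
        ["pdfimages", "-png", pdf, prefix + "-png"],
    ]


if __name__ == "__main__":
    # Print the commands; pass each list to subprocess.run to execute it.
    for cmd in build_commands("test.pdf", "test"):
        print(" ".join(cmd))
```

Using a separate output prefix per command keeps the three sets of extracted
images from overwriting each other, so they can be compared side by side.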

Questions:
1. Is it possible to convert a PDF to HTML/XML with pdftohtml and export all
images as .png, or at least get non-inverted .jpg images? Right now I have to
run two different commands on the same PDF page to get a correct result. It
seems that the `-fmt` option doesn't work.
2. If `pdfimages -all test.pdf test` extracts the first image as .jpg and the
second as .png, does that mean the first image is actually stored in JPEG
format in the PDF, and likewise the second one in PNG format?
3. Is it OK for an image exported via `pdftohtml -xml` to have one size
(width and height) in the file but a different one in the generated XML? For
example, the file has width=145, height=145, but the XML says width=105,
height=105.
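
The mismatch in question 3 can be checked mechanically. Below is a minimal
sketch, assuming the generated XML contains image elements with src, width and
height attributes and that the extracted files are PNG. The function names are
hypothetical, and JPEG files are not handled.

```python
import struct
import xml.etree.ElementTree as ET

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data):
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian 32-bit integers at offsets 16 and 20.
    return struct.unpack("!II", data[16:24])


def xml_image_sizes(xml_text):
    """Map each image src to the (width, height) declared in the XML."""
    root = ET.fromstring(xml_text)
    sizes = {}
    for image in root.iter("image"):
        width = int(float(image.get("width")))
        height = int(float(image.get("height")))
        sizes[image.get("src")] = (width, height)
    return sizes


def mismatches(xml_text, read_bytes):
    """Yield (src, declared, actual) for images whose sizes differ."""
    for src, declared in xml_image_sizes(xml_text).items():
        actual = png_size(read_bytes(src))
        if actual != declared:
            yield src, declared, actual
```

A possible call, assuming the XML was written to test.xml and the src paths
resolve from the current directory:
    xml_bytes = open("test.xml", "rb").read()
    for row in mismatches(xml_bytes, lambda src: open(src, "rb").read()):
        print(row)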

PS: I can attach the PDF file if needed.

Thanks in advance,</pre>
        </div>
      </p>
    </body>
</html>