<html>
<head>
<base href="https://bugs.freedesktop.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - When extracting as XML all new lines are stripped"
href="https://bugs.freedesktop.org/show_bug.cgi?id=104230">104230</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>When extracting as XML all new lines are stripped
</td>
</tr>
<tr>
<th>Product</th>
<td>poppler
</td>
</tr>
<tr>
<th>Version</th>
<td>unspecified
</td>
</tr>
<tr>
<th>Hardware</th>
<td>Other
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>medium
</td>
</tr>
<tr>
<th>Component</th>
<td>pdftohtml
</td>
</tr>
<tr>
<th>Assignee</th>
<td>poppler-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>clark@electrobeat.dk
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=136123" name="attach_136123" title="test pdf">attachment 136123</a></span>
test pdf
pdftohtml -s -i -xml test.pdf out.xml
VS
pdftohtml -s -i test.pdf out.html
When you extract the text as HTML, all new lines are kept, but if you extract
the text as XML, they are stripped out and each line is put in its own tag.</pre>
</div>
</p>
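<p>A possible workaround, sketched under assumptions rather than taken from poppler itself: if the XML output wraps every line in its own <code>text</code> element nested inside <code>page</code> elements, the line breaks can be rebuilt after the fact by joining those elements with newlines. The function names <code>page_lines</code> and <code>rebuild_text</code> are hypothetical helpers, and the element names are an assumption about the output layout that should be checked against the actual file.</p>
<pre>
```python
import xml.etree.ElementTree as ET


def page_lines(xml_source):
    """Return one list of text lines per page element, in document order.

    Assumption: each line of the PDF is stored in its own "text" element
    inside a "page" element. Verify this against the real XML output.
    """
    root = ET.fromstring(xml_source)
    pages = []
    for page in root.iter("page"):
        # itertext() flattens inline children (for example bold or italic
        # markup) so that only the visible characters of the line remain.
        lines = ["".join(node.itertext()) for node in page.iter("text")]
        pages.append(lines)
    return pages


def rebuild_text(xml_source):
    """Join the lines of each page with newlines, pages with a blank line."""
    return "\n\n".join("\n".join(lines) for lines in page_lines(xml_source))
```
</pre>
<p>Passing the contents of the XML file produced by the first command to <code>rebuild_text</code> should give plain text with one line per element. This only restores breaks between elements; it does not recover blank lines or indentation that the XML output never recorded.</p>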
</body>
</html>