<html>
<head>
<base href="https://bugs.freedesktop.org/">
</head>
<body>
<p>
<div>
<b><a class="bz_bug_link
bz_status_NEW "
title="NEW - Enumerate PDF named destinations"
href="https://bugs.freedesktop.org/show_bug.cgi?id=97262#c50">Comment # 50</a>
on <a class="bz_bug_link
bz_status_NEW "
title="NEW - Enumerate PDF named destinations"
href="https://bugs.freedesktop.org/show_bug.cgi?id=97262">bug 97262</a>
from <span class="vcard"><a class="email" href="mailto:trueroad@trueroad.jp" title="Masamichi Hosoda <trueroad@trueroad.jp>"> <span class="fn">Masamichi Hosoda</span></a>
</span></b>
<pre>(In reply to Carlos Garcia Campos from <a href="show_bug.cgi?id=97262#c48">comment #48</a>)
<span class="quote">> (In reply to Adrian Johnson from <a href="show_bug.cgi?id=97262#c35">comment #35</a>)
> > Carlos, do you have an opinion on whether to return a GList of names or
> > binary tree of names/LinkDests?
>
> I guess we could return a GTree, but I'm not sure I understand exactly what
> we want. Do we need the list to be in a specific order? Do they have an
> order in the PDF document that we should respect or alphabetical is what
> want?</span >
I don't mind whether it is a GHashTable or a GTree.
I'd like either of the following:
GTree with GBytes keys and PopplerDest values
GHashTable with GBytes keys and PopplerDest values
<span class="quote">> > > Almost UTF-16 strings contains \0.
> >
> > Isn't the glib API UTF-8 based?
>
> Yes, which means that strings passed to public api methods are expected to
> be UTF-8 and string returned by public api methods should also be UTF-8.</span >
If I understand correctly,
the encoding of named destination names is not defined.
In other words, we cannot determine the encoding of a name.
It might be UTF-16, UTF-8, PDFDocEncoding or some other encoding.
It might even be pure binary data rather than text.
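As a rough illustration (a minimal sketch in Python; the byte sequence and
the "page 3" value are made up, and Latin-1 is only a stand-in for a
single-byte encoding such as PDFDocEncoding), the same bytes give different
results depending on which encoding we guess:

```python
# Hypothetical destination name; these bytes are invented for illustration.
name = b"\xfe\xff\x00A\x00B"

# Guess 1: UTF-16 with a byte order mark. The result is the text "AB",
# and the raw bytes contain \0.
as_utf16 = name.decode("utf-16")

# Guess 2: a single-byte encoding (Latin-1 as a stand-in, an assumption).
# The result is six unrelated characters.
as_latin1 = name.decode("latin-1")

# Guess 3: UTF-8. These bytes are not valid UTF-8 at all.
try:
    name.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Using the raw bytes as the key avoids choosing an encoding.
dests = {name: "page 3"}
```

Keeping the raw bytes as the key, as GBytes keys would, avoids having to guess.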
Document Management – Portable Document Format – Part 1: PDF 1.7, First Edition
Adobe (Jul 2008)
<a href="http://wwwimages.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/PDF32000_2008.pdf">http://wwwimages.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/PDF32000_2008.pdf</a>
`12.3.2.3 Named Destinations'
says `..., a destination may be referred to indirectly by means of a name
object (PDF 1.1) or a byte string (PDF 1.2).'
So named destination names are `byte strings' in PDF 1.2 and later.
`7.9.2.4 Byte String Type'
says `The byte string type shall be used for binary data that shall be
represented as a series of bytes, where each byte may be any value
representable in 8 bits.'
The point is that named destination names are not `text strings'.
If they were `text strings', we could convert them to UTF-8, since the
encoding of a text string is clearly defined.
However, named destination names are `byte strings'.
So we cannot convert them to UTF-8, since the encoding is not known.</pre>
</div>
</p>
</body>
</html>