[Poppler-bugs] [Bug 97262] Enumerate PDF named destinations

bugzilla-daemon at freedesktop.org bugzilla-daemon at freedesktop.org
Sat Sep 17 12:50:54 UTC 2016


https://bugs.freedesktop.org/show_bug.cgi?id=97262

--- Comment #50 from Masamichi Hosoda <trueroad at trueroad.jp> ---
(In reply to Carlos Garcia Campos from comment #48)
> (In reply to Adrian Johnson from comment #35)
> > Carlos, do you have an opinion on whether to return a GList of names or
> > binary tree of names/LinkDests?
> 
> I guess we could return a GTree, but I'm not sure I understand exactly what
> we want. Do we need the list to be in a specific order? Do they have an
> order in the PDF document that we should respect or alphabetical is what
> want?

I don't mind whether it is a GHashTable or a GTree.
I'd like either of the following:
  GTree with GBytes keys and PopplerDest values
  GHashTable with GBytes keys and PopplerDest values

> > > Almost UTF-16 strings contains \0.
> > 
> > Isn't the glib API UTF-8 based?
> 
> Yes, which means that strings passed to public api methods are expected to
> be UTF-8 and string returned by public api methods should also be UTF-8.

If I understand correctly,
the encoding of named destination names is not defined.

In other words, we cannot determine the encoding of a name.
It might be UTF-16, UTF-8, PDFDocEncoding or some other encoding.
It might even be pure binary data rather than text.
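
To illustrate why such names need an explicit length, here is a minimal
sketch in plain C. The DestName struct and both helper functions are
hypothetical and are not part of Poppler or GLib; DestName only mimics the
"bytes plus length" idea behind GBytes. The sample name is assumed to be
"A1" encoded as UTF-16BE with a byte order mark.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for a destination name: raw bytes plus an explicit
 * length, similar in spirit to a GBytes. Not part of the Poppler API. */
typedef struct {
    const unsigned char *data;
    size_t len;
} DestName;

/* "A1" as UTF-16BE with a byte order mark: FE FF 00 41 00 31. */
static const unsigned char utf16_name[] = {
    0xFE, 0xFF, 0x00, 0x41, 0x00, 0x31
};

/* Length as seen by code that treats the name as a NUL-terminated string.
 * It stops at the first 0x00 byte, so a UTF-16 name gets truncated. */
static size_t c_string_length(const unsigned char *bytes)
{
    return strlen((const char *)bytes);
}

/* Length-aware ordering: compare the common prefix byte by byte, then let
 * the shorter name sort first. Embedded 0x00 bytes are compared like any
 * other byte. */
static int dest_name_compare(const DestName *a, const DestName *b)
{
    size_t n = a->len < b->len ? a->len : b->len;
    int r = memcmp(a->data, b->data, n);

    if (r != 0)
        return r;
    return (a->len > b->len) - (a->len < b->len);
}
```

With these definitions, c_string_length(utf16_name) sees only the two
bytes of the byte order mark, while the real name is six bytes long. Two
different UTF-16 names that share the same first character would look
identical to strcmp() but are told apart by dest_name_compare().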

Document Management – Portable Document Format – Part 1: PDF 1.7, First Edition
Adobe (Jul 2008)
http://wwwimages.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/PDF32000_2008.pdf

`12.3.2.3 Named Destinations'
says `..., a destination may be referred to indirectly by means of a name
object (PDF 1.1) or a byte string (PDF 1.2).'

So the named destination names are `byte strings' in PDF 1.2+.

`7.9.2.4 Byte String Type'
says `The byte string type shall be used for binary data that shall be
represented as a series of bytes, where each byte may be any value
representable in 8 bits.'

The point is that named destination names are not `text strings'.
If they were `text strings', we could convert them to UTF-8, since the
encoding would be clearly defined.
However, named destination names are `byte strings',
so we cannot convert them to UTF-8: the encoding is not known.
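
As a rough illustration, the sketch below checks whether a byte string
happens to be well-formed UTF-8. The function is_valid_utf8 is a
hypothetical helper written for this example, not a Poppler or GLib
function. Note that it only tests well-formedness: a name that passes
might still have been written in another encoding, so this cannot recover
the encoding the PDF producer intended.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the len bytes at s form well-formed UTF-8.
 * Rejects stray or missing continuation bytes, overlong forms,
 * surrogates and code points above U+10FFFF. */
static bool is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t extra;           /* number of continuation bytes expected */
        unsigned long cp, min;  /* decoded code point and its lower bound */

        if (c < 0x80) {         /* single-byte (ASCII) sequence */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            extra = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            extra = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;       /* invalid lead byte, e.g. 0xFE or 0xFF */
        }

        if (len - i <= extra)
            return false;       /* sequence is cut off at the end */

        for (size_t k = 1; k <= extra; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;       /* overlong, out of range or surrogate */

        i += extra + 1;
    }
    return true;
}
```

For example, a name starting with the UTF-16 byte order mark FE FF fails
this check, and so does the single byte E9, which is e-acute in
PDFDocEncoding. Both are perfectly legal byte strings in a PDF file.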
