[poppler] Situation towards 3DPDF support

Hiroka IHARA ihara_h at live.jp
Sun Apr 16 00:58:31 UTC 2017


> Is it something "hard" or is it "read bit 3 and 4 from this stream"?

Partly, yes.
Among the several parts of the Intel code I had to consult, the most problematic one relates to normal vector unpacking.

The reference document (ECMA-363, 4th edition) states in section 9.6.1.3.4.11.1:

> To generate the array of predicted normals, start by
> putting the face normal for each face that uses this position into an array. While the size of
> this array is larger than New Normal Count, merge the two normals that are closest. Merging
> normals is done using a weighted spherical-linear average where each normal is weighted
> by the number of original face normals that it includes.
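Read literally, and assuming the array starts in some fixed order, that merge procedure would look roughly like the following Python sketch. This is purely my interpretation of the spec's wording (nothing here is from the Intel code), and the weighted spherical-linear average is my guess at what "weighted" means:

```python
import math

def slerp(n1, n2, t):
    """Spherical-linear interpolation between two unit vectors."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(n1, n2))))
    theta = math.acos(dot)
    if theta < 1e-9:            # (nearly) identical normals
        return n1
    s = math.sin(theta)
    return tuple((math.sin((1.0 - t) * theta) * a + math.sin(t * theta) * b) / s
                 for a, b in zip(n1, n2))

def merge_closest(face_normals, new_normal_count):
    """ECMA-363 9.6.1.3.4.11.1 as I read it: while the array is larger than
    New Normal Count, merge the two closest normals, each weighted by the
    number of original face normals it already includes."""
    pool = [(n, 1) for n in face_normals]        # (unit normal, weight)
    while len(pool) > new_normal_count:
        # closest pair = largest dot product between unit normals
        _, i, j = max(
            (sum(a * b for a, b in zip(pool[i][0], pool[j][0])), i, j)
            for i in range(len(pool)) for j in range(i + 1, len(pool)))
        (n1, w1), (n2, w2) = pool[i], pool[j]
        merged = slerp(n1, n2, w2 / (w1 + w2))   # pulled toward the heavier side
        del pool[j]                              # j > i, so delete j first
        pool[i] = (merged, w1 + w2)
    return [n for n, _ in pool]
```

Note that the result depends on the initial order of `face_normals` whenever there is a tie, which is exactly where the spec says nothing.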

The quoted passage is incomplete, because it says nothing about the order in which "the face normal for each face" is listed.
The implementation by Intel does not follow this instruction at all, and instead does the following:

"Start by putting the face normal for each face that uses this position into a temporary array, *in the same order as the adjacency listing*. (adjacency listing is a predefined data structure that is maintained during the whole decode process)
Pop the first element in the temporary array into the final array (=array of predicted normals).
While the size of the final array is smaller than required, pick a normal within the temporary array that is the farthest from other normals and pop it into the final array."
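The picking behaviour described in those three steps could be sketched as follows. This is a paraphrase of the description above, not code adapted from the Sample Software, and the `farness` criterion is my guess at what "farthest from other normals" means:

```python
def farthest_pick(face_normals, new_normal_count):
    """Intel-style prediction as described above: seed the final array with
    the first normal (adjacency-listing order), then repeatedly move the
    temporary normal farthest from the already-chosen ones into it."""
    temp = list(face_normals)       # assumed already in adjacency order
    final = [temp.pop(0)]
    while len(final) < new_normal_count and temp:
        # farness of a candidate: its smallest dot product against any
        # chosen normal (smaller dot = farther apart on the unit sphere)
        def farness(n):
            return min(sum(a * b for a, b in zip(n, f)) for f in final)
        k = min(range(len(temp)), key=lambda k: farness(temp[k]))
        final.append(temp.pop(k))
    return final
```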

The two procedures produce different resulting arrays, even if we assume that the adjacency listing also defines the ordering in the first one.
(Merging while the array is too large vs. picking the farthest one and popping.)
I had to reproduce what the Intel code does step by step in order to avoid SIGSEGV in my implementation.
I am worried about licensing because you cannot write any piece of working code without doing so.

> Where do we do that?

I am sorry, my question was beside the point. It was kind of an XY question...

After some searching, I noticed that in poppler, only PDF Renditions are saved to temporary files and loaded to media players.

What I wanted to ask was whether there is a recommendation (or regulation) as to when frontends should load the data from the PDF Stream, which I believe is unbuffered.
I think there are three options:

1. Read from Stream and decode when the page cache is created in the background.
  (I think this method is adopted for Image)
2. Read from Stream into a buffer when the page cache is created in the background; start decoding later, on invocation by the user.
3. Read from Stream and decode on the fly, on invocation by the user.
  (I think this method is adopted for Sound)

In the case of 3D data, both extracting from Stream (up to 10MiB) and decoding (up to a few seconds on my laptop) could be slow.
I was initially planning to create a wrapper around Stream (something that generalizes a byte string) which frontends could just pass to the 3D decoder initializer (this is equivalent to option 3).
But it felt like the existing code, in both frontends and backends, tends to avoid accessing Stream after pages are prepared for display.
So I wondered if there are reasons not to choose option 3.
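To make option 3 concrete, the wrapper I had in mind is roughly the following sketch. The names and the interface are hypothetical, not existing poppler API; `read_chunk` stands in for whatever actually pulls bytes out of the PDF Stream:

```python
import io

class StreamSource:
    """Hypothetical generalized byte string over a PDF Stream (option 3):
    nothing is extracted until the 3D decoder actually asks for bytes."""

    def __init__(self, read_chunk):
        self._read_chunk = read_chunk   # callable: size -> bytes (b"" at EOF)
        self._buf = b""

    def read(self, size):
        """Return up to `size` bytes, pulling lazily from the stream."""
        while len(self._buf) < size:
            chunk = self._read_chunk(size - len(self._buf))
            if not chunk:
                break
            self._buf += chunk
        out, self._buf = self._buf[:size], self._buf[size:]
        return out

# usage with an in-memory stand-in for the PDF Stream
backing = io.BytesIO(b"U3D\x00payload")
src = StreamSource(backing.read)
```

The point is just that the decoder pulls bytes on demand, so nothing is extracted to a temporary file up front.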

Again, I am sorry for bothering you.

Regards,
Hiroka

From: poppler <poppler-bounces at lists.freedesktop.org> on behalf of Albert Astals Cid <aacid at kde.org>
Sent: Wednesday, April 12, 2017 21:16
To: poppler at lists.freedesktop.org
Subject: Re: [poppler] Situation towards 3DPDF support
    
On Tuesday, 11 April 2017 at 0:36:58 CEST, Hiroka IHARA wrote:
> Hello,
> 
> I am a newbie who was working on 3DPDF support back in September.
> In case you still remember, I am sorry that I could not spend time on
> development recently.
> 
> After I submitted the last patch #97868 (I asked for review but you can
> ignore this one because it is obsoleted now), I noticed that the story was
> a bit more complicated than I had thought.
> 
> There are two issues that I need to address related to poppler (and a few
> more on PDF viewer frontends).
> 
> One is the licensing matter.
> I found out that the official documentation of Universal 3D Format is
> slightly outdated, and the Universal 3D Sample Software by Intel seems to
> be somehow acting as the de facto reference.
> 
> Therefore I had to adapt some parts (currently a few dozen lines) of the
> Sample Software, which comes under the Apache License, into mine, under the
> GPL. I took care that the actual code does not resemble the original, but
> the basic algorithm is the same.
> 
> Is it allowed to include code adapted from something APL into something GPL?

Seems not, really. How copyrightable is that code you adapted? Is it something 
"hard" or is it "read bit 3 and 4 from this stream"?

> The other is just a question I wanted to ask.
> 3DPDF data is relatively large (up to 10MiB), so I tried to learn from
> existing implementation of Movie, Sound and such media streams.
> 
> What I was curious about is why they extract all the data from Stream into
> tempfiles at first,

Where do we do that?

Cheers,
  Albert

> and read them again in PDF viewer(s) (though I do not
> know about anything other than evince).
> 
> Is it just because some frontend media players are happy that way, or
> because there are performance or some other issues?
> 
> My library will be happy if there is some public GObject to read from Stream
> byte by byte without dumping.
> 
> Lastly, I understand all this stuff is going to take your time, I will stop
> bothering you if you think it is going to be too costly.
> 
> Regards,
> Hiroka


_______________________________________________
poppler mailing list
poppler at lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/poppler