[poppler] pdftops creates huge file with simple color background (attached examples)

Pierre-Luc Samuel Pierre-Luc.Samuel at ticketmaster.com
Fri Jan 29 11:06:45 PST 2016


Hm, yeah, RunLength encoding doesn't seem to be the best algorithm for 
this kind of image.  From what I see, it's not really a good compression 
algorithm at all!
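
To illustrate the point William makes in the quoted message below (that 
"255 194 14" does not compress as well as "255 255 255"), here is a 
minimal Python sketch of a byte-oriented run-length coder.  It follows 
the record layout of the PostScript RunLengthEncode filter, but it is 
not poppler's code; the function names and the 1000-pixel test rows are 
made up for the illustration.

```python
def run_length_encode(data: bytes) -> bytes:
    """Encode with the PostScript RunLengthEncode record layout.

    Length byte 0..127   -> copy the next (length + 1) bytes literally.
    Length byte 129..255 -> repeat the next byte (257 - length) times.
    Length byte 128      -> end of data.
    """
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        # Measure the run of identical bytes starting at i (at most 128).
        run = 1
        while i + run < n and run < 128 and data[i + run] == data[i]:
            run += 1
        if run >= 2:
            out.append(257 - run)
            out.append(data[i])
            i += run
        else:
            # Collect literals until a repeated pair starts or 128 bytes are held.
            start = i
            i += 1
            while i < n and i - start < 128 and not (
                i + 1 < n and data[i] == data[i + 1]
            ):
                i += 1
            out.append(i - start - 1)
            out += data[start:i]
    out.append(128)  # end-of-data marker
    return bytes(out)


def run_length_decode(enc: bytes) -> bytes:
    """Inverse of run_length_encode, used here only to check the round trip."""
    out = bytearray()
    i = 0
    while i < len(enc):
        length = enc[i]
        if length == 128:
            break
        if length < 128:
            out += enc[i + 1 : i + 2 + length]
            i += length + 2
        else:
            out += bytes([enc[i + 1]]) * (257 - length)
            i += 2
    return bytes(out)


# Hypothetical test data: 1000 RGB pixels of each background color.
white = bytes([255, 255, 255]) * 1000
yellow = bytes([255, 194, 14]) * 1000

white_enc = run_length_encode(white)
yellow_enc = run_length_encode(yellow)
print(len(white), "->", len(white_enc))    # 3000 -> 49
print(len(yellow), "->", len(yellow_enc))  # 3000 -> 3025
```

In the yellow row no two neighbouring bytes are equal, so the coder can 
only emit literal records and the output ends up slightly larger than 
the input.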

An interesting fact I found is that if I pass my 27 MB file to ps2ps 
(Ghostscript's ps2write device), I end up with a 1.7 MB file that uses 
"/ASCII85Decode filter /LZWDecode filter".  I don't know much about 
these encodings, but it would be really nice if that kind of 
compression happened directly in poppler's pdftops.
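
As a rough way to see why a dictionary-based coder does so much better 
on this data, here is a small Python sketch.  It is only an 
approximation of what Ghostscript does: Python's standard library has 
no LZW coder, so zlib (Flate, the option William suggests below) stands 
in for it, and the helper name flate_a85 is made up.  The page 
dimensions are the ones from William's message.

```python
import base64
import zlib

WIDTH, HEIGHT = 2549, 3299           # bitmap size quoted in the thread
ROW_BYTES = WIDTH * 3                # 8-bit RGB, 3 bytes per pixel
RAW_PAGE_BYTES = ROW_BYTES * HEIGHT  # uncompressed size of the whole bitmap


def flate_a85(data: bytes) -> bytes:
    """Flate-compress, then ASCII85-encode with the Adobe <~ ~> framing."""
    return base64.a85encode(zlib.compress(data), adobe=True)


# One scanline of the yellow background: the 3-byte pixel repeats, which
# defeats run-length coding but is easy for a dictionary coder to match.
row = bytes([255, 194, 14]) * WIDTH
encoded = flate_a85(row)

print("raw page bytes:", RAW_PAGE_BYTES)
print("one row, raw:", len(row))
print("one row, Flate + ASCII85:", len(encoded))
```

ASCII85 on its own expands data by a factor of 5/4, so the size 
reduction has to come from the compression step in front of it.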

I'd be willing to help if someone can help me figure it out.  I see 
poppler already has an LZWStream class; would it simply be a matter of 
plugging it in somewhere in PSOutputDev.cc, in place of or in addition 
to RunLengthDecode?

Pierre-Luc

On 01/27/2016 01:55 PM, William Bader wrote:
> tux-yellow and tux-white both convert to a 2549x3299 RGB bitmap that 
> is RunLength compressed and ASCII85 encoded.
>
> The yellow file is larger than the white file because "255 194 14" 
> does not compress as well as "255 255 255".
>
> The original tux image was Flate encoded with /DecodeParms of 
> <</Predictor 15/Columns 512>>
>
> I am not a poppler maintainer, but I think that it should be possible 
> to add an option to do Flate compression.
>
> If you want to look at the code, open poppler/PSOutputDev.cc and 
> search for occurrences of /RunLengthDecode
>
> The "nothing" files are small because they paint the background by 
> drawing a box instead of by copying a bitmapped image.
>
> I think that when a PDF has several images on top of each other, 
> pdftops needs to convert the entire area to a bitmap even if some of 
> the parts were originally drawn with vector commands. The original 
> images have a bitmapped tux over a vector background, but pdftops 
> can't separate them and has to rasterize the entire page.
>
> Regards,
>
> William
>
>
> To: poppler at lists.freedesktop.org
> From: Pierre-Luc.Samuel at ticketmaster.com
> Date: Tue, 26 Jan 2016 14:19:17 -0500
> Subject: [poppler] pdftops creates huge file with simple color 
> background (attached examples)
>
> Hi poppler team,
>   
> I have an issue with pdftops version 0.39.0 with conversion of some
> specific templates to postscript.  I have created very simple use cases
> so that you can understand the issue.
>   
> pdftops tux-white.pdf
> pdftops tux-yellow.pdf
> ls -al *.ps
> -rw-r--r-- 1   2816703 Jan 26 11:53 tux-white.ps
> -rw-r--r-- 1  27576263 Jan 26 11:53 tux-yellow.ps
>   
> The size of the second PS is 27MB, but only the background color has
> changed.  This seems related to the fact that there is an image on the
> template, because if I remove the image, there is no significant size
> difference:
>   
> pdftops nothing-white.pdf
> pdftops nothing-yellow.pdf
> ls -al *.ps
> -rw-r--r-- 1     11129 Jan 26 10:34 nothing-white.ps
> -rw-r--r-- 1     11167 Jan 26 10:34 nothing-yellow.ps
>   
> Is this a known issue?
>   
> Thanks!
> Pierre-Luc
>
> _______________________________________________ poppler mailing list 
> poppler at lists.freedesktop.org 
> http://lists.freedesktop.org/mailman/listinfo/poppler
