[poppler] pdftocairo - Updated Patches

Stefan Thomas thomas at txtbear.com
Thu Aug 19 02:00:48 PDT 2010


  Hey again,

> Following modification can drop the page number suffix in
> output filename when "split" flag is disabled. Stefan, could
> you review?

Makes sense and works great! Thanks!

On Saturday, 31 July 2010, Adrian Johnson wrote:

> I've just done some testing with the patches and found a problem with
> the generated output file name.
>
> My original patch would, if no output file name is specified, use the
> input filename to generate the output name like what pdftops does. eg
>
> "pdftocairo -ps foo.pdf" will create foo.ps
> "pdftocairo -png foo.pdf" will create foo-001.png, foo-002.png, ...
>
> the updated patches are doing:
>
> "pdftocairo -ps foo.pdf" will create cairoout-001.ps
> "pdftocairo -png foo.pdf" will create cairoout-001.png,
> cairoout-002.png, ...
>
> which is not very user friendly.

The reason for this change was that the previous way of determining 
outRoot gave unwanted results when you gave it a remote PDF like:

"pdftocairo -ps http://m.je/test.pdf" will try to create http://m.je/test.ps

It didn't handle URLs without a real filename well either:

"pdftocairo -ps http://example.com/get?format=pdf&asset=3948" will try 
to create http://example.ps

Another problem was that it would create output files in the directory
where the PDF file resides rather than in the current working directory,
which is unusual for *NIX command line tools.

"pdftocairo -ps /media/cdrom0/pdfs/mytest.pdf" -> write error

The current version (with mpsuzuki's fix) will create "cairoout.ps" in 
the current working directory in all of those cases. Plus you can always 
provide a second parameter if you'd like a different name. I like the
predictability of it.
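
To make the contrast concrete, here is a minimal sketch of the two
behaviours in Python. It assumes the earlier rule simply replaced
everything after the last dot with the new extension, which matches the
three examples above but is not taken from the patch itself. The names
old_out_name, current_out_name and out_root are hypothetical, and the
page-number suffix for split output is not modelled.

```python
def old_out_name(input_name, ext):
    # Assumed earlier rule: replace everything after the last dot.
    dot = input_name.rfind(".")
    root = input_name if dot == -1 else input_name[:dot]
    return f"{root}.{ext}"


def current_out_name(ext, out_root=None):
    # Current rule: fixed default unless the user names an output root.
    return f"{out_root or 'cairoout'}.{ext}"


if __name__ == "__main__":
    for name in (
        "http://m.je/test.pdf",
        "http://example.com/get?format=pdf&asset=3948",
        "/media/cdrom0/pdfs/mytest.pdf",
    ):
        print(old_out_name(name, "ps"), "vs", current_out_name("ps"))
```

Under that assumption the old rule keeps the URL scheme or the source
directory in the output name, while the fixed default never depends on
the input at all.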

As I see it we have three options:

1. Current way: Always use "cairoout", unless otherwise specified by the 
user.

2. Adrian's way: Create outRoot from input filename. Perhaps with some 
improvements like cutting off everything before the last slash (if it 
exists), then everything after the first question mark (if it exists), 
then everything after the last dot (if it exists). If the result is 
empty or otherwise not a valid filename, use "cairoout".

3. pdftoppm's way: Output to STDOUT unless a second parameter with an 
output filename is provided.
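
For discussion, the improved rule in option 2 could look roughly like
the following Python sketch. It is only an illustration of the steps
listed there, not code from any patch: derive_out_root is a hypothetical
name, and the "not a valid filename" check is reduced to rejecting an
empty result and the special names "." and "..".

```python
def derive_out_root(input_name, fallback="cairoout"):
    # Step 1: cut off everything before the last slash, if there is one.
    name = input_name.rsplit("/", 1)[-1]
    # Step 2: cut off everything after the first question mark, if any.
    name = name.split("?", 1)[0]
    # Step 3: cut off everything after the last dot, if any.
    if "." in name:
        name = name.rsplit(".", 1)[0]
    # Step 4: fall back when nothing usable is left (simplified check).
    if not name or name in (".", ".."):
        return fallback
    return name


if __name__ == "__main__":
    for arg in (
        "foo.pdf",
        "http://m.je/test.pdf",
        "http://example.com/get?format=pdf&asset=3948",
        "/media/cdrom0/pdfs/mytest.pdf",
        "http://example.com/",
    ):
        print(arg, "->", derive_out_root(arg))
```

With the steps in this order, a URL such as
"http://x.com/get?path=a/b.pdf" yields "b", because the last slash sits
inside the query string. That is the kind of odd input the rule would
have to keep chasing.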

I'd be happy with any of these. What I like about number 1 is that I
know there aren't any bugs in it. With number 2 I feel like there is
always going to be a URL or filename that breaks it or acts weird. What
I like about number 3 is that it's consistent with pdftoppm.

Thoughts?

I'll make a final patch once we've settled the filename issue.

Cheers,

Stefan

