[poppler] pdftocairo - Updated Patches

mpsuzuki at hiroshima-u.ac.jp
Thu Aug 19 09:17:33 PDT 2010


On Thu, 19 Aug 2010 11:00:48 +0200
Stefan Thomas <thomas at txtbear.com> wrote:
>> Following modification can drop the page number suffix in
>> output filename when "split" flag is disabled. Stefan, could
>> you review?
>
>Makes sense and works great! Thanks!

Thank you for the review.

>On Saturday, 31 July 2010, Adrian Johnson wrote:
>The reason for this change was that the previous way of determining 
>outRoot gave unwanted results when you gave it a remote PDF like:
>
>"pdftocairo -ps http://m.je/test.pdf" will try to create http://m.je/test.ps

I agree that 99% of such commands are run by mistake and simply
fail, because the HTTP servers are not configured as WebDAV
servers. However, if the full pathname is supposed to be reflected
in the output name, it is reasonable for curl library users to
try to write the output to the remote server.

>Didn't like URLs without a real filename either:
>
>"pdftocairo -ps http://example.com/get?format=pdf&asset=3948" will try 
>to create http://example.ps

Oops!

>Another problem was that it would create output files in the directory 
>where the PDF file resides rather than in the current working directory 
>which is unusual for *NIX command line tools.
>
>"pdftocairo -ps /media/cdrom0/pdfs/mytest.pdf" -> write error

Indeed.

>The current version (with mpsuzuki's fix) will create "cairoout.ps" in 
>the current working directory in all of those cases.

Oh, I'm sorry. Although I was aware that the poppler-utils try
to write their output to the directory where the source PDF
resides, not to the current directory, the behaviour of my patch
for pdftocairo was not carefully thought out. I neglected to
keep pdftocairo's behaviour in sync with that of the other
poppler-utils.

>Plus you can always 
>provide a second parameter if you'd like a different name. I'm liking 
>the predictability of it.
>
>As I see it we have three options:
>
>1. Current way: Always use "cairoout", unless otherwise specified by the 
>user.
>
>2. Adrian's way: Create outRoot from input filename. Perhaps with some 
>improvements like cutting off everything before the last slash (if it 
>exists), then everything after the first question mark (if it exists), 
>then everything after the last dot (if it exists). If the result is 
>empty or otherwise not a valid filename, use "cairoout".

I vote for this option.

>3. pdftoppm's way: Output to STDOUT unless a second parameter with an 
>output filename is provided.

This is not bad from the viewpoint of consistency with
pdftoppm, but I prefer the 2nd option. The existing poppler-utils
are not entirely consistent with each other anyway. For example,
the default output of pdftotext is determined by replacing the
pathname suffix ".pdf" with ".txt". To get the output on STDOUT,
you have to give "-" as the output pathname. This is different
from pdftoppm.
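
To make the 2nd option concrete, here is a rough sketch of the
steps Stefan listed. It is not taken from any patch: the function
name deriveOutRoot is hypothetical, and "not a valid filename" is
reduced to "empty" here, which is an assumption on my side.

```cpp
#include <string>

// Hypothetical helper, not part of poppler: derive the output root
// name from an input path or URL, following the steps of option 2.
std::string deriveOutRoot(const std::string &input)
{
    std::string name = input;

    // 1. Cut off everything up to and including the last slash.
    std::string::size_type pos = name.rfind('/');
    if (pos != std::string::npos)
        name.erase(0, pos + 1);

    // 2. Cut off the first question mark and everything after it.
    pos = name.find('?');
    if (pos != std::string::npos)
        name.erase(pos);

    // 3. Cut off the last dot and everything after it.
    pos = name.rfind('.');
    if (pos != std::string::npos)
        name.erase(pos);

    // 4. Fall back to the fixed default when nothing is left.
    //    (Assumption: only the empty case is treated as invalid.)
    if (name.empty())
        name = "cairoout";

    return name;
}
```

With the examples from this thread, "http://m.je/test.pdf" would
give "test", "http://example.com/get?format=pdf&asset=3948" would
give "get", and "/media/cdrom0/pdfs/mytest.pdf" would give
"mytest". Because the directory part is dropped, the output would
be created in the current working directory.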

Of course I want to hear other people's thoughts.

Regards,
mpsuzuki

