[poppler] Trying to extend pdftoppm to use CairoOutputDev

Albert Astals Cid aacid at kde.org
Wed Dec 16 15:18:54 PST 2009


On Tuesday 15 December 2009 02:28:54, mpsuzuki at hiroshima-u.ac.jp wrote:
> Dear Poppler developers,

Hi

> 
> First of all, I thank the poppler developers for writing excellent
> software. The addition of CairoOutputDev is very interesting.
> 
> Now I'm trying to extend pdftoppm to draw on CairoOutputDev.
> My motivation is splitting a large table in PDF document
> into small PDFs for each cell.
> 
> Recent poppler can draw on a cairo surface, so I think this is
> possible by having pdftoppm draw a single cell (specified by
> the cell's geometry) on a cairo surface, with something like:
> 
>   pdftoppm \
>            -f [page_num] -l [page_num] \
>            -r [dpi_to_specify_the_unit_of_geometry] \
>            -x [cell_pos_x] -y [cell_pos_y]          \
>            -w [cell_width] -h [cell_height]         \
>            -pdf [input_table.pdf] [output_cell_prefix]
> 
> The attached patch is an experiment along these lines; please comment
> on what should be improved for official adoption.
> 
> By default, the "-r" option for "pdftoppm -pdf" is used only
> as a unit to calculate the geometry to be cropped, and
> it does not change the resolution of the output PDF. This is
> inconsistent with the "-r" option in the SplashOutputDev case.
> If MODIFY_RESOLUTION_IN_PDF2CAIRO is defined at compile time,
> the behaviour of "-r" is consistent with the
> SplashOutputDev case.
> 
> The problems that I've already recognized are:
> 
> * If a PDF containing a large image (e.g. a PDF generated by
>   an image scanner) is given, the cropped PDF includes the
>   whole image object, not a cropped image object, so
>   the file size of the cropped PDF is not reduced.
> 
> * When multiple pages are rendered (e.g. pdftoppm -pdf
>   -f 1 -l 100 ...), startDoc() is invoked for each
>   output file. As a result, the rendering speed is
>   slower than that of SplashOutputDev.
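
For readers following the "-r" discussion above: if the crop geometry is given
in pixels at the "-r" resolution, mapping it to PDF units is a single scaling
step, since default PDF user space is 72 units per inch. The sketch below only
illustrates that arithmetic. The names BoxPt and pixelsToPoints are
hypothetical and are not taken from the patch, and the patch may handle
details such as the y-axis direction differently.

```cpp
#include <cassert>

// Hypothetical type (not from the patch): a crop box in PDF points.
struct BoxPt {
  double x, y, w, h;
};

// Convert a crop box given in pixels at `dpi` into PDF points.
// Default PDF user space has 72 units per inch, so the factor is 72 / dpi.
// Assumption: x, y, w, h mirror the -x, -y, -w, -h options quoted above.
BoxPt pixelsToPoints(int x, int y, int w, int h, double dpi) {
  assert(dpi > 0.0);  // a resolution of zero or less is meaningless
  return BoxPt{x * 72.0 / dpi, y * 72.0 / dpi, w * 72.0 / dpi, h * 72.0 / dpi};
}
```

With "-r 150", for example, a cell 600 pixels wide comes out 288 points wide,
that is, 4 inches.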

I'm not sure the use case is useful enough to the general public to include
this in the poppler utils. In any case, there is a lot of cairo-specific code
that needs to be properly ifdefed, because cairo is an optional dependency.
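
As a rough illustration of the kind of guard meant here, the sketch below
keeps everything cairo-related behind one preprocessor check and rejects the
option cleanly in builds without cairo. HAVE_CAIRO is an assumed macro name
(use whatever the build system actually defines), and pdfOutputSupported and
checkOutputOptions are hypothetical helpers, not existing poppler functions.

```cpp
#include <string>

// Assumption: HAVE_CAIRO stands for the macro the build system defines
// when cairo is found; the real name may differ.
#ifdef HAVE_CAIRO
// Cairo headers and any CairoOutputDev code would be included only here.
static const bool kHaveCairo = true;
#else
static const bool kHaveCairo = false;
#endif

// Hypothetical helper: reports whether this build can write PDF via cairo.
bool pdfOutputSupported() { return kHaveCairo; }

// Hypothetical option check: fails with a message when "-pdf" is requested
// in a build without cairo, instead of failing to compile or link.
bool checkOutputOptions(bool wantPdf, std::string &error) {
  error.clear();
  if (wantPdf && !pdfOutputSupported()) {
    error = "-pdf requires a poppler build with cairo support";
    return false;
  }
  return true;
}
```

Done this way, the tool still builds and runs without cairo, and only the new
option becomes unavailable.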

Also, I think this belongs in a separate binary rather than in pdftoppm. We
already "bastardized" it by adding png and jpeg output; maybe we should
rename it to pdfconvert or something like that.

Anyway, I'm more concerned about the utility than about the code itself: do
you (not only mpsuzuki, I'm interested in everyone's opinion) think anyone
will ever need to split a PDF file into chunks and will think of using
pdftoppm for it?

Albert

> 
> Regards,
> mpsuzuki
> 

