[poppler] pdf to image

Carl Karsten cfkarsten at gmail.com
Mon Dec 29 07:35:49 PST 2008


This mostly works, but:
the 3 lines with the lambda that get the image out of the pixbuf bother me a little,
pdf.replace('\x00','a') bothers me a lot.
I am hoping the font errors are a result of the .replace('\x00','a').
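
A small sketch of why replacing the NUL bytes could plausibly be behind the font errors. TrueType data starts with the version tag 00 01 00 00, so swapping every NUL byte for 'a' keeps the length the same but changes the contents. The same would apply to any other binary stream in the file. The bytes below are made up for illustration and are not taken from foo.pdf.

```python
# Illustrative only: a made-up stand-in for a binary stream inside a PDF.
# TrueType font data begins with the version tag 00 01 00 00.
font_header = b'\x00\x01\x00\x00'
fake_stream = font_header + b'glyph data'

# The same substitution the script applies to the whole file.
patched = fake_stream.replace(b'\x00', b'a')

# The length is unchanged, so len(pdf) still matches the data...
same_length = len(patched) == len(fake_stream)
# ...but the first four bytes no longer read as a TrueType version tag.
header_intact = patched[:4] == font_header

print(same_length)
print(header_intact)
```

If that is what is happening, the fix would be to get the unmodified bytes into poppler some other way and drop the replace.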

#!/usr/bin/python
# p2i.py
# test of converting a pdf to an image.
import poppler
from gtk.gdk import Pixbuf,COLORSPACE_RGB

def pdf2png(pdf,size):

    pixbuf = Pixbuf(COLORSPACE_RGB,False,8,size[0],size[1])

    p = poppler.document_new_from_data(pdf,len(pdf),password='')
    p0=p.get_page(0)
    p0.render_to_pixbuf(0,0,8,11,1,0,pixbuf)

    # There has to be a better way to get the image?
    lst=[]
    pixbuf.save_to_callback(lambda b,l: l.append(b), 'png', user_data=lst)
    png=''.join(lst)

    return png


if __name__=='__main__':

    pdf=open('foo.pdf','rb').read()

    # I do this to avoid
    # TypeError: document_new_from_data() argument 1 must be string
    # without null bytes, not str
    # which is probably an indication I am not doing something right.
    pdf=(pdf.replace('\x00','a'))

    png=pdf2png(pdf,(286,225))

    open('test.png','wb').write(png)

"""
juser@cp666:~$ python p2i.py
Xlib:  extension "RANDR" missing on display "localhost:12.0".
Error: failed to load truetype font

some font thing failed
Error: failed to load truetype font

some font thing failed

juser@cp666:~$ file test.png
test.png: PNG image, 300 x 300, 8-bit/color RGB, non-interlaced
"""
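
One more thing that may be worth checking is the render_to_pixbuf call. In poppler-glib the arguments are src_x, src_y, src_width, src_height, scale, rotation, pixbuf. If the Python binding keeps that order, the 8,11 in the script is a source rectangle in pixels, not a page size in inches, and the scale of 1 does not fit the page to the pixbuf. A minimal sketch of working out a scale instead, assuming the page size is available in points (for example from p0.get_size(), which I have not verified here). The helper name fit_scale is made up for this example.

```python
def fit_scale(page_size, target_size):
    """Return the scale that fits a page (in points) inside a pixel box."""
    page_w, page_h = page_size
    target_w, target_h = target_size
    # Use the smaller ratio so both dimensions fit inside the target.
    return min(target_w / float(page_w), target_h / float(page_h))

# US Letter is 612 x 792 points; 286 x 225 is the size passed to pdf2png.
page = (612.0, 792.0)
scale = fit_scale(page, (286, 225))

# Size of the rendered page in pixels at that scale.
width = int(round(page[0] * scale))
height = int(round(page[1] * scale))
print(width)
print(height)
```

The width and height computed this way could then be used as src_width and src_height, with scale passed in place of the 1.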

Carl K

