[poppler] Access to Poppler internal C++ API by GDAL

Even Rouault even.rouault at spatialys.com
Sun Sep 10 13:57:24 UTC 2017


On Sunday 10 September 2017 10:50:42 CEST jose.aliste at gmail.com wrote:
> Hi,
> the internal c++ API is not intended for use by external applications, only
> by frontends. So, if you want a stable API, you should be using either of
> the frontends. There is a cpp frontend that should be far more stable.

Jose,

I'm aware of the cpp frontend, but unless I'm missing something, it is mostly about rendering or
text search. GDAL's needs go far beyond rendering, as I explained.

Even

> 
> Regards
> 
> 
> On Sun, Sep 10, 2017 at 10:41 AM, Even Rouault <even.rouault at spatialys.com>
> 
> wrote:
> > Hi,
> > 
> > 
> > 
> > I'm one of the developers of the GDAL library (http://gdal.org) that
> > reads various raster & vector formats, mostly geospatial, including PDF
> > and
> > its georeferencing extensions (either expressed with Adobe Supplement to
> > ISO 32000 or with Open Geospatial Consortium Best Practice:
> > 
> > https://portal.opengeospatial.org/files/?artifact_id=40537 )
> > 
> > 
> > 
> > Currently we use the Poppler internal C++ API and regularly must adjust
> > for changes in it. Recently we had to make adjustments to accommodate
> > Poppler 0.58 changes. Supporting multiple Poppler versions is beginning to
> > make our code ugly. So packagers from Linux distributions and I are
> > wondering whether there is a way to access a more stable C++ API.
> > 
> > 
> > 
> > Besides rendering as image, we need really low-level access to PDF
> > objects, to be able to parse georeferencing objects, retrieve layers, turn
> > on/off OCG, or even access streams to decode drawing instructions so as to
> > build vector objects
> > 
> > 
> > 
> > I've tried to summarize below our current use of Poppler C++ API. I
> > probably missed a few calls, but you should get the overall picture:
> > 
> > - Object class: getType(), getTypeName(), getBool(), getInt(), getReal(),
> > getString(), getName(), getStream(), getArray()
> > 
> > - Dict class: lookupNF(), lookup(), getLength(), getKey()
> > 
> > - Array class: getLength(), getNF(), get()
> > 
> > - Stream class: getDict(), reset(), getChar(), fillGooString()
> > 
> > - Catalog class: getPage(), getPageRef(), readMetadata()
> > 
> > - GooString: getCString(), getLength()
> > 
> > - Ref class: access to num and gen
> > 
> > - PDFDoc class: isOk(), displayPageSlice(), getCatalog(),
> > getOptContentConfig(), getNumPages(), getDocInfo(), getErrorCode(), str
> > private member (accessed through an ugly "#define private public" before
> > including poppler! We need to access it to be able to delete it with our
> > heap, since we allocated the stream object provided to the PDFDoc()
> > constructor. This is to avoid potential cross-heap problems on Windows)
> > 
> > - Page class: isOk(), pageObj private member (accessed through an ugly
> > "#define private public" before including poppler!), getMediaBox()
> > 
> > - OCGs class: isOk(), getOCGs()
> > 
> > - GooList class: getLength(), get()
> > 
> > - OptionalContentGroup class: setState()
> > 
> > - SplashBitmap class: getBitmap(), getWidth(), getHeight(), getDataPtr(),
> > getAlphaPtr(), getAlphaRowSize(), getRowSize()
> > 
> > - SplashOutputDev class: we subclass this class and override all/most
> > virtual methods to be able to turn on/off rendering of various elements as
> > we offer options to render selectively vector, raster and/or text elements
> > (so basically just a conditional test to decide whether to return as a
> > no-op or call the base implementation)
> > 
> > - BaseStream class: we subclass this class to use GDAL's own I/O abstraction
> > layer (which, beyond regular files, can read files inside .zip archives,
> > in-memory files, files available through HTTP, etc.). So we implement copy(),
> > makeSubStream(), getPos(), getStart(), setPos(), moveStart(), getKind(),
> > getFileName(), getChar(), lookChar(), reset(), unfilteredReset(), close(),
> > hasGetChars(), getChars()
> > 
> > - GlobalParams class: setPrintCommands()
> > 
> > - setErrorCallback() function
> > 
> > 
> > 
> > If you want to glance at the code, the most relevant files are:
> > 
> > https://github.com/OSGeo/gdal/blob/trunk/gdal/frmts/pdf/pdfobject.cpp
> > 
> > https://github.com/OSGeo/gdal/blob/trunk/gdal/frmts/pdf/pdfio.cpp
> > 
> > https://github.com/OSGeo/gdal/blob/trunk/gdal/frmts/pdf/pdfdataset.cpp
> > 
> > 
> > 
> > I'm not sure whether it would be feasible for Poppler to provide a more
> > stable API for our use. At least, this makes you aware of external users
> > of

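The SplashOutputDev item in the quoted list describes a simple pattern: override each virtual drawing method and either return as a no-op or call the base implementation. A minimal C++ sketch of that pattern is given here. It does not use Poppler at all: OutputDevSketch, SelectiveDev, the three drawing methods, their string arguments and renderWithoutText() are hypothetical stand-ins chosen for illustration, not the actual Poppler or GDAL classes and signatures.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a rendering device such as SplashOutputDev.
// Method names and string arguments are illustrative only.
class OutputDevSketch {
public:
    virtual ~OutputDevSketch() = default;
    virtual void drawText(const std::string& s)  { log.push_back("text:" + s); }
    virtual void fillPath(const std::string& p)  { log.push_back("vector:" + p); }
    virtual void drawImage(const std::string& i) { log.push_back("raster:" + i); }

    std::vector<std::string> log;  // records what the base class "rendered"
};

// Subclass that forwards to the base implementation only for enabled
// element kinds; a disabled kind makes the override a no-op.
class SelectiveDev : public OutputDevSketch {
public:
    SelectiveDev(bool text, bool vector, bool raster)
        : renderText_(text), renderVector_(vector), renderRaster_(raster) {}

    void drawText(const std::string& s) override {
        if (renderText_) OutputDevSketch::drawText(s);
    }
    void fillPath(const std::string& p) override {
        if (renderVector_) OutputDevSketch::fillPath(p);
    }
    void drawImage(const std::string& i) override {
        if (renderRaster_) OutputDevSketch::drawImage(i);
    }

private:
    bool renderText_;
    bool renderVector_;
    bool renderRaster_;
};

// Draws one text, one vector and one raster element with text disabled,
// and returns what reached the base class.
std::vector<std::string> renderWithoutText() {
    SelectiveDev dev(/*text=*/false, /*vector=*/true, /*raster=*/true);
    dev.drawText("label");
    dev.fillPath("contour");
    dev.drawImage("ortho");
    return dev.log;
}
```

Calling renderWithoutText() leaves only the vector and raster entries in the log, because the text override returns without calling the base class. The cost of this approach is the one the thread is about: every overridden signature has to track the base class, so any change to the virtual methods between releases breaks the subclass.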
