[poppler] Access to Poppler internal C++ API by GDAL

jose.aliste at gmail.com jose.aliste at gmail.com
Sun Sep 10 14:05:24 UTC 2017


On Sun, Sep 10, 2017 at 10:57 AM, Even Rouault <even.rouault at spatialys.com>
wrote:

> On dimanche 10 septembre 2017 10:50:42 CEST jose.aliste at gmail.com wrote:
>
> > Hi,
>
> > the internal C++ API is not intended for use by external applications,
>
> > only by frontends. So, if you want a stable API, you should use one of
>
> > the frontends. There is a cpp frontend that should be far more stable.
>
>
>
> Jose,
>
>
>
> I'm aware of the cpp frontend, but unless I'm missing something, it is mostly
> about rendering or text search. GDAL's needs go far beyond rendering, as I
> explained.
>
>
>
Sure, but the internal C++ API is not going to be stable. If the cpp frontend
is missing parts of what you listed, then send patches that add them, with an
API that you expect to be stable. I mean: add API that is general enough to be
used by others to the cpp frontend, and then you can move some of your code
out of your app and into the cpp frontend. Or use the qt4 or qt5 frontend,
even if you are not going to render anything.
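
To illustrate the idea of moving code behind a frontend, here is a minimal
sketch, not actual Poppler code. Everything in it is hypothetical:
internal_api::Dictionary only stands in for an internal class whose interface
may change, and stable_frontend::dict_view is an invented wrapper name. The
point is that callers see nothing but standard library types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-in for an unstable internal class. Hypothetical: this is NOT a
// Poppler type, it only plays the role of an internal API that may change
// between releases.
namespace internal_api {
class Dictionary {
public:
    void set(const std::string &key, const std::string &value) {
        entries_.push_back(std::make_pair(key, value));
    }
    int getLength() const { return static_cast<int>(entries_.size()); }
    const std::string &getKey(int i) const {
        assert(i >= 0 && i < getLength());
        return entries_[static_cast<std::size_t>(i)].first;
    }
    const std::string &getValue(int i) const {
        assert(i >= 0 && i < getLength());
        return entries_[static_cast<std::size_t>(i)].second;
    }

private:
    std::vector<std::pair<std::string, std::string> > entries_;
};
} // namespace internal_api

// Hypothetical frontend-style wrapper. Callers only see standard library
// types, so the internal class can change without breaking them.
namespace stable_frontend {
class dict_view {
public:
    explicit dict_view(const internal_api::Dictionary &d) : dict_(&d) {}

    // Returns all keys in insertion order.
    std::vector<std::string> keys() const {
        std::vector<std::string> out;
        for (int i = 0; i < dict_->getLength(); ++i)
            out.push_back(dict_->getKey(i));
        return out;
    }

    // Copies the value for `key` into `value` and returns true if found.
    bool lookup(const std::string &key, std::string &value) const {
        for (int i = 0; i < dict_->getLength(); ++i) {
            if (dict_->getKey(i) == key) {
                value = dict_->getValue(i);
                return true;
            }
        }
        return false;
    }

private:
    const internal_api::Dictionary *dict_; // not owned; must outlive the view
};
} // namespace stable_frontend
```

If the internal class is later renamed or reshaped, only the body of the
wrapper has to follow; code written against dict_view keeps compiling.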


Kind regards



> Even
>
>
>
> >
>
> > Regards
>
> >
>
> >
>
> > On Sun, Sep 10, 2017 at 10:41 AM, Even Rouault <even.rouault at spatialys.com>
>
> > wrote:
>
> > > Hi,
>
> > >
>
> > >
>
> > >
>
> > > > I'm one of the developers of the GDAL library (http://gdal.org), which
>
> > > > reads various raster & vector formats, mostly geospatial, including PDF
>
> > > > and its georeferencing extensions (expressed either with the Adobe
>
> > > > Supplement to ISO 32000 or with the Open Geospatial Consortium Best
>
> > > > Practice:
>
> > >
>
> > > https://portal.opengeospatial.org/files/?artifact_id=40537 )
>
> > >
>
> > >
>
> > >
>
> > > > Currently we use the Poppler internal C++ API and regularly must adjust
>
> > > > for changes in it. Recently we had to make adjustments to accommodate
>
> > > > the Poppler 0.58 changes. Supporting multiple Poppler versions is
>
> > > > beginning to make our code ugly. So packagers from Linux distributions
>
> > > > and I are wondering if there would be a way to access a more stable
>
> > > > C++ API.
>
> > >
>
> > >
>
> > >
>
> > > > Besides rendering to an image, we need really low-level access to PDF
>
> > > > objects, to be able to parse georeferencing objects, retrieve layers,
>
> > > > turn OCGs (optional content groups) on and off, or even access streams
>
> > > > to decode drawing instructions so as to build vector objects.
>
> > >
>
> > >
>
> > >
>
> > > I've tried to summarize below our current use of Poppler C++ API. I
>
> > > probably missed a few calls, but you should get the overall picture:
>
> > >
>
> > > > - Object class: getType(), getTypeName(), getBool(), getInt(),
>
> > > > getReal(), getString(), getName(), getStream(), getArray()
>
> > >
>
> > > - Dict class: lookupNF(), lookup(), getLength(), getKey()
>
> > >
>
> > > - Array class: getLength(), getNF(), get()
>
> > >
>
> > > - Stream class: getDict(), reset(), getChar(), fillGooString()
>
> > >
>
> > > - Catalog class: getPage(), getPageRef(), readMetadata()
>
> > >
>
> > > - GooString: getCString(), getLength()
>
> > >
>
> > > - Ref class: access to num and gen
>
> > >
>
> > > > - PDFDoc class: isOk(), displayPageSlice(), getCatalog(),
>
> > > > getOptContentConfig(), getNumPages(), getDocInfo(), getErrorCode(), and
>
> > > > the private member str (accessed through an ugly "#define private
>
> > > > public" before including poppler! We need access to it so that we can
>
> > > > delete it with our own heap, since we allocated the stream object
>
> > > > passed to the PDFDoc() constructor. This avoids potential cross-heap
>
> > > > problems on Windows.)
>
> > >
>
> > > > - Page class: isOk(), pageObj private member (accessed through an ugly
>
> > > > "#define private public" before including poppler!), getMediaBox()
>
> > >
>
> > > - OCGs class: isOk(), getOCGs()
>
> > >
>
> > > - GooList class: getLength(), get()
>
> > >
>
> > > - OptionalContentGroup class: setState()
>
> > >
>
> > > > - SplashBitmap class: getBitmap(), getWidth(), getHeight(),
>
> > > > getDataPtr(), getAlphaPtr(), getAlphaRowSize(), getRowSize()
>
> > >
>
> > > > - SplashOutputDev class: we subclass this class and override all/most
>
> > > > virtual methods to be able to turn on/off rendering of various
>
> > > > elements, as we offer options to render selectively vector, raster
>
> > > > and/or text elements (so basically just a conditional test to decide
>
> > > > whether to return as a no-op or call the base implementation)
>
> > >
>
> > > > - BaseStream class: we subclass this class to use GDAL's own I/O
>
> > > > abstraction layer (which beyond regular files can read files inside
>
> > > > .zip archives, in-memory files, files available through HTTP, etc.).
>
> > > > So we implement copy(), makeSubStream(), getPos(), getStart(),
>
> > > > setPos(), moveStart(), getKind(), getFileName(), getChar(),
>
> > > > lookChar(), reset(), unfilteredReset(), close(), hasGetChars(),
>
> > > > getChars()
>
> > >
>
> > > - GlobalParams class: setPrintCommands()
>
> > >
>
> > > - setErrorCallback() function
>
> > >
>
> > >
>
> > >
>
> > > If you want to glance at the code, the most relevant files are:
>
> > >
>
> > > https://github.com/OSGeo/gdal/blob/trunk/gdal/frmts/pdf/pdfobject.cpp
>
> > >
>
> > > https://github.com/OSGeo/gdal/blob/trunk/gdal/frmts/pdf/pdfio.cpp
>
> > >
>
> > > https://github.com/OSGeo/gdal/blob/trunk/gdal/frmts/pdf/pdfdataset.cpp
>
> > >
>
> > >
>
> > >
>
> > > > I'm not sure whether it would be feasible for Poppler to provide a more
>
> > > > stable API for our use. At the very least, this makes you aware of
>
> > > > external users of this API.
>
> > >
>
> > >
>
> > >
>
> > > Best regards,
>
> > >
>
> > >
>
> > >
>
> > > Even
>
> > >
>
> > >
>
> > >
>
> > > --
>
> > >
>
> > > Spatialys - Geospatial professional services
>
> > >
>
> > > http://www.spatialys.com
>
> > >
>
> > > _______________________________________________
>
> > > poppler mailing list
>
> > > poppler at lists.freedesktop.org
>
> > > https://lists.freedesktop.org/mailman/listinfo/poppler
>
>
>
>
>
> --
>
> Spatialys - Geospatial professional services
>
> http://www.spatialys.com
>
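
The selective-rendering approach described in the SplashOutputDev item of the
quoted message (override each virtual method, then either return as a no-op or
call the base implementation) can be sketched roughly as follows. This is an
illustration under assumptions, not GDAL's or Poppler's code: BaseDevice is a
hypothetical stand-in for the real output device, and its method names and
signatures are simplified inventions.

```cpp
#include <string>
#include <vector>

// Stand-in for a rendering device with virtual drawing hooks. Hypothetical:
// this is NOT SplashOutputDev; names and signatures are illustrative only.
class BaseDevice {
public:
    virtual ~BaseDevice() {}
    virtual void drawText(const std::string &s) { log.push_back("text:" + s); }
    virtual void drawImage(const std::string &s) { log.push_back("raster:" + s); }
    virtual void strokePath(const std::string &s) { log.push_back("vector:" + s); }

    std::vector<std::string> log; // records what was actually rendered
};

// Subclass that gates each hook on a flag: either a no-op or a call to the
// base implementation.
class SelectiveDevice : public BaseDevice {
public:
    SelectiveDevice() : renderText(true), renderRaster(true), renderVector(true) {}

    bool renderText;
    bool renderRaster;
    bool renderVector;

    void drawText(const std::string &s) override {
        if (renderText) BaseDevice::drawText(s);
    }
    void drawImage(const std::string &s) override {
        if (renderRaster) BaseDevice::drawImage(s);
    }
    void strokePath(const std::string &s) override {
        if (renderVector) BaseDevice::strokePath(s);
    }
};
```

With renderText set to false, a call to drawText() leaves the log untouched
while drawImage() and strokePath() still reach the base class. The cost of
this pattern is that every change to the base class's virtual method
signatures has to be mirrored in the subclass.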