[poppler] Cached files, reading stdin, http streams and PDFDocBuilder

Hib Eris hib at hiberis.nl
Wed Feb 24 14:06:55 PST 2010


Hi all,

I have reworked the patches for HTTP streaming support that Stefan
Thomas sent in last October [1].

The goal of these patches is to support remote files in poppler. For
example, you can do

$ pdfinfo 'http://www.example.com/document.pdf'

To efficiently read a remote file, the parts of it that are retrieved
are cached. The caching part is implemented in

[Patch 1]: Add support for cached files
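
As a rough illustration of the bookkeeping involved (not code from the
patch), the sketch below maps a byte range onto chunk indices using the
8192-byte chunk size that Patch 1 defines. The names kChunkSize,
ChunkSpan and chunksFor are made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Same value as CachedFileChunkSize in Patch 1.
static const size_t kChunkSize = 8192;

// Hypothetical helper type: first and last chunk covering a byte range.
struct ChunkSpan {
  size_t first;
  size_t last;
};

// Returns the chunks that hold bytes [start, end), assuming end > start.
// This mirrors the arithmetic in CachedFile::preload().
ChunkSpan chunksFor(size_t start, size_t end) {
  ChunkSpan span;
  span.first = start / kChunkSize;
  span.last = (end - 1) / kChunkSize;
  return span;
}
```

A read that starts at byte 8191 and ends before byte 8193 touches
chunks 0 and 1; only the chunks in that span that are not yet loaded
have to be fetched.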

To use the caching, you need to implement a CacheLoader. For remote
files I have implemented a loader that uses libcurl.
The caching can also be used for reading from stdin:

[Patch 2]: Add support for reading a cached file from stdin
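
To give an idea of what implementing a loader involves, here is a
self-contained sketch. SimpleCache and SimpleLoader are simplified
stand-ins for CachedFile and CacheLoader (they use std::string instead
of GooString and a flat buffer instead of chunks), and MemoryLoader is
a hypothetical loader that does not exist in the patches.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Stand-in for the cache: one flat buffer; only write() is modelled.
class SimpleCache {
public:
  size_t write(const char *ptr, size_t len, size_t fromByte) {
    if (data.size() < fromByte + len) data.resize(fromByte + len);
    if (len > 0) std::memcpy(&data[fromByte], ptr, len);
    return len;
  }
  std::vector<char> data;
};

// Stand-in for the loader interface, shaped like CacheLoader in Patch 1.
class SimpleLoader {
public:
  virtual ~SimpleLoader() {}
  // Called once; returns the total size of the document.
  virtual size_t init(const std::string &uri, SimpleCache *cache) = 0;
  // Called when bytes fromByte..toByte (inclusive) are not cached yet.
  virtual void load(size_t fromByte, size_t toByte) = 0;
};

// Hypothetical loader that serves the document from a string in memory.
class MemoryLoader : public SimpleLoader {
public:
  explicit MemoryLoader(const std::string &content)
    : content_(content), cache_(0) {}
  size_t init(const std::string &, SimpleCache *cache) {
    cache_ = cache;
    return content_.size();
  }
  void load(size_t fromByte, size_t toByte) {
    if (fromByte >= content_.size()) return;
    if (toByte >= content_.size()) toByte = content_.size() - 1;
    if (toByte < fromByte) return;
    cache_->write(content_.data() + fromByte, toByte - fromByte + 1, fromByte);
  }
private:
  std::string content_;
  SimpleCache *cache_;
};
```

A loader can either push everything into the cache during init(), which
is what the stdin loader does, or fetch ranges on demand in load().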

As an example of how the stdin loader can be used, I added it to
pdfinfo, whose support for reading from stdin was broken.

[Patch 3]: Use cached files to read from stdin in pdfinfo

To see this working, you can do

$ cat document.pdf | pdfinfo -

Patch 4 gives us a CurlCacheLoader.

[Patch 4]: Add HTTP support using libcurl
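
The loader asks the server for an inclusive byte range covering a run of
missing chunks. The sketch below shows how such a "from-to" range string
could be derived, following the arithmetic in CachedFile::loadChunks().
The function name rangeForChunks is made up for this example, and it
assumes fileSize is greater than zero.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Same value as CachedFileChunkSize in Patch 1.
static const size_t kChunkSize = 8192;

// Builds the inclusive "from-to" byte range covering chunks
// firstChunk..lastChunk, clamped to a file of fileSize bytes.
std::string rangeForChunks(size_t firstChunk, size_t lastChunk,
                           size_t fileSize) {
  size_t fromByte = firstChunk * kChunkSize;
  size_t toByte = (lastChunk + 1) * kChunkSize - 1;
  if (toByte > fileSize - 1) toByte = fileSize - 1;
  std::ostringstream out;
  out << fromByte << "-" << toByte;
  return out.str();
}
```

For a 100000-byte file, chunks 1 to 2 give "8192-24575", and the last
chunk (index 12) is clamped to "98304-99999".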

As an example, I have added HTTP support to pdfinfo:

[Patch 5]: Let pdfinfo read documents over HTTP

You can use it with:

$ pdfinfo 'http://www.example.com/document.pdf'


That accomplishes the goal, but putting the logic for creating a PDFDoc
into pdfinfo is clearly not the best we can do.
To allow the transfer protocols that poppler supports to be extended
dynamically, I decided to create a PDFDocBuilder.

[Patch 6]: Add PDFDocBuilder

Implementations of this for local files and for reading from stdin:

[Patch 7]: Add LocalPDFDocBuilder and StdinPDFDocBuilder

With PopplerPDFDocBuilder we provide a default builder for application
developers that supports the protocols we want it to support, for now
local files and stdin:

[Patch 8]: Add PopplerPDFDocBuilder

If poppler builds with libcurl, we can add the CurlPDFDocBuilder to
the PopplerPDFDocBuilder:

[Patch 9]: Add CurlPDFDocBuilder
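
The idea of selecting a builder by URI can be sketched as follows. This
is a simplified illustration, not the interface from the patches: the
class names, the supports()/build() methods and buildWith() are all
assumptions, and build() only returns a label where a real builder would
create a PDFDoc. The prefix checks follow what pdfinfo does in Patch 5.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified builder interface.
class DocBuilder {
public:
  virtual ~DocBuilder() {}
  virtual bool supports(const std::string &uri) const = 0;
  virtual std::string build(const std::string &uri) const = 0;
};

// "-" means: read the document from stdin.
class StdinBuilder : public DocBuilder {
public:
  bool supports(const std::string &uri) const { return uri == "-"; }
  std::string build(const std::string &) const { return "stdin"; }
};

// Same prefix test that pdfinfo uses in Patch 5.
class HttpBuilder : public DocBuilder {
public:
  bool supports(const std::string &uri) const {
    return uri.compare(0, 7, "http://") == 0 ||
           uri.compare(0, 8, "https://") == 0;
  }
  std::string build(const std::string &) const { return "http"; }
};

// Fallback: treat anything else as a local file name.
class LocalBuilder : public DocBuilder {
public:
  bool supports(const std::string &) const { return true; }
  std::string build(const std::string &) const { return "local"; }
};

// Asks each builder in turn; the first one that supports the URI wins.
std::string buildWith(const std::vector<const DocBuilder *> &builders,
                      const std::string &uri) {
  for (size_t i = 0; i < builders.size(); ++i) {
    if (builders[i]->supports(uri)) return builders[i]->build(uri);
  }
  return "unsupported";
}
```

Because the local-file builder accepts everything, it has to be
registered last in this sketch; a build without libcurl would simply
leave the HTTP builder out of the list.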

With the PDFDocBuilders in place, we can clean up all our utils,
giving all of them stdin and http:// capabilities.

[Patch 10]: Use PDFDocBuilder in utils

And one more, unrelated patch, because I stumbled upon the problem while testing pdftotext:

[Patch 11]: Initialize variable in TextOutputDev


Please let me know what you think of these patches or just commit them
to git if you like them.


Kind regards,

Hib Eris

[1] http://lists.freedesktop.org/archives/poppler/2009-October/005162.html
-------------- next part --------------
From bad1c929ecd01a19fac1e185ec57931d6d2d8d22 Mon Sep 17 00:00:00 2001
From: Hib Eris <hib at hiberis.nl>
Date: Tue, 23 Feb 2010 01:55:00 +0100
Subject: [PATCH 01/11] Add support for cached files

---
 CMakeLists.txt        |    2 +
 poppler/CachedFile.cc |  203 +++++++++++++++++++++++++++++++++++++++++++++++++
 poppler/CachedFile.h  |   81 ++++++++++++++++++++
 poppler/Makefile.am   |    2 +
 poppler/Stream.cc     |  104 +++++++++++++++++++++++++
 poppler/Stream.h      |   50 ++++++++++++
 6 files changed, 442 insertions(+), 0 deletions(-)
 create mode 100644 poppler/CachedFile.cc
 create mode 100644 poppler/CachedFile.h

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 329a6de..c2f0176 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -183,6 +183,7 @@ set(poppler_SRCS
   poppler/Array.cc
   poppler/BuiltinFont.cc
   poppler/BuiltinFontTables.cc
+  poppler/CachedFile.cc
   poppler/Catalog.cc
   poppler/CharCodeToUnicode.cc
   poppler/CMap.cc
@@ -309,6 +310,7 @@ if(ENABLE_XPDF_HEADERS)
     poppler/Array.h
     poppler/BuiltinFont.h
     poppler/BuiltinFontTables.h
+    poppler/CachedFile.h
     poppler/Catalog.h
     poppler/CharCodeToUnicode.h
     poppler/CMap.h
diff --git a/poppler/CachedFile.cc b/poppler/CachedFile.cc
new file mode 100644
index 0000000..7867814
--- /dev/null
+++ b/poppler/CachedFile.cc
@@ -0,0 +1,203 @@
+//========================================================================
+//
+// CachedFile.cc
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2009 Stefan Thomas <thomas at eload24.com>
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#include <config.h>
+
+#include "CachedFile.h"
+
+#include <stdio.h>
+#include <string.h>
+#include "Error.h"
+#include <curl/curl.h>
+
+//------------------------------------------------------------------------
+
+CachedFile::CachedFile(CacheLoader *cacheLoader, GooString *uri)
+{
+  _uri = uri;
+  _cacheLoader = cacheLoader;
+
+  streamPos = 0;
+  size = _cacheLoader->init(_uri, this);
+  ref = 0;
+}
+
+CachedFile::~CachedFile()
+{
+  if (_uri) {
+    delete _uri;
+  }
+  if ( _cacheLoader) {
+    delete _cacheLoader;
+  }
+}
+
+GooString *CachedFile::getFileName()
+{
+  int i, sl = 0, qm = 0;
+  for (i = 6; i < _uri->getLength(); i++) {
+    // note position after last slash
+    if (_uri->getChar(i) == '/') sl = i+1;
+
+    // note position of first question mark
+    if (_uri->getChar(i) == '?' && !qm) qm = i;
+  }
+  // find document filename
+  return new GooString(_uri, sl, (qm) ? (qm-sl) : (_uri->getLength()-sl));
+}
+
+long int CachedFile::tell() {
+  return streamPos;
+}
+
+int CachedFile::seek(long int offset, int origin)
+{
+  if (origin == SEEK_SET) {
+    streamPos = offset;
+  } else if (origin == SEEK_CUR) {
+    streamPos += offset;
+  } else {
+    streamPos = size + offset;
+  }
+
+  if (streamPos > size) {
+    streamPos = 0;
+    return 1;
+  }
+
+  return 0;
+}
+
+size_t CachedFile::read(void *ptr, size_t unitsize, size_t count)
+{
+  size_t bytes = unitsize*count;
+  size_t endPos = streamPos + bytes;
+
+//printf("Reading %li - %li\n", streamPos, streamPos + unitsize*count);
+
+  if (endPos > size) {
+    endPos = size;
+    bytes = size - streamPos;
+  }
+
+  if (bytes == 0) return 0;
+
+  preload(streamPos, endPos);
+
+  // Write data to buffer
+  size_t toCopy = bytes;
+
+  while (toCopy) {
+    int chunk = streamPos / CachedFileChunkSize;
+    int offset = streamPos % CachedFileChunkSize;
+    size_t len = CachedFileChunkSize-offset;
+
+    if (len > toCopy)
+      len = toCopy;
+
+//printf("Reading Chunk %i, offset %i, len %i\n", chunk, offset, len);
+
+    memcpy(ptr, chunks[chunk].data + offset, len);
+    streamPos += len;
+    toCopy -= len;
+    ptr = (char*)ptr + len;
+
+    /*
+    // Dump a chunk
+    if (chunk == 28 || chunk == 29) {
+      for (int i = 0; i < len; ++i) {
+        printf("%02X ", (unsigned char) chunks[chunk].data[offset + i]);
+      }
+      printf("\n");
+    }
+    */
+  }
+
+  return bytes;
+}
+
+void CachedFile::preload(size_t start, size_t end)
+{
+  if (end == 0 || end > size) end = size;
+  if (start > end) start = end - CachedFileChunkSize;
+
+  int startBlock = start / CachedFileChunkSize;
+  int endBlock = (end-1) / CachedFileChunkSize;
+
+//printf("Get block %i to %i.\n", startBlock, endBlock);
+
+  // Make sure data is in cache
+  loadChunks(startBlock, endBlock);
+}
+
+void CachedFile::loadChunks(int startBlock, int endBlock)
+{
+  int startSequence;
+  int i = startBlock;
+
+//printf("loadChunks from %d to %d\n", startBlock, endBlock);
+
+  while (i <= endBlock) {
+    if (chunks[i].state == bccStateNew) {
+      startSequence = i;
+      while (i < endBlock) {
+        i++;
+        if (chunks[i].state != bccStateNew) {
+          i--;
+          break;
+        }
+      }
+
+      size_t fromByte = startSequence * CachedFileChunkSize;
+      size_t toByte = ((i+1) * CachedFileChunkSize)-1;
+
+      if (toByte >= size-1) { toByte = size-1; }
+
+      _cacheLoader->load(fromByte, toByte);
+
+      for (int j = startSequence; j <= i; j++) {
+        chunks[j].state = bccStateLoaded;
+      }
+    }
+    i++;
+  }
+}
+
+size_t CachedFile::write(const char *ptr, size_t size, size_t fromByte)
+{
+//printf("%u bytes received\n", size);
+
+  size_t currentByte = fromByte;
+  size_t toCopy = size;
+  const char *cp = ptr;
+
+  while (toCopy) {
+    int chunk = currentByte / CachedFileChunkSize;
+    int offset = currentByte % CachedFileChunkSize;
+
+    size_t len = CachedFileChunkSize-offset;
+
+    if (len > toCopy)
+      len = toCopy;
+
+//printf("Writing Chunk %i, offset %i, len %i\n", chunk, offset, len);
+
+    memcpy(&(chunks[chunk].data[offset]), cp, len);
+    currentByte += len;
+    toCopy -= len;
+    cp = cp + len;
+  }
+
+  return size;
+}
+
+
+
diff --git a/poppler/CachedFile.h b/poppler/CachedFile.h
new file mode 100644
index 0000000..c4e000d
--- /dev/null
+++ b/poppler/CachedFile.h
@@ -0,0 +1,81 @@
+//========================================================================
+//
+// CachedFile.h
+//
+// Caching files support.
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2009 Stefan Thomas <thomas at eload24.com>
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#ifndef CACHEDFILE_H
+#define CACHEDFILE_H
+
+#include "poppler-config.h"
+#include "goo/GooString.h"
+
+#include <map>
+
+
+#define CachedFileChunkSize 8192
+
+enum CachedFileChunkState {
+  bccStateNew,
+  // If we want this to be thread-safe, concurrent, whatever, we need
+  // another state:
+  // bccStateLoading,
+  bccStateLoaded
+};
+
+typedef struct {
+  CachedFileChunkState state;
+  char data[CachedFileChunkSize];
+} CachedFileChunk;
+
+class CacheLoader;
+
+class CachedFile {
+
+public:
+
+  CachedFile(CacheLoader *cacheLoader, GooString *uri);
+  ~CachedFile();
+
+  long int tell();
+  int seek(long int offset, int origin);
+  size_t read(void * ptr, size_t unitsize, size_t count);
+  GooString *getFileName();
+  void preload(size_t start, size_t end);
+  size_t write(const char *ptr, size_t size, size_t fromByte);
+
+  // Reference counting.
+  int incRef() { return ++ref; }
+  int decRef() { return --ref; }
+
+private:
+
+  CacheLoader *_cacheLoader;
+  GooString *_uri;
+  size_t streamPos;
+  std::map<unsigned, CachedFileChunk> chunks;
+  size_t size;
+  int ref;  // reference count
+
+  void loadChunks(int startBlock, int endBlock);
+
+};
+
+class CacheLoader {
+
+public:
+
+  virtual ~CacheLoader() {};
+  virtual size_t init(GooString *uri, CachedFile *cachedFile) = 0 ;
+  virtual void load(size_t fromByte, size_t toByte) = 0;
+
+};
+
+#endif
diff --git a/poppler/Makefile.am b/poppler/Makefile.am
index 5f6a94a..b082778 100644
--- a/poppler/Makefile.am
+++ b/poppler/Makefile.am
@@ -167,6 +167,7 @@ poppler_include_HEADERS =	\
 	Array.h			\
 	BuiltinFont.h		\
 	BuiltinFontTables.h	\
+	CachedFile.h		\
 	Catalog.h		\
 	CharCodeToUnicode.h	\
 	CMap.h			\
@@ -238,6 +239,7 @@ libpoppler_la_SOURCES =		\
 	Array.cc 		\
 	BuiltinFont.cc		\
 	BuiltinFontTables.cc	\
+	CachedFile.cc		\
 	Catalog.cc 		\
 	CharCodeToUnicode.cc	\
 	CMap.cc			\
diff --git a/poppler/Stream.cc b/poppler/Stream.cc
index 6634317..82d5fa6 100644
--- a/poppler/Stream.cc
+++ b/poppler/Stream.cc
@@ -19,6 +19,8 @@
 // Copyright (C) 2008 Julien Rebetez <julien at fhtagn.net>
 // Copyright (C) 2009 Carlos Garcia Campos <carlosgc at gnome.org>
 // Copyright (C) 2009 Glenn Ganz <glenn.ganz at uptime.ch>
+// Copyright (C) 2009 Stefan Thomas <thomas at eload24.com>
+// Copyright (C) 2010 Hib Eris <hib at hiberis.nl>
 //
 // To see a description of the changes please see the Changelog file that
 // came with your tarball or type make ChangeLog if you are building from git
@@ -794,6 +796,108 @@ void FileStream::moveStart(int delta) {
 }
 
 //------------------------------------------------------------------------
+// CachedFileStream
+//------------------------------------------------------------------------
+
+CachedFileStream::CachedFileStream(CachedFile *ccA, Guint startA,
+        GBool limitedA, Guint lengthA, Object *dictA)
+  : BaseStream(dictA)
+{
+  cc = ccA;
+  cc->incRef();
+  start = startA;
+  limited = limitedA;
+  length = lengthA;
+  bufPtr = bufEnd = buf;
+  bufPos = start;
+  savePos = 0;
+  saved = gFalse;
+}
+
+CachedFileStream::~CachedFileStream()
+{
+  if (!cc->decRef())
+  {
+    close();
+    delete cc;
+  }
+}
+
+Stream *CachedFileStream::makeSubStream(Guint startA, GBool limitedA,
+        Guint lengthA, Object *dictA)
+{
+  return new CachedFileStream(cc, startA, limitedA, lengthA, dictA);
+}
+
+void CachedFileStream::reset()
+{
+  savePos = (Guint)cc->tell();
+  cc->seek(start, SEEK_SET);
+
+  saved = gTrue;
+  bufPtr = bufEnd = buf;
+  bufPos = start;
+}
+
+void CachedFileStream::close()
+{
+  if (saved) {
+    cc->seek(savePos, SEEK_SET);
+    saved = gFalse;
+  }
+}
+
+GBool CachedFileStream::fillBuf()
+{
+  int n;
+
+  bufPos += bufEnd - buf;
+  bufPtr = bufEnd = buf;
+  if (limited && bufPos >= start + length) {
+    return gFalse;
+  }
+  if (limited && bufPos + cachedStreamBufSize > start + length) {
+    n = start + length - bufPos;
+  } else {
+    n = cachedStreamBufSize;
+  }
+  cc->read(buf, 1, n);
+  bufEnd = buf + n;
+  if (bufPtr >= bufEnd) {
+    return gFalse;
+  }
+  return gTrue;
+}
+
+void CachedFileStream::setPos(Guint pos, int dir)
+{
+  Guint size;
+
+  if (dir >= 0) {
+    cc->seek(pos, SEEK_SET);
+    bufPos = pos;
+  } else {
+    cc->seek(0, SEEK_END);
+    size = (Guint)cc->tell();
+
+    if (pos > size)
+      pos = (Guint)size;
+
+    cc->seek(-(int)pos, SEEK_END);
+    bufPos = (Guint)cc->tell();
+  }
+
+  bufPtr = bufEnd = buf;
+}
+
+void CachedFileStream::moveStart(int delta)
+{
+  start += delta;
+  bufPtr = bufEnd = buf;
+  bufPos = start;
+}
+
+//------------------------------------------------------------------------
 // MemStream
 //------------------------------------------------------------------------
 
diff --git a/poppler/Stream.h b/poppler/Stream.h
index 9c0068e..d9d1907 100644
--- a/poppler/Stream.h
+++ b/poppler/Stream.h
@@ -17,6 +17,8 @@
 // Copyright (C) 2008 Julien Rebetez <julien at fhtagn.net>
 // Copyright (C) 2008 Albert Astals Cid <aacid at kde.org>
 // Copyright (C) 2009 Carlos Garcia Campos <carlosgc at gnome.org>
+// Copyright (C) 2009 Stefan Thomas <thomas at eload24.com>
+// Copyright (C) 2010 Hib Eris <hib at hiberis.nl>
 //
 // To see a description of the changes please see the Changelog file that
 // came with your tarball or type make ChangeLog if you are building from git
@@ -33,6 +35,7 @@
 #include <stdio.h>
 #include "goo/gtypes.h"
 #include "Object.h"
+#include "CachedFile.h"
 
 class BaseStream;
 
@@ -40,6 +43,7 @@ class BaseStream;
 
 enum StreamKind {
   strFile,
+  strCachedFile,
   strASCIIHex,
   strASCII85,
   strLZW,
@@ -399,6 +403,52 @@ private:
 };
 
 //------------------------------------------------------------------------
+// CachedFileStream
+//------------------------------------------------------------------------
+
+#define cachedStreamBufSize 1024
+
+class CachedFileStream: public BaseStream {
+public:
+
+  CachedFileStream(CachedFile *ccA, Guint startA, GBool limitedA,
+	     Guint lengthA, Object *dictA);
+  virtual ~CachedFileStream();
+  virtual Stream *makeSubStream(Guint startA, GBool limitedA,
+				Guint lengthA, Object *dictA);
+  virtual StreamKind getKind() { return strCachedFile; }
+  virtual void reset();
+  virtual void close();
+  virtual int getChar()
+    { return (bufPtr >= bufEnd && !fillBuf()) ? EOF : (*bufPtr++ & 0xff); }
+  virtual int lookChar()
+    { return (bufPtr >= bufEnd && !fillBuf()) ? EOF : (*bufPtr & 0xff); }
+  virtual int getPos() { return bufPos + (bufPtr - buf); }
+  virtual void setPos(Guint pos, int dir = 0);
+  virtual Guint getStart() { return start; }
+  virtual void moveStart(int delta);
+
+  virtual int getUnfilteredChar () { return getChar(); }
+  virtual void unfilteredReset () { reset(); }
+
+private:
+
+  GBool fillBuf();
+
+  CachedFile *cc;
+  Guint start;
+  GBool limited;
+  Guint length;
+  char buf[cachedStreamBufSize];
+  char *bufPtr;
+  char *bufEnd;
+  Guint bufPos;
+  int savePos;
+  GBool saved;
+};
+
+
+//------------------------------------------------------------------------
 // MemStream
 //------------------------------------------------------------------------
 
-- 
1.6.3.3


From 91f95c07943f1a06a43726b1e616f395ddafa613 Mon Sep 17 00:00:00 2001
From: Hib Eris <hib at hiberis.nl>
Date: Tue, 23 Feb 2010 02:02:10 +0100
Subject: [PATCH 02/11] Add support for reading a cached file from stdin

---
 CMakeLists.txt              |    2 ++
 poppler/Makefile.am         |    2 ++
 poppler/StdinCacheLoader.cc |   34 ++++++++++++++++++++++++++++++++++
 poppler/StdinCacheLoader.h  |   26 ++++++++++++++++++++++++++
 4 files changed, 64 insertions(+), 0 deletions(-)
 create mode 100644 poppler/StdinCacheLoader.cc
 create mode 100644 poppler/StdinCacheLoader.h

diff --git a/CMakeLists.txt b/CMakeLists.txt
index c2f0176..85bb98f 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -226,6 +226,7 @@ set(poppler_SRCS
   poppler/TextOutputDev.cc
   poppler/PageLabelInfo.cc
   poppler/SecurityHandler.cc
+  poppler/StdinCacheLoader.cc
   poppler/Sound.cc
   poppler/XpdfPluginAPI.cc
   poppler/Movie.cc
@@ -364,6 +365,7 @@ if(ENABLE_XPDF_HEADERS)
     poppler/PSOutputDev.h
     poppler/TextOutputDev.h
     poppler/SecurityHandler.h
+    poppler/StdinCacheLoader.h
     poppler/UTF8.h
     poppler/XpdfPluginAPI.h
     poppler/Sound.h
diff --git a/poppler/Makefile.am b/poppler/Makefile.am
index b082778..699d084 100644
--- a/poppler/Makefile.am
+++ b/poppler/Makefile.am
@@ -204,6 +204,7 @@ poppler_include_HEADERS =	\
 	ProfileData.h		\
 	PreScanOutputDev.h	\
 	PSTokenizer.h		\
+	StdinCacheLoader.h	\
 	Stream-CCITT.h		\
 	Stream.h		\
 	UnicodeMap.h		\
@@ -275,6 +276,7 @@ libpoppler_la_SOURCES =		\
 	ProfileData.cc		\
 	PreScanOutputDev.cc \
 	PSTokenizer.cc		\
+	StdinCacheLoader.cc	\
 	Stream.cc 		\
 	UnicodeMap.cc		\
 	UnicodeTypeTable.cc	\
diff --git a/poppler/StdinCacheLoader.cc b/poppler/StdinCacheLoader.cc
new file mode 100644
index 0000000..fb58613
--- /dev/null
+++ b/poppler/StdinCacheLoader.cc
@@ -0,0 +1,34 @@
+//========================================================================
+//
+// StdinCacheLoader.cc
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#include <config.h>
+
+#include "StdinCacheLoader.h"
+
+#include <stdio.h>
+
+size_t
+StdinCacheLoader::init(GooString *dummy, CachedFile *cachedFile)
+{
+  size_t read, size = 0;
+  char buf[1024];
+
+  do {
+    read = fread(buf, 1, 1024, stdin);
+    size += (cachedFile->write) (buf, read, size);
+  }
+  while (read == 1024);
+
+  return size;
+}
+
+void StdinCacheLoader::load(size_t fromByte, size_t toByte) {}
+
+
diff --git a/poppler/StdinCacheLoader.h b/poppler/StdinCacheLoader.h
new file mode 100644
index 0000000..3f07ba3
--- /dev/null
+++ b/poppler/StdinCacheLoader.h
@@ -0,0 +1,26 @@
+//========================================================================
+//
+// StdinCacheLoader.h
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#ifndef STDINCACHELOADER_H
+#define STDINCACHELOADER_H
+
+#include "CachedFile.h"
+
+class StdinCacheLoader : public CacheLoader {
+
+public:
+
+  size_t init(GooString *dummy, CachedFile* cachedFile);
+  void load(size_t fromByte, size_t toByte);
+
+};
+
+#endif
+
-- 
1.6.3.3


From 6ded7f729ab41a1819144ff2366003a1840fb395 Mon Sep 17 00:00:00 2001
From: Hib Eris <hib at hiberis.nl>
Date: Wed, 24 Feb 2010 14:46:59 +0100
Subject: [PATCH 03/11] Use cached files to read from stdin in pdfinfo

This fixes reading from stdin.
---
 utils/pdfinfo.cc |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/utils/pdfinfo.cc b/utils/pdfinfo.cc
index bfbe0b3..b56a8d8 100644
--- a/utils/pdfinfo.cc
+++ b/utils/pdfinfo.cc
@@ -15,6 +15,7 @@
 //
 // Copyright (C) 2006 Dom Lachowicz <cinamod at hotmail.com>
 // Copyright (C) 2007-2009 Albert Astals Cid <aacid at kde.org>
+// Copyright (C) 2010 Hib Eris <hib at hiberis.nl>
 //
 // To see a description of the changes please see the Changelog file that
 // came with your tarball or type make ChangeLog if you are building from git
@@ -47,6 +48,7 @@
 #include "PDFDocEncoding.h"
 #include "Error.h"
 #include "DateInfo.h"
+#include "StdinCacheLoader.h"
 
 static void printInfoString(Dict *infoDict, char *key, char *text,
 			    UnicodeMap *uMap);
@@ -164,7 +166,10 @@ int main(int argc, char *argv[]) {
       Object obj;
 
       obj.initNull();
-      doc = new PDFDoc(new FileStream(stdin, 0, gFalse, 0, &obj), ownerPW, userPW);
+      CachedFile *cachedFile = new CachedFile(new StdinCacheLoader(), NULL);
+      doc = new PDFDoc(new CachedFileStream(cachedFile, 0, gFalse, 0, &obj),
+                       ownerPW, userPW);
+      delete fileName;
   }
 
   if (userPW) {
-- 
1.6.3.3


From cb47414f4042402eb93605b1f677a6705415110b Mon Sep 17 00:00:00 2001
From: Hib Eris <hib at hiberis.nl>
Date: Tue, 23 Feb 2010 02:29:26 +0100
Subject: [PATCH 04/11] Add HTTP support using libcurl

With libcurl, poppler can handle documents over http.
---
 CMakeLists.txt              |   18 ++++++++
 config.h.cmake              |    6 +++
 configure.ac                |   20 ++++++++
 poppler-config.h.cmake      |    5 ++
 poppler/CurlCacheLoader.cc  |  101 +++++++++++++++++++++++++++++++++++++++++++
 poppler/CurlCacheLoader.h   |   39 ++++++++++++++++
 poppler/Makefile.am         |   20 ++++++++
 poppler/poppler-config.h.in |    5 ++
 8 files changed, 214 insertions(+), 0 deletions(-)
 create mode 100644 poppler/CurlCacheLoader.cc
 create mode 100644 poppler/CurlCacheLoader.h

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 85bb98f..5b149c5 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -60,6 +60,7 @@ set(CAIRO_VERSION "1.8.4")
 macro_bool_to_01(ENABLE_SPLASH HAVE_SPLASH)
 find_package(Freetype REQUIRED)
 find_package(Fontconfig REQUIRED)
+macro_optional_find_package(CURL)
 macro_optional_find_package(JPEG)
 macro_optional_find_package(PNG)
 if(JPEG_FOUND)
@@ -121,6 +122,11 @@ endif(CMAKE_USE_PTHREADS_INIT)
 if(ENABLE_ZLIB)
   include_directories(${ZLIB_INCLUDE_DIR})
 endif(ENABLE_ZLIB)
+if(CURL_FOUND)
+  include_directories(${CURL_INCLUDE_DIR})
+  set(ENABLE_LIBCURL ON)
+  set(POPPLER_HAS_CURL_SUPPORT ON)
+endif(CURL_FOUND)
 if(JPEG_FOUND)
   include_directories(${JPEG_INCLUDE_DIR})
   set(ENABLE_LIBJPEG ON)
@@ -268,6 +274,12 @@ if(ENABLE_ZLIB)
   )
   set(poppler_LIBS ${poppler_LIBS} ${ZLIB_LIBRARIES})
 endif(ENABLE_ZLIB)
+if(CURL_FOUND)
+  set(poppler_SRCS ${poppler_SRCS}
+    poppler/CurlCacheLoader.cc
+  )
+  set(poppler_LIBS ${poppler_LIBS} ${CURL_LIBRARIES})
+endif(CURL_FOUND)
 if(LIBOPENJPEG_FOUND)
   set(poppler_SRCS ${poppler_SRCS}
     poppler/JPEG2000Stream.cc
@@ -397,6 +409,11 @@ if(ENABLE_XPDF_HEADERS)
     fofi/FoFiType1.h
     fofi/FoFiType1C.h
     DESTINATION include/poppler/fofi)
+  if(CURL_FOUND)
+    install(FILES
+      poppler/CurlCacheLoader.h
+      DESTINATION include/poppler)
+  endif(CURL_FOUND)
   if(LIBOPENJPEG_FOUND)
     install(FILES
       poppler/JPEG2000Stream.h
@@ -505,6 +522,7 @@ show_end_message("cpp wrapper" ENABLE_CPP)
 show_end_message("use libjpeg" ENABLE_LIBJPEG)
 show_end_message("use libpng" ENABLE_LIBPNG)
 show_end_message("use zlib" ENABLE_ZLIB)
+show_end_message("use curl" ENABLE_LIBCURL)
 show_end_message("use libopenjpeg" LIBOPENJPEG_FOUND)
 show_end_message("use cms" USE_CMS)
 show_end_message("command line utils" ENABLE_UTILS)
diff --git a/config.h.cmake b/config.h.cmake
index 1253549..527cdf6 100644
--- a/config.h.cmake
+++ b/config.h.cmake
@@ -1,5 +1,8 @@
 /* config.h.  Generated from config.h.cmake by cmake.  */
 
+/* Build against libcurl. */
+#cmakedefine ENABLE_LIBCURL 1
+
 /* Use libjpeg instead of builtin jpeg decoder. */
 #cmakedefine ENABLE_LIBJPEG 1
 
@@ -153,6 +156,9 @@
 /* Poppler data dir */
 #define POPPLER_DATADIR "${CMAKE_INSTALL_PREFIX}/share/poppler"
 
+/* Support for curl based doc builder is compiled in. */
+#cmakedefine POPPLER_HAS_CURL_SUPPORT 1
+
 /* Have GDK */
 #cmakedefine POPPLER_WITH_GDK 1
 
diff --git a/configure.ac b/configure.ac
index 1f67c71..934be76 100644
--- a/configure.ac
+++ b/configure.ac
@@ -210,6 +210,25 @@ AM_CONDITIONAL(BUILD_ZLIB, test x$enable_zlib = xyes)
 AH_TEMPLATE([ENABLE_ZLIB],
 	    [Use zlib instead of builtin zlib decoder.])
 
+dnl Test for libcurl
+AC_ARG_ENABLE(libcurl,
+	      AC_HELP_STRING([--disable-libcurl],
+	                     [Do not build against libcurl.]),
+              enable_libcurl=$enableval,
+              enable_libcurl="try")
+
+if test x$enable_libcurl != xno; then
+  PKG_CHECK_MODULES(LIBCURL, libcurl, [enable_libcurl="yes"],
+      [enable_libcurl="no"])
+fi
+
+if test x$enable_libcurl = xyes; then
+  AC_DEFINE(ENABLE_LIBCURL, 1, [Build against libcurl.])
+  AC_DEFINE(POPPLER_HAS_CURL_SUPPORT, 1,
+     [Support for curl based doc builder is compiled in.])
+fi
+
+AM_CONDITIONAL(BUILD_LIBCURL, test x$enable_libcurl = xyes)
 
 dnl Test for libjpeg
 AC_ARG_ENABLE(libjpeg,
@@ -651,6 +670,7 @@ echo "  use gtk-doc:        $enable_gtk_doc"
 echo "  use libjpeg:        $enable_libjpeg"
 echo "  use libpng:         $enable_libpng"
 echo "  use zlib:           $enable_zlib"
+echo "  use libcurl:        $enable_libcurl"
 echo "  use libopenjpeg:    $enable_libopenjpeg"
 echo "  use cms:            $enable_cms"
 echo "  command line utils: $enable_utils"
diff --git a/poppler-config.h.cmake b/poppler-config.h.cmake
index 5122a4e..e5012f9 100644
--- a/poppler-config.h.cmake
+++ b/poppler-config.h.cmake
@@ -39,6 +39,11 @@
 #cmakedefine TEXTOUT_WORD_LIST 1
 #endif
 
+/* Support for curl is compiled in. */
+#ifndef POPPLER_HAS_CURL_SUPPORT
+#cmakedefine POPPLER_HAS_CURL_SUPPORT 1
+#endif
+
 // Also, there's a couple of preprocessor symbols in the header files
 // that are used but never defined: DISABLE_OUTLINE, DEBUG_MEM and
 
diff --git a/poppler/CurlCacheLoader.cc b/poppler/CurlCacheLoader.cc
new file mode 100644
index 0000000..15600ab
--- /dev/null
+++ b/poppler/CurlCacheLoader.cc
@@ -0,0 +1,101 @@
+//========================================================================
+//
+// CurlCacheLoader.cc
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2009 Stefan Thomas <thomas at eload24.com>
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#include <config.h>
+
+#include "CurlCacheLoader.h"
+
+#include "Error.h"
+#include <curl/curl.h>
+#include <stdio.h>
+#include <string.h>
+
+//------------------------------------------------------------------------
+
+CurlCacheLoader::CurlCacheLoader()
+{
+  _url = NULL;
+  _cachedFile = NULL;
+  _curl = NULL;
+}
+
+CurlCacheLoader::~CurlCacheLoader() {
+  curl_easy_cleanup(_curl);
+}
+
+static size_t
+noop_cb(char *ptr, size_t size, size_t nmemb, void *ptr2)
+{
+  return size*nmemb;
+}
+
+size_t
+CurlCacheLoader::init(GooString *url, CachedFile *cachedFile)
+{
+  long code = 0;
+  double contentLength = -1;
+  size_t size;
+
+  _url = url;
+  _cachedFile = cachedFile;
+  _curl = curl_easy_init();
+
+  curl_easy_setopt(_curl, CURLOPT_URL, _url->getCString());
+  curl_easy_setopt(_curl, CURLOPT_HEADER, 1);
+  curl_easy_setopt(_curl, CURLOPT_NOBODY, 1);
+  curl_easy_setopt(_curl, CURLOPT_WRITEFUNCTION, &noop_cb);
+  curl_easy_perform(_curl);
+  curl_easy_getinfo(_curl, CURLINFO_RESPONSE_CODE, &code);
+  curl_easy_getinfo(_curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD, &contentLength);
+  curl_easy_reset(_curl);
+
+  size = contentLength;
+
+  return size;
+}
+
+typedef struct {
+  CachedFile *cachedFile;
+  size_t fromByte;
+} CURL_WRITE_DATA;
+
+static size_t
+write_cb(char *ptr, size_t size, size_t nmemb, CURL_WRITE_DATA *write_data)
+{
+   size_t written;
+
+   written = (write_data->cachedFile->write)
+                  (ptr, size*nmemb, write_data->fromByte);
+   write_data->fromByte += written;
+
+   return written;
+}
+
+void CurlCacheLoader::load(size_t fromByte, size_t toByte)
+{
+  GooString *range = GooString::format("{0:ud}-{1:ud}", fromByte, toByte);
+
+//printf("Range: %s\n", range->getCString());
+
+  CURL_WRITE_DATA write_data;
+  write_data.cachedFile = _cachedFile;
+  write_data.fromByte = fromByte;
+
+  curl_easy_setopt(_curl, CURLOPT_URL, _url->getCString());
+  curl_easy_setopt(_curl, CURLOPT_WRITEFUNCTION, &write_cb);
+  curl_easy_setopt(_curl, CURLOPT_WRITEDATA, &write_data);
+  curl_easy_setopt(_curl, CURLOPT_RANGE, range->getCString());
+  curl_easy_perform(_curl);
+  curl_easy_reset(_curl);
+}
+
+//------------------------------------------------------------------------
+
diff --git a/poppler/CurlCacheLoader.h b/poppler/CurlCacheLoader.h
new file mode 100644
index 0000000..2b02467
--- /dev/null
+++ b/poppler/CurlCacheLoader.h
@@ -0,0 +1,39 @@
+//========================================================================
+//
+// CurlCacheLoader.h
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#ifndef CURLCACHELOADER_H
+#define CURLCACHELOADER_H
+
+#include "poppler-config.h"
+
+#include "goo/GooString.h"
+#include "CachedFile.h"
+
+#include <curl/curl.h>
+
+class CurlCacheLoader : public CacheLoader {
+
+public:
+
+  CurlCacheLoader();
+  ~CurlCacheLoader();
+  size_t init(GooString *url, CachedFile* cachedFile);
+  void load(size_t fromByte, size_t toByte);
+
+private:
+
+  GooString *_url;
+  CachedFile *_cachedFile;
+  CURL *_curl;
+
+};
+
+#endif
+
diff --git a/poppler/Makefile.am b/poppler/Makefile.am
index 699d084..8246db9 100644
--- a/poppler/Makefile.am
+++ b/poppler/Makefile.am
@@ -102,6 +102,22 @@ zlib_libs = 					\
 
 endif
 
+if BUILD_LIBCURL
+
+libcurl_libs =					\
+	$(LIBCURL_LIBS)
+
+libcurl_includes =				\
+	$(LIBCURL_CFLAGS)
+
+curl_headers =					\
+	CurlCacheLoader.h
+
+curl_sources =					\
+	CurlCacheLoader.cc
+
+endif
+
 if BUILD_ABIWORD_OUTPUT
 
 abiword_sources =				\
@@ -130,6 +146,7 @@ INCLUDES =					\
 	$(arthur_includes)			\
 	$(abiword_includes)			\
 	$(libpng_includes)			\
+	$(libcurl_includes)			\
 	$(FREETYPE_CFLAGS)			\
 	$(FONTCONFIG_CFLAGS)
 
@@ -149,6 +166,7 @@ libpoppler_la_LIBADD =				\
 	$(libjpeg_libs)				\
 	$(libpng_libs)				\
 	$(zlib_libs)				\
+	$(libcurl_libs)				\
 	$(libjpeg2000_libs)			\
 	$(abiword_libs)				\
 	$(FREETYPE_LIBS)			\
@@ -163,6 +181,7 @@ if ENABLE_XPDF_HEADERS
 poppler_includedir = $(includedir)/poppler
 poppler_include_HEADERS =	\
 	$(splash_headers)	\
+	$(curl_headers)		\
 	Annot.h			\
 	Array.h			\
 	BuiltinFont.h		\
@@ -236,6 +255,7 @@ libpoppler_la_SOURCES =		\
 	$(zlib_sources)		\
 	$(libjpeg2000_sources)	\
 	$(abiword_sources)	\
+	$(curl_sources)		\
 	Annot.cc		\
 	Array.cc 		\
 	BuiltinFont.cc		\
diff --git a/poppler/poppler-config.h.in b/poppler/poppler-config.h.in
index f8db4ba..7b0644c 100644
--- a/poppler/poppler-config.h.in
+++ b/poppler/poppler-config.h.in
@@ -49,6 +49,11 @@
 #undef WITH_FONTCONFIGURATION_WIN32
 #endif
 
+/* Support for curl is compiled in. */
+#ifndef POPPLER_HAS_CURL_SUPPORT
+#undef POPPLER_HAS_CURL_SUPPORT
+#endif
+
 // Also, there's a couple of preprocessor symbols in the header files
 // that are used but never defined: DISABLE_OUTLINE, DEBUG_MEM and
 
-- 
1.6.3.3


From 9eadca68d98f20c65942c6233a9e6b274edfeb32 Mon Sep 17 00:00:00 2001
From: Hib Eris <hib at hiberis.nl>
Date: Wed, 24 Feb 2010 15:45:44 +0100
Subject: [PATCH 05/11] Let pdfinfo read documents over HTTP

---
 utils/pdfinfo.cc |   16 +++++++++++++++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/utils/pdfinfo.cc b/utils/pdfinfo.cc
index b56a8d8..18ecf0a 100644
--- a/utils/pdfinfo.cc
+++ b/utils/pdfinfo.cc
@@ -49,6 +49,9 @@
 #include "Error.h"
 #include "DateInfo.h"
 #include "StdinCacheLoader.h"
+#if ENABLE_LIBCURL
+#include "CurlCacheLoader.h"
+#endif
 
 static void printInfoString(Dict *infoDict, char *key, char *text,
 			    UnicodeMap *uMap);
@@ -160,7 +163,18 @@ int main(int argc, char *argv[]) {
     userPW = NULL;
   }
 
-  if(fileName->cmp("-") != 0) {
+#if ENABLE_LIBCURL
+  if (fileName->cmpN("http://", 7) == 0 ||
+           fileName->cmpN("https://", 8) == 0) {
+      Object obj;
+
+      obj.initNull();
+      CachedFile *cachedFile = new CachedFile(new CurlCacheLoader(), fileName);
+      doc = new PDFDoc(new CachedFileStream(cachedFile, 0, gFalse, 0, &obj),
+                       ownerPW, userPW);
+  } else
+#endif
+  if (fileName->cmp("-") != 0) {
       doc = new PDFDoc(fileName, ownerPW, userPW);
   } else {
       Object obj;
-- 
1.6.3.3


From fd2ddb968d7fcedefb94d85f227cf87dc2b60d2d Mon Sep 17 00:00:00 2001
From: Hib Eris <hib at hiberis.nl>
Date: Tue, 23 Feb 2010 02:07:28 +0100
Subject: [PATCH 06/11] Add PDFDocBuilder

---
 CMakeLists.txt           |    2 +
 poppler/Makefile.am      |    2 +
 poppler/PDFDoc.cc        |   16 ++++++++++++++
 poppler/PDFDoc.h         |    3 ++
 poppler/PDFDocBuilder.cc |   48 ++++++++++++++++++++++++++++++++++++++++++++
 poppler/PDFDocBuilder.h  |   50 ++++++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 121 insertions(+), 0 deletions(-)
 create mode 100644 poppler/PDFDocBuilder.cc
 create mode 100644 poppler/PDFDocBuilder.h
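
Reviewer note (not part of the commit): the decorator below forwards every
request to the builder it wraps, and returns an error document when there is
nothing left to forward to. A minimal standalone sketch of that fallback
follows. It assumes simplified stand-in types: Doc, Builder, Decorator and
AcceptAll are hypothetical names for illustration, not poppler classes, and
std::string replaces GooString.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for PDFDoc: records only whether opening succeeded.
struct Doc {
  bool ok;
  std::string source;
};

// Stand-in for the PDFDocBuilder interface.
class Builder {
public:
  virtual ~Builder() = default;
  virtual Doc build(const std::string &uri) = 0;
};

// Stand-in for PDFDocBuilderDecorator: owns the next builder in the chain
// and forwards to it. With no next builder the URI cannot be handled.
class Decorator : public Builder {
public:
  explicit Decorator(std::unique_ptr<Builder> next) : next_(std::move(next)) {}

  Doc build(const std::string &uri) override {
    if (!next_) {
      // Corresponds to returning a PDFDoc that carries errOpenFile.
      return Doc{false, uri};
    }
    return next_->build(uri);
  }

private:
  std::unique_ptr<Builder> next_;
};

// Trivial handler that accepts every URI, used only to show forwarding.
class AcceptAll : public Builder {
public:
  Doc build(const std::string &uri) override { return Doc{true, uri}; }
};
```

In this sketch a Decorator built with a null pointer ends the chain, which is
the role LocalPDFDocBuilder and StdinPDFDocBuilder play in the later patches
when they pass NULL to their decorator base class.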

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 5b149c5..356f00d 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -219,6 +219,7 @@ set(poppler_SRCS
   poppler/PageTransition.cc
   poppler/Parser.cc
   poppler/PDFDoc.cc
+  poppler/PDFDocBuilder.cc
   poppler/PDFDocEncoding.cc
   poppler/PopplerCache.cc
   poppler/ProfileData.cc
@@ -356,6 +357,7 @@ if(ENABLE_XPDF_HEADERS)
     poppler/PageTransition.h
     poppler/Parser.h
     poppler/PDFDoc.h
+    poppler/PDFDocBuilder.h
     poppler/PDFDocEncoding.h
     poppler/PopplerCache.h
     poppler/ProfileData.h
diff --git a/poppler/Makefile.am b/poppler/Makefile.am
index 8246db9..4fc9c9d 100644
--- a/poppler/Makefile.am
+++ b/poppler/Makefile.am
@@ -218,6 +218,7 @@ poppler_include_HEADERS =	\
 	PageTransition.h	\
 	Parser.h		\
 	PDFDoc.h		\
+	PDFDocBuilder.h		\
 	PDFDocEncoding.h	\
 	PopplerCache.h		\
 	ProfileData.h		\
@@ -291,6 +292,7 @@ libpoppler_la_SOURCES =		\
 	PageTransition.cc	\
 	Parser.cc 		\
 	PDFDoc.cc 		\
+	PDFDocBuilder.cc	\
 	PDFDocEncoding.cc	\
 	PopplerCache.cc		\
 	ProfileData.cc		\
diff --git a/poppler/PDFDoc.cc b/poppler/PDFDoc.cc
index b088f6c..2293754 100644
--- a/poppler/PDFDoc.cc
+++ b/poppler/PDFDoc.cc
@@ -21,6 +21,7 @@
 // Copyright (C) 2009 Eric Toombs <ewtoombs at uwaterloo.ca>
 // Copyright (C) 2009 Kovid Goyal <kovid at kovidgoyal.net>
 // Copyright (C) 2009 Axel Struebing <axel.struebing at freenet.de>
+// Copyright (C) 2010 Hib Eris <hib at hiberis.nl>
 //
 // To see a description of the changes please see the Changelog file that
 // came with your tarball or type make ChangeLog if you are building from git
@@ -73,6 +74,21 @@
 // PDFDoc
 //------------------------------------------------------------------------
 
+PDFDoc::PDFDoc(GooString *fileNameA, int errorCode)
+{
+  file = NULL;
+  str = NULL;
+  xref = NULL;
+  catalog = NULL;
+#ifndef DISABLE_OUTLINE
+  outline = NULL;
+#endif
+
+  ok = gFalse;
+  fileName = fileNameA;
+  errCode = errorCode;
+}
+
 PDFDoc::PDFDoc(GooString *fileNameA, GooString *ownerPassword,
 	       GooString *userPassword, void *guiDataA) {
   Object obj;
diff --git a/poppler/PDFDoc.h b/poppler/PDFDoc.h
index 3db4b91..41a421c 100644
--- a/poppler/PDFDoc.h
+++ b/poppler/PDFDoc.h
@@ -20,6 +20,7 @@
 // Copyright (C) 2008 Carlos Garcia Campos <carlosgc at gnome.org>
 // Copyright (C) 2009 Eric Toombs <ewtoombs at uwaterloo.ca>
 // Copyright (C) 2009 Kovid Goyal <kovid at kovidgoyal.net>
+// Copyright (C) 2010 Hib Eris <hib at hiberis.nl>
 //
 // To see a description of the changes please see the Changelog file that
 // came with your tarball or type make ChangeLog if you are building from git
@@ -61,6 +62,8 @@ enum PDFWriteMode {
 class PDFDoc {
 public:
 
+  PDFDoc(GooString *fileNameA, int errorCode);
+
   PDFDoc(GooString *fileNameA, GooString *ownerPassword = NULL,
 	 GooString *userPassword = NULL, void *guiDataA = NULL);
 
diff --git a/poppler/PDFDocBuilder.cc b/poppler/PDFDocBuilder.cc
new file mode 100644
index 0000000..4d8e970
--- /dev/null
+++ b/poppler/PDFDocBuilder.cc
@@ -0,0 +1,48 @@
+//========================================================================
+//
+// PDFDocBuilder.cc
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#include <config.h>
+
+#include "PDFDocBuilder.h"
+#include "ErrorCodes.h"
+
+//------------------------------------------------------------------------
+// PDFDocBuilderDecorator
+//------------------------------------------------------------------------
+
+PDFDocBuilderDecorator::PDFDocBuilderDecorator(PDFDocBuilder *pdfDocBuilder)
+{
+  _pdfDocBuilder = pdfDocBuilder;
+}
+
+PDFDocBuilderDecorator::~PDFDocBuilderDecorator()
+{
+  if (_pdfDocBuilder)
+  {
+    delete _pdfDocBuilder;
+  }
+}
+
+PDFDoc *
+PDFDocBuilderDecorator::BuildPDFDoc(GooString* uri, GooString *ownerPassword,
+                                    GooString *userPassword, void *guiDataA)
+{
+  if (!_pdfDocBuilder) {
+    error(-1, "Cannot handle URI '%s'.", uri->getCString());
+    GooString *fileName = new GooString(uri);
+    return new PDFDoc(fileName, errOpenFile);
+  }
+
+  return _pdfDocBuilder->BuildPDFDoc(
+                            uri, ownerPassword, userPassword, guiDataA);
+}
+
+
+
diff --git a/poppler/PDFDocBuilder.h b/poppler/PDFDocBuilder.h
new file mode 100644
index 0000000..e435639
--- /dev/null
+++ b/poppler/PDFDocBuilder.h
@@ -0,0 +1,50 @@
+//========================================================================
+//
+// PDFDocBuilder.h
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#ifndef PDFDOCBUILDER_H
+#define PDFDOCBUILDER_H
+
+#include "PDFDoc.h"
+#include "goo/GooString.h"
+
+//------------------------------------------------------------------------
+// PDFDocBuilder
+//------------------------------------------------------------------------
+
+class PDFDocBuilder {
+
+public:
+
+  virtual ~PDFDocBuilder() {}
+  virtual PDFDoc *BuildPDFDoc(GooString* uri, GooString *ownerPassword = NULL,
+      GooString *userPassword = NULL, void *guiDataA = NULL) = 0;
+
+};
+
+//------------------------------------------------------------------------
+// PDFDocBuilderDecorator
+//------------------------------------------------------------------------
+
+class PDFDocBuilderDecorator : public PDFDocBuilder {
+
+public:
+
+  PDFDocBuilderDecorator(PDFDocBuilder* pdfDocBuilder);
+  virtual ~PDFDocBuilderDecorator();
+  virtual PDFDoc *BuildPDFDoc(GooString* uri, GooString *ownerPassword = NULL,
+      GooString *userPassword = NULL, void *guiDataA = NULL);
+
+private:
+
+  PDFDocBuilder *_pdfDocBuilder;
+
+};
+
+#endif /* PDFDOCBUILDER_H */
-- 
1.6.3.3


From b464b1b70275e222b602a980207e4cf06070e063 Mon Sep 17 00:00:00 2001
From: Hib Eris <hib at hiberis.nl>
Date: Tue, 23 Feb 2010 16:49:43 +0100
Subject: [PATCH 07/11] Add LocalPDFDocBuilder and StdinPDFDocBuilder

---
 CMakeLists.txt                |    4 +++
 poppler/LocalPDFDocBuilder.cc |   44 ++++++++++++++++++++++++++++++++++++++
 poppler/LocalPDFDocBuilder.h  |   42 ++++++++++++++++++++++++++++++++++++
 poppler/Makefile.am           |    4 +++
 poppler/StdinPDFDocBuilder.cc |   47 +++++++++++++++++++++++++++++++++++++++++
 poppler/StdinPDFDocBuilder.h  |   42 ++++++++++++++++++++++++++++++++++++
 6 files changed, 183 insertions(+), 0 deletions(-)
 create mode 100644 poppler/LocalPDFDocBuilder.cc
 create mode 100644 poppler/LocalPDFDocBuilder.h
 create mode 100644 poppler/StdinPDFDocBuilder.cc
 create mode 100644 poppler/StdinPDFDocBuilder.h
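
Reviewer note (not part of the commit): the local builder claims a URI in two
cases, a "file://" prefix (which is stripped) or no "://" at all (a plain file
name). Anything else is delegated. The rule can be sketched on its own as
below. localPathFromUri is a hypothetical helper written for this note, with
std::string in place of GooString.

```cpp
#include <string>

// Decides whether the local builder would claim the URI.
// On success, stores the file system path in `path` and returns true.
// Returns false when the URI names another scheme and must be delegated.
bool localPathFromUri(const std::string &uri, std::string &path) {
  const std::string prefix = "file://";
  if (uri.compare(0, prefix.size(), prefix) == 0) {
    path = uri.substr(prefix.size());  // strip the scheme, keep the path
    return true;
  }
  if (uri.find("://") == std::string::npos) {
    path = uri;                        // plain file name without a scheme
    return true;
  }
  return false;                        // for example http:// or fd://
}
```

Note that the sketch, like the patch, does not decode percent-escapes or handle
a host part in "file://host/path" URIs.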

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 356f00d..8a559b8 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -210,6 +210,7 @@ set(poppler_SRCS
   poppler/JBIG2Stream.cc
   poppler/Lexer.cc
   poppler/Link.cc
+  poppler/LocalPDFDocBuilder.cc
   poppler/NameToCharCode.cc
   poppler/Object.cc
   poppler/OptionalContent.cc
@@ -234,6 +235,7 @@ set(poppler_SRCS
   poppler/PageLabelInfo.cc
   poppler/SecurityHandler.cc
   poppler/StdinCacheLoader.cc
+  poppler/StdinPDFDocBuilder.cc
   poppler/Sound.cc
   poppler/XpdfPluginAPI.cc
   poppler/Movie.cc
@@ -347,6 +349,7 @@ if(ENABLE_XPDF_HEADERS)
     poppler/JBIG2Stream.h
     poppler/Lexer.h
     poppler/Link.h
+    poppler/LocalPDFDocBuilder.h
     poppler/Movie.h
     poppler/NameToCharCode.h
     poppler/Object.h
@@ -380,6 +383,7 @@ if(ENABLE_XPDF_HEADERS)
     poppler/TextOutputDev.h
     poppler/SecurityHandler.h
     poppler/StdinCacheLoader.h
+    poppler/StdinPDFDocBuilder.h
     poppler/UTF8.h
     poppler/XpdfPluginAPI.h
     poppler/Sound.h
diff --git a/poppler/LocalPDFDocBuilder.cc b/poppler/LocalPDFDocBuilder.cc
new file mode 100644
index 0000000..d62d74b
--- /dev/null
+++ b/poppler/LocalPDFDocBuilder.cc
@@ -0,0 +1,44 @@
+//========================================================================
+//
+// LocalPDFDocBuilder.cc
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#include <config.h>
+
+#include "LocalPDFDocBuilder.h"
+
+//------------------------------------------------------------------------
+// LocalPDFDocBuilderDecorator
+//------------------------------------------------------------------------
+
+LocalPDFDocBuilderDecorator::LocalPDFDocBuilderDecorator(PDFDocBuilder *pdfDocBuilder) : PDFDocBuilderDecorator(pdfDocBuilder) {}
+
+PDFDoc *
+LocalPDFDocBuilderDecorator::BuildPDFDoc(
+    GooString* uri, GooString *ownerPassword, GooString *userPassword, void *guiDataA)
+{
+  if (uri->cmpN("file://", 7) == 0) {
+     GooString *fileName = new GooString(uri);
+     fileName->del(0, 7);
+     return new PDFDoc(fileName, ownerPassword, userPassword, guiDataA);
+  } else if (!strstr(uri->getCString(), "://")) {
+     GooString *fileName = new GooString(uri);
+     return new PDFDoc(fileName, ownerPassword, userPassword, guiDataA);
+  } else {
+     return PDFDocBuilderDecorator::BuildPDFDoc(
+                     uri, ownerPassword, userPassword, guiDataA);
+  }
+}
+
+//------------------------------------------------------------------------
+// LocalPDFDocBuilder
+//------------------------------------------------------------------------
+
+LocalPDFDocBuilder::LocalPDFDocBuilder()
+  : LocalPDFDocBuilderDecorator(NULL) {}
+
diff --git a/poppler/LocalPDFDocBuilder.h b/poppler/LocalPDFDocBuilder.h
new file mode 100644
index 0000000..bc1a264
--- /dev/null
+++ b/poppler/LocalPDFDocBuilder.h
@@ -0,0 +1,42 @@
+//========================================================================
+//
+// LocalPDFDocBuilder.h
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#ifndef LOCALPDFDOCBUILDER_H
+#define LOCALPDFDOCBUILDER_H
+
+#include "PDFDocBuilder.h"
+
+//------------------------------------------------------------------------
+// LocalPDFDocBuilderDecorator
+//------------------------------------------------------------------------
+
+class LocalPDFDocBuilderDecorator : public PDFDocBuilderDecorator {
+
+public:
+
+  LocalPDFDocBuilderDecorator(PDFDocBuilder*);
+  PDFDoc *BuildPDFDoc(GooString* uri, GooString *ownerPassword = NULL,
+      GooString *userPassword = NULL, void *guiDataA = NULL);
+
+};
+
+//------------------------------------------------------------------------
+// LocalPDFDocBuilder
+//------------------------------------------------------------------------
+
+class LocalPDFDocBuilder : public LocalPDFDocBuilderDecorator {
+
+public:
+
+  LocalPDFDocBuilder();
+
+};
+
+#endif /* LOCALPDFDOCBUILDER_H */
diff --git a/poppler/Makefile.am b/poppler/Makefile.am
index 4fc9c9d..2a25b48 100644
--- a/poppler/Makefile.am
+++ b/poppler/Makefile.am
@@ -208,6 +208,7 @@ poppler_include_HEADERS =	\
 	JBIG2Stream.h		\
 	Lexer.h			\
 	Link.h			\
+	LocalPDFDocBuilder.h	\
 	Movie.h                 \
 	NameToCharCode.h	\
 	Object.h		\
@@ -225,6 +226,7 @@ poppler_include_HEADERS =	\
 	PreScanOutputDev.h	\
 	PSTokenizer.h		\
 	StdinCacheLoader.h	\
+	StdinPDFDocBuilder.h	\
 	Stream-CCITT.h		\
 	Stream.h		\
 	UnicodeMap.h		\
@@ -282,6 +284,7 @@ libpoppler_la_SOURCES =		\
 	JBIG2Stream.cc		\
 	Lexer.cc 		\
 	Link.cc 		\
+	LocalPDFDocBuilder.cc	\
 	Movie.cc                \
 	NameToCharCode.cc	\
 	Object.cc 		\
@@ -299,6 +302,7 @@ libpoppler_la_SOURCES =		\
 	PreScanOutputDev.cc \
 	PSTokenizer.cc		\
 	StdinCacheLoader.cc	\
+	StdinPDFDocBuilder.cc	\
 	Stream.cc 		\
 	UnicodeMap.cc		\
 	UnicodeTypeTable.cc	\
diff --git a/poppler/StdinPDFDocBuilder.cc b/poppler/StdinPDFDocBuilder.cc
new file mode 100644
index 0000000..9b21b9a
--- /dev/null
+++ b/poppler/StdinPDFDocBuilder.cc
@@ -0,0 +1,47 @@
+//========================================================================
+//
+// StdinPDFDocBuilder.cc
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#include <config.h>
+
+#include "StdinPDFDocBuilder.h"
+#include "CachedFile.h"
+#include "StdinCacheLoader.h"
+
+//------------------------------------------------------------------------
+// StdinPDFDocBuilderDecorator
+//------------------------------------------------------------------------
+
+StdinPDFDocBuilderDecorator::StdinPDFDocBuilderDecorator(PDFDocBuilder *pdfDocBuilder) : PDFDocBuilderDecorator(pdfDocBuilder) {}
+
+PDFDoc *
+StdinPDFDocBuilderDecorator::BuildPDFDoc(GooString* uri, GooString *ownerPassword,
+                                    GooString *userPassword, void *guiDataA)
+{
+  if (uri->cmpN("fd://0", 6) == 0) {
+    Object obj;
+
+    obj.initNull();
+    CachedFile *cachedFile = new CachedFile(new StdinCacheLoader(), NULL);
+    return new PDFDoc(new CachedFileStream(cachedFile, 0, gFalse, 0, &obj),
+                      ownerPassword, userPassword, guiDataA);
+  } else {
+     return PDFDocBuilderDecorator::BuildPDFDoc(
+                     uri, ownerPassword, userPassword, guiDataA);
+  }
+}
+
+//------------------------------------------------------------------------
+// StdinPDFDocBuilder
+//------------------------------------------------------------------------
+
+StdinPDFDocBuilder::StdinPDFDocBuilder()
+  : StdinPDFDocBuilderDecorator(NULL) {}
+
+
diff --git a/poppler/StdinPDFDocBuilder.h b/poppler/StdinPDFDocBuilder.h
new file mode 100644
index 0000000..ffa7fef
--- /dev/null
+++ b/poppler/StdinPDFDocBuilder.h
@@ -0,0 +1,42 @@
+//========================================================================
+//
+// StdinPDFDocBuilder.h
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#ifndef STDINPDFDOCBUILDER_H
+#define STDINPDFDOCBUILDER_H
+
+#include "PDFDocBuilder.h"
+
+//------------------------------------------------------------------------
+// StdinPDFDocBuilderDecorator
+//------------------------------------------------------------------------
+
+class StdinPDFDocBuilderDecorator : public PDFDocBuilderDecorator {
+
+public:
+
+  StdinPDFDocBuilderDecorator(PDFDocBuilder*);
+  PDFDoc *BuildPDFDoc(GooString* uri, GooString *ownerPassword = NULL,
+      GooString *userPassword = NULL, void *guiDataA = NULL);
+
+};
+
+//------------------------------------------------------------------------
+// StdinPDFDocBuilder
+//------------------------------------------------------------------------
+
+class StdinPDFDocBuilder : public StdinPDFDocBuilderDecorator {
+
+public:
+
+    StdinPDFDocBuilder();
+
+};
+
+#endif /* STDINPDFDOCBUILDER_H */
-- 
1.6.3.3


From b91a7527da65bee686661561cfd03f255e5e245f Mon Sep 17 00:00:00 2001
From: Hib Eris <hib at hiberis.nl>
Date: Wed, 24 Feb 2010 11:18:36 +0100
Subject: [PATCH 08/11] Add PopplerPDFDocBuilder

---
 CMakeLists.txt                  |    2 ++
 poppler/Makefile.am             |    2 ++
 poppler/PopplerPDFDocBuilder.cc |   27 +++++++++++++++++++++++++++
 poppler/PopplerPDFDocBuilder.h  |   29 +++++++++++++++++++++++++++++
 4 files changed, 60 insertions(+), 0 deletions(-)
 create mode 100644 poppler/PopplerPDFDocBuilder.cc
 create mode 100644 poppler/PopplerPDFDocBuilder.h

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 8a559b8..bffd64c 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -223,6 +223,7 @@ set(poppler_SRCS
   poppler/PDFDocBuilder.cc
   poppler/PDFDocEncoding.cc
   poppler/PopplerCache.cc
+  poppler/PopplerPDFDocBuilder.cc
   poppler/ProfileData.cc
   poppler/PreScanOutputDev.cc
   poppler/PSTokenizer.cc
@@ -363,6 +364,7 @@ if(ENABLE_XPDF_HEADERS)
     poppler/PDFDocBuilder.h
     poppler/PDFDocEncoding.h
     poppler/PopplerCache.h
+    poppler/PopplerPDFDocBuilder.h
     poppler/ProfileData.h
     poppler/PreScanOutputDev.h
     poppler/PSTokenizer.h
diff --git a/poppler/Makefile.am b/poppler/Makefile.am
index 2a25b48..493e03a 100644
--- a/poppler/Makefile.am
+++ b/poppler/Makefile.am
@@ -222,6 +222,7 @@ poppler_include_HEADERS =	\
 	PDFDocBuilder.h		\
 	PDFDocEncoding.h	\
 	PopplerCache.h		\
+	PopplerPDFDocBuilder.h	\
 	ProfileData.h		\
 	PreScanOutputDev.h	\
 	PSTokenizer.h		\
@@ -298,6 +299,7 @@ libpoppler_la_SOURCES =		\
 	PDFDocBuilder.cc	\
 	PDFDocEncoding.cc	\
 	PopplerCache.cc		\
+	PopplerPDFDocBuilder.cc	\
 	ProfileData.cc		\
 	PreScanOutputDev.cc \
 	PSTokenizer.cc		\
diff --git a/poppler/PopplerPDFDocBuilder.cc b/poppler/PopplerPDFDocBuilder.cc
new file mode 100644
index 0000000..20e8cb3
--- /dev/null
+++ b/poppler/PopplerPDFDocBuilder.cc
@@ -0,0 +1,27 @@
+//========================================================================
+//
+// PopplerPDFDocBuilder.cc
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#include <config.h>
+
+#include "PopplerPDFDocBuilder.h"
+
+#include "StdinPDFDocBuilder.h"
+#include "LocalPDFDocBuilder.h"
+
+//------------------------------------------------------------------------
+// PopplerPDFDocBuilder
+//------------------------------------------------------------------------
+
+PopplerPDFDocBuilder::PopplerPDFDocBuilder() :
+  StdinPDFDocBuilderDecorator(
+    new LocalPDFDocBuilder()
+  )
+{}
+
diff --git a/poppler/PopplerPDFDocBuilder.h b/poppler/PopplerPDFDocBuilder.h
new file mode 100644
index 0000000..a4a8a86
--- /dev/null
+++ b/poppler/PopplerPDFDocBuilder.h
@@ -0,0 +1,29 @@
+//========================================================================
+//
+// PopplerPDFDocBuilder.h
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#ifndef POPPLERPDFDOCBUILDER_H
+#define POPPLERPDFDOCBUILDER_H
+
+#include "PDFDocBuilder.h"
+#include "StdinPDFDocBuilder.h"
+
+//------------------------------------------------------------------------
+// PopplerPDFDocBuilder
+//------------------------------------------------------------------------
+
+class PopplerPDFDocBuilder : public StdinPDFDocBuilderDecorator {
+
+public:
+
+  PopplerPDFDocBuilder();
+
+};
+
+#endif /* POPPLERPDFDOCBUILDER_H */
-- 
1.6.3.3


From 413bf78001d3573acb33f7cf5e2b630a03eb624a Mon Sep 17 00:00:00 2001
From: Hib Eris <hib at hiberis.nl>
Date: Wed, 24 Feb 2010 15:24:26 +0100
Subject: [PATCH 09/11] Add CurlPDFDocBuilder

---
 CMakeLists.txt                  |    2 +
 poppler/CurlPDFDocBuilder.cc    |   52 +++++++++++++++++++++++++++++++++++++++
 poppler/CurlPDFDocBuilder.h     |   43 ++++++++++++++++++++++++++++++++
 poppler/Makefile.am             |    6 +++-
 poppler/PopplerPDFDocBuilder.cc |    9 ++++++
 5 files changed, 110 insertions(+), 2 deletions(-)
 create mode 100644 poppler/CurlPDFDocBuilder.cc
 create mode 100644 poppler/CurlPDFDocBuilder.h
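
Reviewer note (not part of the commit): with this patch the default chain
becomes stdin, then curl (only when libcurl support is compiled in), then
local files. The sketch below models which link claims a given URI.
handlerFor and its haveCurl flag are hypothetical, introduced only for this
note. In the patch the choice is made at compile time via ENABLE_LIBCURL, not
by a runtime flag.

```cpp
#include <string>

static bool startsWith(const std::string &s, const std::string &prefix) {
  return s.compare(0, prefix.size(), prefix) == 0;
}

// Names the link of the chain that would claim the URI, checked in the
// same order as the nested decorators: stdin, curl, local.
std::string handlerFor(const std::string &uri, bool haveCurl) {
  if (startsWith(uri, "fd://0")) {
    return "stdin";
  }
  if (haveCurl &&
      (startsWith(uri, "http://") || startsWith(uri, "https://"))) {
    return "curl";
  }
  if (startsWith(uri, "file://") || uri.find("://") == std::string::npos) {
    return "local";
  }
  return "unhandled";  // end of chain: the builder reports an open error
}
```

Without curl support an "http://" URI falls through every link and ends up as
an error document, which matches the behaviour of the decorator base class.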

diff --git a/CMakeLists.txt b/CMakeLists.txt
index bffd64c..860fb1b 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -281,6 +281,7 @@ endif(ENABLE_ZLIB)
 if(CURL_FOUND)
   set(poppler_SRCS ${poppler_SRCS}
     poppler/CurlCacheLoader.cc
+    poppler/CurlPDFDocBuilder.cc
   )
   set(poppler_LIBS ${poppler_LIBS} ${CURL_LIBRARIES})
 endif(CURL_FOUND)
@@ -420,6 +421,7 @@ if(ENABLE_XPDF_HEADERS)
   if(LIBCURL_FOUND)
     install(FILES
       poppler/CurlCacheLoader.h
+      poppler/CurlPDFDocBuilder.h
       DESTIONATION include/poppler)
   endif(LIBCURL_FOUND)
   if(LIBOPENJPEG_FOUND)
diff --git a/poppler/CurlPDFDocBuilder.cc b/poppler/CurlPDFDocBuilder.cc
new file mode 100644
index 0000000..c6a4bae
--- /dev/null
+++ b/poppler/CurlPDFDocBuilder.cc
@@ -0,0 +1,52 @@
+//========================================================================
+//
+// CurlPDFDocBuilder.cc
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#include <config.h>
+
+#include "CurlPDFDocBuilder.h"
+
+#include "CachedFile.h"
+#include "CurlCacheLoader.h"
+
+//------------------------------------------------------------------------
+// CurlPDFDocBuilderDecorator
+//------------------------------------------------------------------------
+
+CurlPDFDocBuilderDecorator::CurlPDFDocBuilderDecorator(PDFDocBuilder *pdfDocBuilder)
+  : PDFDocBuilderDecorator(pdfDocBuilder) {}
+
+PDFDoc *
+CurlPDFDocBuilderDecorator::BuildPDFDoc(GooString* uri,
+        GooString *ownerPassword, GooString *userPassword, void *guiDataA)
+{
+  if (uri->cmpN("http://", 7) == 0 || uri->cmpN("https://", 8) == 0) {
+    Object obj;
+
+    CachedFile *cachedFile = new CachedFile(
+        new CurlCacheLoader(), new GooString(uri));
+
+    // create stream
+    obj.initNull();
+
+    BaseStream *str = new CachedFileStream(cachedFile, 0, gFalse, 0, &obj);
+    return new PDFDoc(str, ownerPassword, userPassword, guiDataA);
+  } else {
+    return PDFDocBuilderDecorator::BuildPDFDoc(uri, ownerPassword, userPassword, guiDataA);
+  }
+}
+
+//------------------------------------------------------------------------
+// CurlPDFDocBuilder
+//------------------------------------------------------------------------
+
+CurlPDFDocBuilder::CurlPDFDocBuilder()
+  : CurlPDFDocBuilderDecorator(NULL) {}
+
+
diff --git a/poppler/CurlPDFDocBuilder.h b/poppler/CurlPDFDocBuilder.h
new file mode 100644
index 0000000..e441794
--- /dev/null
+++ b/poppler/CurlPDFDocBuilder.h
@@ -0,0 +1,43 @@
+//========================================================================
+//
+// CurlPDFDocBuilder.h
+//
+// This file is licensed under the GPLv2 or later
+//
+// Copyright 2010 Hib Eris <hib at hiberis.nl>
+//
+//========================================================================
+
+#ifndef CURLPDFDOCBUILDER_H
+#define CURLPDFDOCBUILDER_H
+
+#include "PDFDocBuilder.h"
+#include "goo/GooString.h"
+
+//------------------------------------------------------------------------
+// CurlPDFDocBuilderDecorator
+//------------------------------------------------------------------------
+
+class CurlPDFDocBuilderDecorator : public PDFDocBuilderDecorator {
+
+public:
+
+  CurlPDFDocBuilderDecorator(PDFDocBuilder *pdfDocBuilder);
+  virtual PDFDoc *BuildPDFDoc(GooString* uri, GooString *ownerPassword = NULL,
+      GooString *userPassword = NULL, void *guiDataA = NULL);
+
+};
+
+//------------------------------------------------------------------------
+// CurlPDFDocBuilder
+//------------------------------------------------------------------------
+
+class CurlPDFDocBuilder : public CurlPDFDocBuilderDecorator {
+
+public:
+
+  CurlPDFDocBuilder();
+
+};
+
+#endif /* CURLPDFDOCBUILDER_H */
diff --git a/poppler/Makefile.am b/poppler/Makefile.am
index 493e03a..1e8a2dc 100644
--- a/poppler/Makefile.am
+++ b/poppler/Makefile.am
@@ -111,10 +111,12 @@ libcurl_includes =				\
 	$(LIBCURL_CFLAGS)
 
 curl_headers =					\
-	CurlCacheLoader.h
+	CurlCacheLoader.h			\
+	CurlPDFDocBuilder.h
 
 curl_sources =					\
-	CurlCacheLoader.cc
+	CurlCacheLoader.cc			\
+	CurlPDFDocBuilder.cc
 
 endif
 
diff --git a/poppler/PopplerPDFDocBuilder.cc b/poppler/PopplerPDFDocBuilder.cc
index 20e8cb3..c6e3698 100644
--- a/poppler/PopplerPDFDocBuilder.cc
+++ b/poppler/PopplerPDFDocBuilder.cc
@@ -14,6 +14,9 @@
 
 #include "StdinPDFDocBuilder.h"
 #include "LocalPDFDocBuilder.h"
+#if ENABLE_LIBCURL
+#include "CurlPDFDocBuilder.h"
+#endif
 
 //------------------------------------------------------------------------
 // PopplerPDFDocBuilder
@@ -21,7 +24,13 @@
 
 PopplerPDFDocBuilder::PopplerPDFDocBuilder() :
   StdinPDFDocBuilderDecorator(
+#if ENABLE_LIBCURL
+  new CurlPDFDocBuilderDecorator(
+#endif
     new LocalPDFDocBuilder()
   )
+#if ENABLE_LIBCURL
+  )
+#endif
 {}
 
-- 
1.6.3.3


From fa4fa5d85f8db0480b990732e60345254624b213 Mon Sep 17 00:00:00 2001
From: Hib Eris <hib at hiberis.nl>
Date: Wed, 24 Feb 2010 21:15:40 +0100
Subject: [PATCH 10/11] Use PDFDocBuilder in utils

---
 utils/pdffonts.cc  |   16 ++++++++--------
 utils/pdfimages.cc |   11 ++++++++++-
 utils/pdfinfo.cc   |   30 ++++++------------------------
 utils/pdftoabw.cc  |   10 +++++++++-
 utils/pdftohtml.cc |   10 +++++++++-
 utils/pdftoppm.cc  |   15 +++++++++------
 utils/pdftops.cc   |    9 ++++++++-
 utils/pdftotext.cc |   14 +++++++-------
 8 files changed, 66 insertions(+), 49 deletions(-)
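
Reviewer note (not part of the commit): every util now applies the same small
step before calling the builder, mapping the conventional "-" argument to the
"fd://0" URI. A standalone sketch of that step follows. inputUriFromArg is a
hypothetical helper, and treating a missing argument (null pointer) like "-"
is an assumption of this sketch, modelled on the old pdftoppm behaviour.

```cpp
#include <string>

// Maps a command-line file argument to the URI handed to the builder.
// "-" means standard input. A null pointer (no argument given) is treated
// the same way in this sketch.
std::string inputUriFromArg(const char *arg) {
  if (arg == nullptr || std::string(arg) == "-") {
    return "fd://0";
  }
  return arg;  // file names and other URIs pass through unchanged
}
```

A shared helper along these lines would remove the repeated block from the
eight utils, at the cost of adding one more symbol to the library or to a
common utils source file.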

diff --git a/utils/pdffonts.cc b/utils/pdffonts.cc
index 752fa15..fe6970b 100644
--- a/utils/pdffonts.cc
+++ b/utils/pdffonts.cc
@@ -15,6 +15,7 @@
 //
 // Copyright (C) 2006 Dominic Lachowicz <cinamod at hotmail.com>
 // Copyright (C) 2007-2008 Albert Astals Cid <aacid at kde.org>
+// Copyright (C) 2010 Hib Eris <hib at hiberis.nl>
 //
 // To see a description of the changes please see the Changelog file that
 // came with your tarball or type make ChangeLog if you are building from git
@@ -38,6 +39,7 @@
 #include "GfxFont.h"
 #include "Annot.h"
 #include "PDFDoc.h"
+#include "PopplerPDFDocBuilder.h"
 
 static char *fontTypeNames[] = {
   "unknown",
@@ -131,16 +133,14 @@ int main(int argc, char *argv[]) {
   } else {
     userPW = NULL;
   }
-
-  if(fileName->cmp("-") != 0) {
-      doc = new PDFDoc(fileName, ownerPW, userPW);
-  } else {
-      Object obj;
-
-      obj.initNull();
-      doc = new PDFDoc(new FileStream(stdin, 0, gFalse, 0, &obj), ownerPW, userPW);
+  if (fileName->cmp("-") == 0) {
+      delete fileName;
+      fileName = new GooString("fd://0");
   }
 
+  doc = PopplerPDFDocBuilder().BuildPDFDoc(fileName, ownerPW, userPW);
+  delete fileName;
+
   if (userPW) {
     delete userPW;
   }
diff --git a/utils/pdfimages.cc b/utils/pdfimages.cc
index b821c79..51bcc79 100644
--- a/utils/pdfimages.cc
+++ b/utils/pdfimages.cc
@@ -16,6 +16,7 @@
 // under GPL version 2 or later
 //
 // Copyright (C) 2007-2008 Albert Astals Cid <aacid at kde.org>
+// Copyright (C) 2010 Hib Eris <hib at hiberis.nl>
 //
 // To see a description of the changes please see the Changelog file that
 // came with your tarball or type make ChangeLog if you are building from git
@@ -42,6 +43,7 @@
 #include "PDFDoc.h"
 #include "ImageOutputDev.h"
 #include "Error.h"
+#include "PopplerPDFDocBuilder.h"
 
 static int firstPage = 1;
 static int lastPage = 0;
@@ -120,7 +122,14 @@ int main(int argc, char *argv[]) {
   } else {
     userPW = NULL;
   }
-  doc = new PDFDoc(fileName, ownerPW, userPW);
+  if (fileName->cmp("-") == 0) {
+      delete fileName;
+      fileName = new GooString("fd://0");
+  }
+
+  doc = PopplerPDFDocBuilder().BuildPDFDoc(fileName, ownerPW, userPW);
+  delete fileName;
+
   if (userPW) {
     delete userPW;
   }
diff --git a/utils/pdfinfo.cc b/utils/pdfinfo.cc
index 18ecf0a..5b667b4 100644
--- a/utils/pdfinfo.cc
+++ b/utils/pdfinfo.cc
@@ -48,10 +48,7 @@
 #include "PDFDocEncoding.h"
 #include "Error.h"
 #include "DateInfo.h"
-#include "StdinCacheLoader.h"
-#if ENABLE_LIBCURL
-#include "CurlCacheLoader.h"
-#endif
+#include "PopplerPDFDocBuilder.h"
 
 static void printInfoString(Dict *infoDict, char *key, char *text,
 			    UnicodeMap *uMap);
@@ -163,29 +160,14 @@ int main(int argc, char *argv[]) {
     userPW = NULL;
   }
 
-#if ENABLE_LIBCURL
-  if (fileName->cmpN("http://", 7) == 0 ||
-           fileName->cmpN("https://", 8) == 0) {
-      Object obj;
-
-      obj.initNull();
-      CachedFile *cachedFile = new CachedFile(new CurlCacheLoader(), fileName);
-      doc = new PDFDoc(new CachedFileStream(cachedFile, 0, gFalse, 0, &obj),
-                       ownerPW, userPW);
-  } else
-#endif
-  if (fileName->cmp("-") != 0) {
-      doc = new PDFDoc(fileName, ownerPW, userPW);
-  } else {
-      Object obj;
-
-      obj.initNull();
-      CachedFile *cachedFile = new CachedFile(new StdinCacheLoader(), NULL);
-      doc = new PDFDoc(new CachedFileStream(cachedFile, 0, gFalse, 0, &obj),
-                       ownerPW, userPW);
+  if (fileName->cmp("-") == 0) {
       delete fileName;
+      fileName = new GooString("fd://0");
   }
 
+  doc = PopplerPDFDocBuilder().BuildPDFDoc(fileName, ownerPW, userPW);
+  delete fileName;
+
   if (userPW) {
     delete userPW;
   }
diff --git a/utils/pdftoabw.cc b/utils/pdftoabw.cc
index 9c71c76..4c0287c 100644
--- a/utils/pdftoabw.cc
+++ b/utils/pdftoabw.cc
@@ -4,6 +4,7 @@
  * Copyright (C) 2007 Kouhei Sutou <kou at cozmixng.org>
  * Copyright (C) 2009 Jakub Wilk <ubanus at users.sf.net>
  * Copyright (C) 2009 Albert Astals Cid <aacid at kde.org>
+ * Copyright (C) 2010 Hib Eris <hib at hiberis.nl>
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -48,6 +49,7 @@
 #include "goo/gfile.h"
 #include <libxml/parser.h>
 #include <libxml/tree.h>
+#include "PopplerPDFDocBuilder.h"
 
 static int firstPage = 1;
 static int lastPage = 0;
@@ -136,7 +138,13 @@ int main(int argc, char *argv[]) {
     userPW = NULL;
   }
 
-  doc = new PDFDoc(fileName, ownerPW, userPW);
+  if (fileName->cmp("-") == 0) {
+      delete fileName;
+      fileName = new GooString("fd://0");
+  }
+
+  doc = PopplerPDFDocBuilder().BuildPDFDoc(fileName, ownerPW, userPW);
+  delete fileName;
 
   if (userPW) {
     delete userPW;
diff --git a/utils/pdftohtml.cc b/utils/pdftohtml.cc
index 41312de..16eaa74 100644
--- a/utils/pdftohtml.cc
+++ b/utils/pdftohtml.cc
@@ -14,6 +14,7 @@
 // under GPL version 2 or later
 //
 // Copyright (C) 2007-2008 Albert Astals Cid <aacid at kde.org>
+// Copyright (C) 2010 Hib Eris <hib at hiberis.nl>
 //
 // To see a description of the changes please see the Changelog file that
 // came with your tarball or type make ChangeLog if you are building from git
@@ -47,6 +48,7 @@
 #include "Error.h"
 #include "DateInfo.h"
 #include "goo/gfile.h"
+#include "PopplerPDFDocBuilder.h"
 
 #ifndef GHOSTSCRIPT
 # define GHOSTSCRIPT "gs"
@@ -187,7 +189,13 @@ int main(int argc, char *argv[]) {
 
   fileName = new GooString(argv[1]);
 
-  doc = new PDFDoc(fileName, ownerPW, userPW);
+  if (fileName->cmp("-") == 0) {
+      delete fileName;
+      fileName = new GooString("fd://0");
+  }
+
+  doc = PopplerPDFDocBuilder().BuildPDFDoc(fileName, ownerPW, userPW);
+
   if (userPW) {
     delete userPW;
   }
diff --git a/utils/pdftoppm.cc b/utils/pdftoppm.cc
index 5b318be..d2970ae 100644
--- a/utils/pdftoppm.cc
+++ b/utils/pdftoppm.cc
@@ -20,6 +20,7 @@
 // Copyright (C) 2009 Stefan Thomas <thomas at eload24.com>
 // Copyright (C) 2009, 2010 Albert Astals Cid <aacid at kde.org>
 // Copyright (C) 2010 Adrian Johnson <ajohnson at redneon.com>
+// Copyright (C) 2010 Hib Eris <hib at hiberis.nl>
 //
 // To see a description of the changes please see the Changelog file that
 // came with your tarball or type make ChangeLog if you are building from git
@@ -39,6 +40,7 @@
 #include "splash/SplashBitmap.h"
 #include "splash/Splash.h"
 #include "SplashOutputDev.h"
+#include "PopplerPDFDocBuilder.h"
 
 #define PPM_FILE_SZ 512
 
@@ -250,14 +252,15 @@ int main(int argc, char *argv[]) {
   } else {
     userPW = NULL;
   }
-  if(fileName != NULL && fileName->cmp("-") != 0) {
-      doc = new PDFDoc(fileName, ownerPW, userPW);
-  } else {
-      Object obj;
 
-      obj.initNull();
-      doc = new PDFDoc(new FileStream(stdin, 0, gFalse, 0, &obj), ownerPW, userPW);
+  if (fileName == NULL || fileName->cmp("-") == 0) {
+      delete fileName;
+      fileName = new GooString("fd://0");
   }
+
+  doc = PopplerPDFDocBuilder().BuildPDFDoc(fileName, ownerPW, userPW);
+  delete fileName;
+
   if (userPW) {
     delete userPW;
   }
diff --git a/utils/pdftops.cc b/utils/pdftops.cc
index 69d5c32..96b7d88 100644
--- a/utils/pdftops.cc
+++ b/utils/pdftops.cc
@@ -46,6 +46,7 @@
 #include "PDFDoc.h"
 #include "PSOutputDev.h"
 #include "Error.h"
+#include "PopplerPDFDocBuilder.h"
 
 static GBool setPSPaperSize(char *size, int &psPaperWidth, int &psPaperHeight) {
   if (!strcmp(size, "match")) {
@@ -299,7 +300,13 @@ int main(int argc, char *argv[]) {
   } else {
     userPW = NULL;
   }
-  doc = new PDFDoc(fileName, ownerPW, userPW);
+  if (fileName->cmp("-") == 0) {
+      delete fileName;
+      fileName = new GooString("fd://0");
+  }
+
+  doc = PopplerPDFDocBuilder().BuildPDFDoc(fileName, ownerPW, userPW);
+
   if (userPW) {
     delete userPW;
   }
diff --git a/utils/pdftotext.cc b/utils/pdftotext.cc
index 4ebda19..563c77e 100644
--- a/utils/pdftotext.cc
+++ b/utils/pdftotext.cc
@@ -18,6 +18,7 @@
 // Copyright (C) 2006 Dominic Lachowicz <cinamod at hotmail.com>
 // Copyright (C) 2007-2008 Albert Astals Cid <aacid at kde.org>
 // Copyright (C) 2009 Jan Jockusch <jan at jockusch.de>
+// Copyright (C) 2010 Hib Eris <hib at hiberis.nl>
 //
 // To see a description of the changes please see the Changelog file that
 // came with your tarball or type make ChangeLog if you are building from git
@@ -47,6 +48,7 @@
 #include "CharTypes.h"
 #include "UnicodeMap.h"
 #include "Error.h"
+#include "PopplerPDFDocBuilder.h"
 
 static void printInfoString(FILE *f, Dict *infoDict, char *key,
 			    char *text1, char *text2, UnicodeMap *uMap);
@@ -192,15 +194,13 @@ int main(int argc, char *argv[]) {
     userPW = NULL;
   }
 
-  if(fileName->cmp("-") != 0) {
-      doc = new PDFDoc(fileName, ownerPW, userPW);
-  } else {
-      Object obj;
-
-      obj.initNull();
-      doc = new PDFDoc(new FileStream(stdin, 0, gFalse, 0, &obj), ownerPW, userPW);
+  if (fileName->cmp("-") == 0) {
+      delete fileName;
+      fileName = new GooString("fd://0");
   }
 
+  doc = PopplerPDFDocBuilder().BuildPDFDoc(fileName, ownerPW, userPW);
+
   if (userPW) {
     delete userPW;
   }
-- 
1.6.3.3
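
Each utility touched by the patch above repeats the same three lines to turn
a "-" argument into "fd://0" before calling the builder. A minimal sketch of
that mapping as one function follows. It assumes std::string instead of
GooString so that it compiles on its own, and normalizeInputName is a
hypothetical name that does not exist in poppler or in this patch series.

```cpp
#include <string>

// Hypothetical helper, not part of poppler or of this patch series.
// Maps the conventional "-" argument to the "fd://0" name that the
// utilities in the patch pass to the builder. Any other name is
// returned unchanged.
std::string normalizeInputName(const std::string &name) {
  if (name == "-") {
    return "fd://0";
  }
  return name;
}
```

A caller would pass argv[1] through this function once and hand the result
to the builder, instead of repeating the compare/delete/new sequence in
every main().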


From 006f7caa12044fea193b82e8725f88b48e68cb5f Mon Sep 17 00:00:00 2001
From: Hib Eris <hib at hiberis.nl>
Date: Wed, 24 Feb 2010 20:27:29 +0100
Subject: [PATCH 11/11] Initialize variable in TextOutputDev

---
 poppler/TextOutputDev.cc |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/poppler/TextOutputDev.cc b/poppler/TextOutputDev.cc
index 3b16622..9963d55 100644
--- a/poppler/TextOutputDev.cc
+++ b/poppler/TextOutputDev.cc
@@ -4626,6 +4626,7 @@ TextOutputDev::TextOutputDev(char *fileName, GBool physLayoutA,
   rawOrder = rawOrderA;
   doHTML = gFalse;
   ok = gTrue;
+  actualText = NULL;
 
   // open file
   needClose = gFalse;
-- 
1.6.3.3
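
The one-line change above sets a pointer member to NULL in a constructor.
The sketch below shows the general pattern under stated assumptions: it
assumes the member is later deleted or compared against NULL, which the
diff itself does not show. OutputDevSketch and ActualTextStub are
hypothetical stand-ins and are not the real TextOutputDev types.

```cpp
#include <cstddef>

// Hypothetical stand-in for whatever the pointer refers to.
struct ActualTextStub {
  int dummy;
};

// Hypothetical stand-in for an output device with a raw pointer member.
class OutputDevSketch {
public:
  // Every constructor has to give the pointer a defined value;
  // otherwise the destructor below would delete an indeterminate address.
  explicit OutputDevSketch(const char *fileName)
    : ok(true), actualText(NULL) {
    (void)fileName; // unused in this sketch
  }

  // Deleting a null pointer is a no-op, so this is safe after the
  // initialization above.
  ~OutputDevSketch() { delete actualText; }

  bool isOk() const { return ok; }
  bool hasActualText() const { return actualText != NULL; }

private:
  // Copying is disabled because the class owns a raw pointer.
  OutputDevSketch(const OutputDevSketch &);
  OutputDevSketch &operator=(const OutputDevSketch &);

  bool ok;
  ActualTextStub *actualText;
};
```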

