[poppler] -bbox option in pdftotext

Albert Astals Cid aacid at kde.org
Sat Feb 19 03:21:39 PST 2011


On Friday, 18 February 2011, Tom Gleason wrote:
> Hi,
> 
> The -bbox option is great, but there are two problems that I've noticed.
> 
> First, the <body> and <html> tags aren't closed, which causes a problem
> when parsing the XML:
> 
> --- a/utils/pdftotext.cc
> +++ b/utils/pdftotext.cc
> @@ -361,6 +361,8 @@ int main(int argc, char *argv[]) {
>        }
>        fprintf(f, "</doc>\n");
>      }
> +    fprintf(f, "</body>\n");
> +    fprintf(f, "</html>\n");
>      fclose(f);
>      delete textOut;
>    } else {

This is not the correct fix. I've added the correct fix to the git repo, and it
will be released with poppler 0.16.3.

Albert

> 
> 
> Second, though the program outputs the data fine, I get a segmentation
> fault. I'm not a C programmer, so I'm not sure how to debug this.
> 
> tom@tom:~/Desktop$ pdftotext -bbox thrift-20070401.pdf
> Segmentation fault
> 
> 
> Hope that helps.
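
To illustrate the first problem reported above (unclosed <body> and <html> tags breaking XML parsing), here is a minimal sketch of a tag-balance check in C++. It is not poppler code and not the fix that went into git; the function name tagsBalanced is hypothetical, and the check is deliberately simplified (attributes are ignored; comments and CDATA sections are not handled).

```cpp
#include <stack>
#include <string>

// Hypothetical helper, not part of poppler: returns true when every opening
// tag in `xml` is closed by a matching closing tag in the right order.
// Simplified on purpose: comments and CDATA sections are not handled.
bool tagsBalanced(const std::string &xml) {
    std::stack<std::string> open;
    std::size_t pos = 0;
    while ((pos = xml.find('<', pos)) != std::string::npos) {
        std::size_t end = xml.find('>', pos);
        if (end == std::string::npos) return false;    // truncated tag
        std::string tag = xml.substr(pos + 1, end - pos - 1);
        pos = end + 1;
        if (tag.empty()) return false;                  // "<>" is malformed
        if (tag[0] == '!' || tag[0] == '?') continue;   // DOCTYPE or XML declaration
        if (tag.back() == '/') continue;                // self-closing element
        if (tag[0] == '/') {
            // Closing tag: it must match the most recently opened element.
            if (open.empty() || open.top() != tag.substr(1)) return false;
            open.pop();
        } else {
            // Opening tag: keep only the element name, dropping attributes.
            open.push(tag.substr(0, tag.find_first_of(" \t\r\n")));
        }
    }
    return open.empty();  // leftover entries are elements that were never closed
}
```

For output that stops after </doc>, the stack still holds html and body when the input ends, so the function returns false. A strict XML parser rejects such a document for the same reason.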

