[poppler] -bbox option in pdftotext
Albert Astals Cid
aacid at kde.org
Sat Feb 19 03:21:39 PST 2011
On Friday, 18 February 2011, Tom Gleason wrote:
> Hi,
>
> The -bbox option is great, but there are two problems that I've noticed.
>
> First, the <body> and <html> tags aren't closed, which causes a problem
> when parsing the XML:
>
> --- a/utils/pdftotext.cc
> +++ b/utils/pdftotext.cc
> @@ -361,6 +361,8 @@ int main(int argc, char *argv[]) {
> }
> fprintf(f, "</doc>\n");
> }
> + fprintf(f, "</body>\n");
> + fprintf(f, "</html>\n");
> fclose(f);
> delete textOut;
> } else {
This is not the correct fix. I've added the correct fix to the git repo, and it
will be released with poppler 0.16.3.
Albert
>
>
> Second, though the program outputs the data fine, I get a segmentation
> fault. I'm not a C programmer, so I'm not sure how to debug this:
>
> tom@tom:~/Desktop$ pdftotext -bbox thrift-20070401.pdf
> Segmentation fault
>
>
> Hope that helps.
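
For anyone wanting to confirm the unclosed-tag problem on their own output, here is a rough sketch of a balance check. It is not part of poppler and is not the fix that went into git; the function names are made up for illustration, and it assumes the -bbox output has already been read into a string. It only counts opening and closing tags for the wrapper elements mentioned in this thread (html, body, doc), so it is no substitute for a real XML parser.

```cpp
#include <string>

// Hypothetical helper (not from poppler): counts non-overlapping
// occurrences of `needle` in `text`.
static int countOccurrences(const std::string &text, const std::string &needle) {
    int n = 0;
    for (std::string::size_type pos = text.find(needle);
         pos != std::string::npos;
         pos = text.find(needle, pos + needle.size())) {
        ++n;
    }
    return n;
}

// Hypothetical check (not from poppler): returns true when each wrapper
// element is opened and closed the same number of times.
// "<html" is matched without the trailing '>' so that an opening tag
// carrying attributes is still counted.
bool wrapperTagsBalanced(const std::string &xml) {
    const char *names[] = {"html", "body", "doc"};
    for (const char *name : names) {
        const std::string open = std::string("<") + name;
        const std::string close = std::string("</") + name + ">";
        if (countOccurrences(xml, open) != countOccurrences(xml, close)) {
            return false;
        }
    }
    return true;
}
```

Output that ends right after the closing doc tag, as described above, would make wrapperTagsBalanced return false, while output with the body and html elements closed would return true.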