[poppler] Could not parse charref for nameToUnicode errors
jonathan_kew at sil.org
Wed Dec 19 06:11:07 PST 2007
On 19 Dec 2007, at 12:06 pm, Adrian Johnson wrote:
> I've created a test file to test the patch.
> The numbers "1", "2", and "3" are mapped to the text "test", "text",
> and "the". The "Z" has the glyph name "g1", so it should be ignored
> when extracting text.
> I have found a bug in the code. With the test file I get:
> $ pdftotext test.pdf -
> Error: Could not parse charref for nameToUnicode: g1
> This is = test of text extr=?tion using the glyph n=mes
> The output should be:
> This is a test of text extraction using the glyph names
> It looks like the glyph names "u00061" and "u0063" are not decoded.
To be more specific, it looks as though the names are being
interpreted as decimal rather than hexadecimal.
Could it be that some implementations of sscanf require an 0x prefix
to scan hex, and otherwise treat the value as decimal?
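
For reference, here is a minimal sketch of the two conversions in standard C.
It is not poppler's actual code; the helper names scan_decimal and scan_hex are
made up for illustration, and both take only the digits that follow the "u" in
the glyph name. In standard C, %x accepts the digits with or without a 0x
prefix, while %d always reads decimal.

```c
#include <stdio.h>

/* Hypothetical helper: read the digits with %d, i.e. as decimal.
 * Returns -1 if nothing could be scanned. */
static long scan_decimal(const char *digits)
{
    int value = 0;
    if (sscanf(digits, "%d", &value) != 1)
        return -1;
    return value;
}

/* Hypothetical helper: read the digits with %x, i.e. as hexadecimal.
 * Standard C treats a leading "0x" as optional for %x.
 * Returns -1 if nothing could be scanned. */
static long scan_hex(const char *digits)
{
    unsigned int value = 0;
    if (sscanf(digits, "%x", &value) != 1)
        return -1;
    return (long)value;
}
```

With these, scan_decimal("0061") gives 61, which is "=", and
scan_decimal("0063") gives 63, which is "?", matching the "=" and "?" that
replace "a" and "c" in the output above. scan_hex("0061") gives 0x61, which is
"a". So a decimal conversion somewhere in the patch would be enough to explain
the result, independent of any 0x prefix handling.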