[poppler] [PATCH] Fixup LaTeX composed characters

Tim Brody tdb2 at ecs.soton.ac.uk
Mon Mar 28 02:56:00 PDT 2011


On Fri, 2011-03-25 at 20:43 +0000, Albert Astals Cid wrote:
> On Friday, 25 March 2011, you wrote:
> > On Fri, 25 Mar 2011 19:02:46 +0000, Albert Astals Cid <aacid at kde.org>
> > wrote:
> > > On Friday, 25 March 2011, Tim Brody wrote:
> > >> Hi All,
> > >> 
> > >> Attached is a patch to address the previous problem I wrote about with
> > >> pdflatex-produced PDFs that contain overlapping-diacritics/accents.

> > > Hmmm, is it supposed to just kill the diacritic mark?
> > > 
> > > R. L¨wen and B. Polster
> > > o
> > > gets converted to
> > > R. Lowen and B. Polster
> > > shouldn't it be
> > > R. Löwen and B. Polster
> > > ?
> > 
> > It should do - can you send me this PDF?
> 
> http://www.maths.mq.edu.au/~ross/5019-e-cmap.pdf

This PDF emits each pair as [combining character][character to combine
with], i.e. the accent comes before its base letter.

I've added combining characters to the equiv-mapping table, which appears
to work for this PDF:

"R. Löwen and B. Polster"
"Institut für Analysis und Algebra ..."
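
For anyone following along without the patch, the idea can be sketched
roughly as below. This is only an illustration in Python, not the code in
the attached patch: the table contents and the names SPACING_TO_COMBINING
and compose_tex_accents are made up for the example. It assumes the accent
arrives as a spacing character immediately before its base letter, swaps
the pair into base + combining mark, and then lets NFC normalisation
produce the precomposed character.

```python
import unicodedata

# Hypothetical table (not poppler's): spacing accent -> Unicode combining mark.
SPACING_TO_COMBINING = {
    "\u00A8": "\u0308",  # DIAERESIS -> COMBINING DIAERESIS
    "\u00B4": "\u0301",  # ACUTE ACCENT -> COMBINING ACUTE ACCENT
    "\u02C6": "\u0302",  # MODIFIER LETTER CIRCUMFLEX ACCENT -> COMBINING CIRCUMFLEX ACCENT
    "\u02DC": "\u0303",  # SMALL TILDE -> COMBINING TILDE
}


def compose_tex_accents(text):
    """Rewrite [accent][base letter] as [base letter][combining mark], then NFC."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        mark = SPACING_TO_COMBINING.get(ch)
        # Only reorder when the accent is directly followed by a letter.
        if mark is not None and i + 1 < len(text) and text[i + 1].isalpha():
            out.append(text[i + 1] + mark)
            i += 2
        else:
            out.append(ch)
            i += 1
    # NFC folds base + combining mark into a precomposed code point where one exists.
    return unicodedata.normalize("NFC", "".join(out))


print(compose_tex_accents("R. L\u00A8owen and B. Polster"))
```

An accent that is not followed by a letter is left untouched, so a
stand-alone diaeresis in the text is not silently dropped.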

> > 
> > I get this from TeX:
> > R. L\"owen and B. Polster => R. Löwen and B. Polster
> > 
> > NB I just tried extracting from a Word-generated PDF and TextOutputDev
> > didn't see the line with the diacritic at all.
> 
> And are you sure it's not a Word fault?

Oh, I expect it is, but I thought I'd mention it. I'll try to investigate
further.

/Tim.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Turn-TeX-style-composed-characters-into-Unicode-comb.patch
Type: text/x-patch
Size: 7449 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/poppler/attachments/20110328/d6d7f8e8/attachment.bin>

