[poppler] Vertical or horizontal writing?

Deri James deri at chuzzlewit.demon.co.uk
Tue Jul 27 09:22:14 PDT 2010


On Tuesday 27 July 2010 13:35:43 mpsuzuki at hiroshima-u.ac.jp wrote:
> But Cobra has found that font-level writing mode detection
> is insufficient even if we restrict the scope to PDFs
> generated by popular applications. I attached a PDF
> including vertical text which is generated by the MS Office
> 2010 PDF generator add-in. The embedded font is connected
> with Identity-H, so my patch recognizes the font as being for
> horizontal writing. I am trying to detect the expected result by
> using text-level information. So, please don't hurry to evaluate
> this patch. I must work more.

Looking at the two PDFs you are using in acroread with the text
selection tool:

P1 of 'vert-horiz-ipa-std.pdf': selection caret is drawn horizontally
(as it would be for vertical text).
'msword2010-vert2.pdf': selection caret is drawn vertically
(as it would be for horizontal text).

So it seems acroread does not detect any vertical text in the second file,
i.e. it is actually horizontal text placed one glyph at a time (apart from
'MS Word 2010', which is horizontal text rotated 90 degrees).

The contents of the stream confirm this:

stream
 /P <</MCID 0/Lang (en-US)>> BDC BT
/F1 10.56 Tf
0.000000001 -1 1 0.000000001 496.54 756.84 Tm
0 g
0 G
[(MS)6( )5(W)61(ord)-4( )5(20)10(10)] TJ
ET
 EMC  /P <</MCID 1>> BDC BT
/F2 10.56 Tf
1 0.000000017 -0.000000017 1 495.29 673.7 Tm
<085B>Tj
ET
 EMC  /P <</MCID 2>> BDC BT
1 0.000000017 -0.000000017 1 495.29 663.14 Tm
<29AA>Tj
0 -10.44 TD
<1B69>Tj
ET
 EMC  /P <</MCID 3>> BDC BT
1 0.000000017 -0.000000017 1 495.29 642.14 Tm
<0841>Tj
ET
 EMC  /P <</MCID 4>> BDC BT
1 0.000000017 -0.000000017 1 495.29 631.7 Tm
<0862>Tj
ET
 EMC  /P <</MCID 5>> BDC BT
1 0.000000017 -0.000000017 1 495.29 621.14 Tm
<08B8>Tj
0 -10.56 TD
<08AB>Tj
0 -10.44 TD
<08BA>Tj
ET
...
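
To make the "one glyph at a time" point concrete, here is a rough Python
sketch that walks the operators of the MCID 5 block above and records where
each Tj starts. It is only an illustration, not poppler code: the function
name is made up, the tiny matrix skew (0.000000017) is treated as zero, and
each 4-hex-digit string is assumed to be one 2-byte code under Identity-H.

```python
import re

# Operators copied from the MCID 5 block; skew terms rounded to 0 (assumption).
OPS = """
1 0 0 1 495.29 621.14 Tm
<08B8>Tj
0 -10.56 TD
<08AB>Tj
0 -10.44 TD
<08BA>Tj
"""

def glyph_positions(stream):
    """Return (hex code, glyph count, x, y) for every Tj in the snippet."""
    x = y = 0.0  # origin of the current text line
    shown = []
    for line in stream.strip().splitlines():
        tokens = line.split()
        if tokens[-1] == "Tm":
            # Tm sets the line origin from its last two operands (e, f).
            x, y = float(tokens[4]), float(tokens[5])
        elif tokens[-1] == "TD":
            # TD moves the line origin by (tx, ty), relative to the line start.
            x += float(tokens[0])
            y += float(tokens[1])
        else:
            match = re.fullmatch(r"<([0-9A-Fa-f]+)>Tj", line.strip())
            if match:
                code = match.group(1)
                # Assumed 2 bytes (4 hex digits) per glyph.
                shown.append((code, len(code) // 4, x, round(y, 2)))
    return shown

for entry in glyph_positions(OPS):
    print(entry)
```

Each Tj shows exactly one glyph, x stays at 495.29, and y drops by roughly
the font size (10.56) each time: the column is built by repositioning
upright horizontal glyphs, not by a vertical writing mode.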

So this PDF does not have any true vertical text.
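
The rotation can also be read straight off the Tm operands. A small Python
sketch, assuming the usual [a b c d e f] layout of the text matrix (the
helper name is my own and the e, f translation is left out):

```python
import math

def baseline_rotation_degrees(a, b, c, d):
    """Angle of the text baseline implied by the a, b terms of a Tm matrix."""
    return math.degrees(math.atan2(b, a))

# Operands copied from the stream above.
latin_tm = (0.000000001, -1, 1, 0.000000001)  # 'MS Word 2010' (font F1)
cjk_tm = (1, 0.000000017, -0.000000017, 1)    # single glyphs (font F2)

print(round(baseline_rotation_degrees(*latin_tm)))  # -90
print(round(baseline_rotation_degrees(*cjk_tm)))    # 0
```

The Latin run is ordinary horizontal text turned a quarter turn, and the
other glyphs are not rotated at all, which matches what the caret shows.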

Cheers

Deri

