[Libreoffice-bugs] [Bug 104597] Text runs of RTL scripts (e.g. Arabic, Hebrew, Persian) from imported PDF are reversed, PDFIProcessor::mirrorString not behaving
bugzilla-daemon at bugs.documentfoundation.org
Thu Jul 15 10:54:14 UTC 2021
https://bugs.documentfoundation.org/show_bug.cgi?id=104597
--- Comment #46 from Kevin Suo <suokunlong at 126.com> ---
Further info:
If xpdf generated the following output:
drawChar 462.400000 770.989000 466.900000 770.989000 1.000000 0.000000 0.000000 1.000000 12.000000 ة
then sdext pdfimport will produce the following Transformation values, listed
in matrix order (assume this is the rCurGC):
          (0,0)  (0,1)  (0,2)  (1,0)  (1,1)  (1,2)
---------------------------------------------------------------------
rCurGC:   1200   0      46240  0      1200   5674.08
If xpdf generated the following:
drawChar 466.900000 770.989000 469.828000 770.989000 1.000000 0.000000 0.000000 1.000000 12.000000 ي
then in sdext pdfimport the Transformation values are (assume this is the
rNextGC):
          (0,0)  (0,1)  (0,2)  (1,0)  (1,1)  (1,2)
---------------------------------------------------------------------
rNextGC:  1200   0      46690  0      1200   5674.08
Apparently rCurGC.Transformation != rNextGC.Transformation. The difference is
in entry (0,2): one is 46240, the other is 46690. What are these two values?
The position of the characters on the page?
Below is the full output of rCurGC.Transformation and rNextGC.Transformation
(generated by adding a SAL_WARN below the line
aGlyph.getGC().Transformation = totalTextMatrix1;
in file pdfiprocessor.cxx):
SAL_WARN("sdext.pdfimport", "drawGlyphs: "
<< aGlyph.getGC().Transformation.get(0,0) << ", "
<< aGlyph.getGC().Transformation.get(0,1) << ", "
<< aGlyph.getGC().Transformation.get(0,2) << ", "
<< aGlyph.getGC().Transformation.get(1,0) << ", "
<< aGlyph.getGC().Transformation.get(1,1) << ", "
<< aGlyph.getGC().Transformation.get(1,2) << ", "
);
rCurGC: 1200 0 46240 0 1200 5674.08
rCurGC: 1200 0 46690 0 1200 5674.08
rCurGC: 1200 0 46980.4 0 1200 5674.08
rCurGC: 1200 0 47070.4 0 1200 5674.08
rCurGC: 1200 0 47659.6 0 1200 5674.08
rCurGC: 1200 0 48130 0 1200 5674.08
rCurGC: 1200 0 48370 0 1200 5674.08
rCurGC: 1200 0 48618.4 0 1200 5674.08
rCurGC: 1200 0 49007.2 0 1200 5674.08
rCurGC: 1200 0 49637.2 0 1200 5674.08
rCurGC: 1200 0 49927.6 0 1200 5674.08
rCurGC: 1200 0 50218 0 1200 5674.08
rCurGC: 1200 0 50496.4 0 1200 5674.08
rCurGC: 1200 0 50806 0 1200 5674.08
rCurGC: 1200 0 51276.4 0 1200 5674.08
rCurGC: 1200 0 51524.8 0 1200 5674.08
rCurGC: 1200 0 51773.2 0 1200 5674.08
rCurGC: 1200 0 52153.6 0 1200 5674.08
rCurGC: 1200 0 52603.6 0 1200 5674.08
rCurGC: 1200 0 53093.2 0 1200 5674.08
As you can see, all other values are the same, but the value in position (0,2)
of the matrix increases from one character to the next. I think this value
should not be used to determine whether these characters should be combined
into a string.
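A minimal sketch of such a comparison, assuming the six values are held in a plain array in the order shown above. The type Affine2x3 and the function sameExceptHorizontalOffset are hypothetical names for illustration only; they are not part of sdext or basegfx, and the tolerance is an arbitrary choice:

```cpp
#include <array>
#include <cmath>

// Hypothetical stand-in for the six matrix values (not basegfx::B2DHomMatrix).
// Layout: { (0,0), (0,1), (0,2), (1,0), (1,1), (1,2) }.
using Affine2x3 = std::array<double, 6>;

// Compare two transformations while ignoring entry (0,2), the value that
// changes from glyph to glyph in the log above. Scale, rotation/shear and
// the (1,2) entry must still agree within the tolerance eps.
inline bool sameExceptHorizontalOffset(const Affine2x3& a,
                                       const Affine2x3& b,
                                       double eps = 1e-6)
{
    const int compared[] = { 0, 1, 3, 4, 5 };  // every index except 2, i.e. (0,2)
    for (int i : compared)
    {
        if (std::abs(a[i] - b[i]) > eps)
            return false;
    }
    return true;
}
```

With the rCurGC and rNextGC values above, a plain equality test reports a difference, while this comparison treats the two glyphs as sharing the same transformation. Whether ignoring (0,2) is actually the right criterion for merging text runs is the open question here.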
This is beyond my knowledge, as it involves the basegfx::B2DHomMatrix stuff,
about which I know nothing, so it needs an expert to investigate.