[Libreoffice-bugs] [Bug 104597] Text runs of RTL scripts (e.g. Arabic, Hebrew, Persian) from imported PDF are reversed, PDFIProcessor::mirrorString not behaving

bugzilla-daemon at bugs.documentfoundation.org
Sun Jul 18 14:09:31 UTC 2021


https://bugs.documentfoundation.org/show_bug.cgi?id=104597

--- Comment #65 from V Stuart Foote <vstuart.foote at utsa.edu> ---
@Armin, Kevin, *

(In reply to Armin Le Grand from comment #64)
> ... It may even be that this needs
> adaption e.g. vertical decisions for chinese...? I do not know. Also
> possible that spaces may need extra caveat due to having eventually
> different FontSize?

This is great work! But ultimately, doesn't this require more finesse than
simply testing for the presence of a "gap" to end concatenation and mark the
next span to be passed on for mirroring? And I guess the off-baseline
positioning from comment 61 might also need to be handled.

When concatenating with the color, font, or transformation checks as now, what
happens at punctuation, or at a kashida internal to the imported text run? Or
at brackets/parentheses that should bound the span: do the opening and closing
characters match the script?

Restoring the transformation tests in drawtreevisiting.cxx will close this
issue. But I would think that testing against the full range of
script-appropriate ICU word break iterators could be the trigger that ends the
text run concatenation and passes the span to the mirror string action.
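
To make the idea concrete, here is a rough sketch in Python (standard library
only). It is not ICU and not the pdfimport code: it uses Unicode general
categories as a crude stand-in for a real word break iterator, and the helper
names (is_word_char, split_segments, is_rtl) are made up for illustration.

```python
import unicodedata
from itertools import groupby


def is_word_char(ch):
    # Assumption: letters (L*), marks (M*) and numbers (N*) count as word
    # content. Kashida/tatweel (U+0640) is category Lm, so it stays inside
    # the word instead of ending the run.
    return unicodedata.category(ch)[0] in "LMN"


def split_segments(text):
    """Split text into alternating word and non-word segments."""
    return ["".join(group) for _, group in groupby(text, key=is_word_char)]


def is_rtl(segment):
    """True if the segment has a strong RTL character (bidi class R or AL)."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in segment)


# Arabic word with an internal kashida, an Arabic comma, a space, Latin text.
text = "\u0633\u0644\u0640\u0627\u0645\u060C abc"
segments = split_segments(text)
rtl_flags = [is_rtl(s) for s in segments]
```

Here the kashida does not split the Arabic word, the Arabic comma ends the
span, and only the first segment is flagged as a candidate for mirroring. A
real ICU word break iterator applies many more rules than this.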

Also, maybe test against sentence break iterators? Besides acting as a word
boundary, is the sentence-ending punctuation kept with the span being mirrored?
And does it end up placed appropriately for the RTL scripts?

Maybe also include logic to test enclosing parentheses or brackets, so that the
opening and closing characters end up in the correct positions.
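
A minimal sketch of that bracket handling, again in Python and again only an
illustration: it is not what PDFIProcessor::mirrorString does, the pair table
is deliberately tiny (a real filter would use the full Unicode mirroring data,
e.g. through ICU), and mirror_run and brackets_balanced are hypothetical names.

```python
# Assumption: only these four ASCII pairs are handled.
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">"}
MIRROR = {**PAIRS, **{close: open_ for open_, close in PAIRS.items()}}


def mirror_run(visual):
    """Reverse a visually ordered RTL run and swap paired brackets."""
    return "".join(MIRROR.get(ch, ch) for ch in reversed(visual))


def brackets_balanced(text):
    """True if every opening bracket has a matching close inside the span."""
    stack = []
    for ch in text:
        if ch in PAIRS:
            stack.append(PAIRS[ch])  # remember which close we expect
        elif ch in MIRROR:  # a closing bracket
            if not stack or stack.pop() != ch:
                return False
    return not stack


# Hebrew "shalom" in parentheses, as a PDF would deliver it (visual order).
visual = "(\u05DD\u05D5\u05DC\u05E9)"
logical = mirror_run(visual)
```

After reversal alone the span would start with ")" and end with "("; the swap
puts the opening bracket back in front. Checking brackets_balanced before
mirroring is one way to decide whether a bracket belongs to the span or to the
neighbouring run. Reversing by code point would also misplace combining marks,
which this sketch ignores.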

The ICU BiDi libs were not very robust when the import filter was laid down in
the OOo era for i90800 (see also).

Also, don't similar things need to be done for RTL in the Writer PDF import
filter (writertreevisiting.cxx/.hxx)?
