[poppler] Vertical or horizontal writing?
Albert Astals Cid
aacid at kde.org
Wed Sep 1 11:01:40 PDT 2010
On Sunday, 15 August 2010, mpsuzuki at hiroshima-u.ac.jp wrote:
> On Sat, 14 Aug 2010 21:18:56 +0100
>
> Albert Astals Cid <aacid at kde.org> wrote:
> >On Saturday, 31 July 2010, mpsuzuki at hiroshima-u.ac.jp wrote:
> >> Sorry for the long silence. Checking the source,
> >> I found the following points.
> >>
> >> 1) poppler-qt4 page object issue
> >>
> >> On the other hand, getText() is a device-specific method,
> >> found only in TextOutputDev.cc, so changing getText() is
> >> easier.
> >>
> >> 2) TextOutputDev::getText() issue
> >>
> >> I think raw-ordered text from MS Office's tricky vertical
> >> text can be used for text search, but physically
> >> laid-out text cannot.
> >
> >WoW, that's a huge mail :D
>
> Sorry, my post was so long that it was hard to find my
> proposal to the poppler maintainers.
>
> >So my understanding is that "proper" CJK searching is a lot
> >of work and you advocate for just exposing the raw text to
> >the upper layers (users of poppler-qt4) so they can do the
> >work if they need it?
>
> Yes. I think exposing the raw text to the upper layers would
> be a reasonable starting point for various non-left-to-right
> scripts, because it is script-independent.
>
> # As for inserting a space (U+0020) between words,
> # I have not yet decided what is best.
I don't think this makes sense: if we are being raw, we should be raw, and
adding a space that is not there is not being raw.
So if you agree on not adding the space, I will commit your patch.
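To make the point about staying raw concrete, here is a minimal Python sketch.
It does not use poppler at all: join_raw(), join_with_spaces() and the sample
word list are hypothetical, and it assumes (purely for illustration) that a
vertical-text producer emitted each glyph as its own word. It only shows how
an inserted U+0020 can break a plain substring search on the extracted text.

```python
from typing import List


def join_raw(words: List[str]) -> str:
    """Concatenate the extracted words exactly as they are, adding nothing."""
    return "".join(words)


def join_with_spaces(words: List[str]) -> str:
    """Concatenate the extracted words with U+0020 inserted between them."""
    return " ".join(words)


# Assumed sample data: one glyph per "word", as a vertical layout might produce.
words = ["縦", "書", "き"]

raw = join_raw(words)
spaced = join_with_spaces(words)

# A plain substring search finds the phrase only in the raw text.
print("縦書き" in raw)     # True
print("縦書き" in spaced)  # False
```

With raw output, an upper layer that wants word separators can still add them
itself; the reverse (guessing which spaces were never in the document) is harder.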
Albert
>
> Also I've written a preliminary patch to modify TextPage::findText()
> in TextOutputDev to support the device created in rawOrder mode
> (if required, I will post here). Now I'm waiting for Cobra's feedback
> to see if it works for his purpose.
>
> Regards,
> mpsuzuki
> _______________________________________________
> poppler mailing list
> poppler at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/poppler