[poppler] Vertical or horizontal writing?

Albert Astals Cid aacid at kde.org
Wed Sep 1 11:01:40 PDT 2010


On Sunday, 15 August 2010, mpsuzuki at hiroshima-u.ac.jp wrote:
> On Sat, 14 Aug 2010 21:18:56 +0100
> 
> Albert Astals Cid <aacid at kde.org> wrote:
> >On Saturday, 31 July 2010, mpsuzuki at hiroshima-u.ac.jp wrote:
> >> Sorry for the long silence. Checking the source,
> >> I found the following points.
> >> 
> >> 1) poppler-qt4 page object issue
> >> 
> >> On the other hand, getText() is a device-specific method,
> >> found only in TextOutputDev.cc, so changing getText() is
> >> easier.
> >> 
> >> 2) TextOutputDev::getText() issue
> >> 
> >> I think raw-ordered text from MS Office's tricky vertical
> >> text can be used for text search, but physically laid-out
> >> text cannot.
> >
> >WoW, that's a huge mail :D
> 
> Sorry, my post was too long to make clear what I am proposing
> to the poppler maintainers.
> 
> >So my understanding is that "proper" CJK searching is a lot
> >of work and you advocate for just exposing the raw text to
> >the upper layers (users of poppler-qt4) so they can do the
> >work if they need it?
> 
> Yes. I think exposing the raw text to the upper layers would
> be the reasonable starting point for various non-left-to-right
> scripts, because it is script-independent.
> 
> # As for inserting a space (U+0020) between words,
> # I have not yet decided what is best.

I don't think this makes sense. If we are being raw, we should be raw, and
adding a space that is not there is not being raw.

So if you agree on not adding the space, I will commit your patch.
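
To make the point about the space concrete, here is a minimal sketch. It does not use poppler's API: RawWord, joinWords() and contains() are hypothetical names made up for illustration, and the only assumption is that the upper layer receives the words in raw (content stream) order.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one word as emitted in raw (content stream)
// order. This is NOT a poppler type.
struct RawWord {
    std::string text;  // UTF-8 text of the word
};

// Concatenate the words in the order they were received. When insertSpace
// is true, a U+0020 is added between words even though the document never
// contained one, so the result is no longer strictly "raw".
std::string joinWords(const std::vector<RawWord>& words, bool insertSpace) {
    std::string out;
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (insertSpace && i > 0) {
            out += ' ';
        }
        out += words[i].text;
    }
    return out;
}

// Plain substring search over the joined text, as an upper layer might do.
bool contains(const std::string& haystack, const std::string& needle) {
    return haystack.find(needle) != std::string::npos;
}
```

If a phrase is split into two raw words, a search for the unbroken phrase matches the output of joinWords(words, false) but not the output of joinWords(words, true). That matters most for scripts such as CJK, which do not separate words with spaces.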

Albert

> 
> I have also written a preliminary patch to modify TextPage::findText()
> in TextOutputDev to support a device created in rawOrder mode
> (if required, I will post it here). Now I'm waiting for Cobra's feedback
> to see if it works for his purpose.
> 
> Regards,
> mpsuzuki

