[poppler] approaches used for language detection on images ...
Albretch Mueller
lbrtchx at gmail.com
Thu Feb 6 11:29:52 UTC 2020
On 2/4/20, John Muccigrosso <muccigrosso at icloud.com> wrote:
> Tesseract can do multiple languages in one file. Try “-l eng+ita” for
> example.
Well, yes, but what can you do when you don't know the language in
which the other text might appear?
Say the French expression "pied à terre" is used, but someone sloppily
writes it as "pied a terre". "pied" is an English word, and "terre"
could be OCR'ed as "terse".
I work on texts (mostly about philosophy) which include lots of
Latin and French terms.
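
For texts like these, the "-l" option quoted above could be extended to
cover all three languages in one pass. The following is only a minimal
sketch: the helper names and the file name "page.png" are made up for
illustration, and it assumes the "eng", "fra" and "lat" trained data
files are installed. It does not solve the case where the language is
not known in advance.

```python
import subprocess


def build_tesseract_command(image_path, languages):
    """Return the argument list for a multi-language Tesseract run.

    Hypothetical helper for illustration; not part of Tesseract or poppler.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    # Tesseract accepts several language codes joined with '+'.
    lang_arg = "+".join(languages)
    # Using 'stdout' as the output base makes Tesseract print the text
    # instead of writing it to a file.
    return ["tesseract", image_path, "stdout", "-l", lang_arg]


def ocr(image_path, languages):
    """Run Tesseract and return the recognized text (needs tesseract on PATH)."""
    cmd = build_tesseract_command(image_path, languages)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # "page.png" is an assumed file name; only the command is printed here.
    cmd = build_tesseract_command("page.png", ["eng", "fra", "lat"])
    print(" ".join(cmd))  # tesseract page.png stdout -l eng+fra+lat
```

Calling ocr("page.png", ["eng", "fra", "lat"]) would then run the
command, provided tesseract and those language files are available.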
lbrtchx