[Poppler-bugs] [Bug 47022] pdftohtml: control over word breaks
bugzilla-daemon at freedesktop.org
Sun Mar 11 07:11:10 PDT 2012
https://bugs.freedesktop.org/show_bug.cgi?id=47022
--- Comment #1 from Ihar Filipau <thephilips at gmail.com> 2012-03-11 07:11:10 PDT ---
Created attachment 58283
--> https://bugs.freedesktop.org/attachment.cgi?id=58283
the patch, v1
Add control over the word break threshold (the best name I could think of).
1. Add a new global variable `double wordBreakThreshold` in pdftohtml.cc.
Default value: 10 percent.
Later converted to an internal coefficient by dividing by 100.
2. Add a new command line parameter: -wbt <fp>
The value is stored in the wordBreakThreshold variable.
3. After the command line is parsed, convert the percentage into a coefficient.
4. HtmlOutputDev.cc, HtmlPage::addChar(): replace the hardcoded `0.1` with
the variable.
5. HtmlOutputDev.cc, HtmlPage::coalesce(): replace the hardcoded `0.1` with
the variable.
6. Document the parameter in the man page.
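A minimal sketch of steps 1, 3, 4 and 5, for illustration only. The helper names percentToCoefficient() and isWordBreak() are hypothetical and do not appear in the patch, and comparing the gap against a fraction of the text height is an assumption about what the hardcoded `0.1` is multiplied by; the actual expressions in HtmlOutputDev.cc may differ.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: turn the user-facing percentage (the value given
// to -wbt) into the internal coefficient, e.g. 10 percent -> 0.1.
double percentToCoefficient(double percent) {
    return percent / 100.0;
}

// Hypothetical helper: decide whether the horizontal gap between two
// adjacent characters is large enough to count as a word break.
// Assumption: the gap is compared against a fraction of the text height;
// the quantities actually compared in HtmlOutputDev.cc may differ.
bool isWordBreak(double gap, double textHeight, double coefficient) {
    assert(textHeight >= 0.0);
    // Previously the coefficient would have been the literal 0.1.
    return std::fabs(gap) > coefficient * textHeight;
}
```

Under these assumptions, with the default of 10 percent a gap of 1.5 units on 10-unit-high text counts as a word break, while raising the threshold to 20 percent would keep the two characters in the same word.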
I was tempted to introduce a new bool function for the word break check, but
held off because:
- the functionality is duplicated (as I understand it, the results of
word-breaking in addChar() are post-processed and largely overridden by the
::coalesce() method)
- there is a TODO in ::addChar() whose validity and applicability I'm not
sure of.
--
Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.