Eric S. Raymond

Eric S. Raymond esr at thyrsus.com
Fri Jun 7 09:53:51 PDT 2013


Michael Meeks <michael.meeks at suse.com>:
> 	I was curious about what you'd like to hack on here :-)

I wrote and maintain a tool called 'doclifter' that lifts manual pages
(and most other kinds of documents written in troff-based markups) into
DocBook-XML.  This is a useful tool for several reasons; one is that the 
XML can be used to generate higher-quality HTML than you get from a
presentation-level troff to HTML translation.  If all manual pages lifted
cleanly, generating a nice web view of all the world's documentation
would be easy.

Unfortunately, troff markup is such a badly structured tag soup that
automatic lifting doesn't always work. By dint of a bunch of compiler
technology and a couple hundred cliche-recognition rules, doclifter
does a pretty good job; on the 12K pages shipped with a stock Linux
distribution it lifts about 94% of the eligible targets cleanly
without patches.

Most of the remaining 6% of troff pages contain markup that is
outright broken even in troff terms.  Your pages, which had an
incorrect \fb where a \fB was needed, are good examples.
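
To make the \fb versus \fB point concrete, here is a minimal sketch of a
check for suspicious one-character font escapes. It is an illustration
only, not how doclifter works. The function name, the sample text, and the
set of selectors treated as valid (R, B, I, P and the digits 1-4) are all
assumptions; a real troff setup can define other fonts, and this sketch
ignores \f(XX and \f[name] forms as well as escaped backslashes.

```python
import re

# Assumption: only these one-character font selectors count as valid here.
VALID_SINGLE = set("RBIP1234")

# Match \fX, skipping the two-character \f(XX and long \f[name] forms.
FONT_ESCAPE = re.compile(r"\\f(?!\(|\[)(.)")


def find_suspect_font_escapes(text):
    """Return (line_number, escape) pairs for font escapes not in VALID_SINGLE."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in FONT_ESCAPE.finditer(line):
            if match.group(1) not in VALID_SINGLE:
                hits.append((lineno, match.group(0)))
    return hits


if __name__ == "__main__":
    # Hypothetical man-page fragment with one lowercase \fb typo on line 2.
    sample = ".B soffice\n\\fb--help\\fP shows usage\n\\fB--version\\fR\n"
    for lineno, escape in find_suspect_font_escapes(sample):
        print(f"line {lineno}: suspicious font escape {escape}")
```

A check like this only flags candidates; whether a given escape is really
an error depends on the fonts the page expects to be available.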

One of my longer-term projects is cleaning up the Linux/Unix manual-page
corpus so that the remaining 6% gets fixed and becomes automatically liftable.
I've been working on this since 2002, and have shipped about 2000 patches
upstream to several hundred projects.

Recently I fixed up all the X man pages.  Current statistics:

11923	100%	Total pages in stock Ubuntu 13.04
917	7.69%	Already made from XML-DocBook or Doxygen, not eligible.
10270	86.14%	Clean lift from troff, no problems
721	6.05%	Clean lift with a fix patch.
8	0.07%	Internal error in doclifter
7	0.06%	Incorrect (non-validating) XML generated.
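
For anyone who wants to recheck the arithmetic, the shares in the table
can be recomputed from the raw counts. This is a small sketch; the
dictionary keys and the helper name `share` are made up for illustration,
and only the counts come from the table above.

```python
# Counts taken from the statistics table; labels are shortened for illustration.
counts = {
    "already XML-DocBook or Doxygen": 917,
    "clean lift, no patch": 10270,
    "clean lift with fix patch": 721,
    "doclifter internal error": 8,
    "non-validating XML": 7,
}
total = 11923


def share(n, whole):
    """Percentage of `whole` represented by `n`, rounded to two decimals."""
    return round(100.0 * n / whole, 2)


# The categories should account for every page.
assert sum(counts.values()) == total

for label, n in counts.items():
    print(f"{n}\t{share(n, total):.2f}%\t{label}")

# Pages not already generated from XML or Doxygen are the eligible lift targets.
eligible = total - counts["already XML-DocBook or Doxygen"]
print(f"eligible targets: {eligible}")
print(f"clean without patches: {share(counts['clean lift, no patch'], eligible):.2f}% of eligible")
```

The last line measures the no-patch success rate against eligible pages
only, which is the figure the "eligible targets" wording refers to.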

You just got your patches.  The LibreOffice pages now lift clean.

Very occasionally (once every year or two) I run a validation pass on
as much of the manual-page universe as I can easily get my hands on.
In the future, if your pages develop any problems due to careless
changes, I'll ship you another fix.  Otherwise I have no specially
concentrated interest in LibreOffice, sorry. I think it's a good thing
that the suite exists, but I don't use it myself.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

