[poppler] utils manpages in mdoc(7)

Albert Astals Cid aacid at kde.org
Sun Nov 9 14:15:15 PST 2014


On Sunday, 9 November 2014, at 17:29:11, Jan Stary wrote:
> On Nov 09 16:46:37, aacid at kde.org wrote:
> > On Sunday, 9 November 2014, at 13:21:45, Jan Stary wrote:
> > > Currently, the manpages of the poppler utils
> > > are written in the legacy man(7) markup language.
> > > Below please find a proposed rewrite of pdfunite.1
> > > into the _semantic_ mdoc(7) language.
> > > 
> > > Both languages have been well supported for decades,
> > > by groff (on most Linux systems) and by mandoc (on the BSDs).
> > > The advantage of the semantic markup is that it allows
> > > for constructions like "There is an optional -h flag"
> > > 
> > > 	.Op Fl h
> > > 
> > > as opposed to the physical markup of
> > > "type a bracket, switch to italics, type -h,
> > > switch back to roman, type the closing bracket"
> > > and similarly for other manpage constructions.
> > > See http://manpages.bsd.lv/ for an elaborate discussion
> > > on why this is a good thing.
> > 
> > Can we have a short summary?
> 
> The man(7) markup language uses _physical_ markup,
> such as "put this in boldface", "type a bracket here", etc.
> The mdoc(7) language is a _semantic_ markup: it describes
> the meaning and purpose, as opposed to details of presentation.
> 
> For example, this is how pdfimages.1 mentions the -f option:
> 
> 	.BI \-f " number"
> 
> This means: "switch to boldface, type a dash-ef,
> then switch to italics and type 'number' separated by a space".
> 
> This is how mdoc(7) describes the same:
> 
> 	.Fl f Ar number
> 
> This means: "There is an 'f' flag, which takes a 'number' argument".
> That's the _meaning_ of it. Details such as
> "prepend the option with a dash" or "make it italics"
> or "separate the option flag and the argument name with a space"
> are presentational, and are handled by the macros.
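> 
> As a rough sketch of the difference (an illustrative fragment, not
> copied from the poppler sources), a SYNOPSIS with two optional flags
> might be written in man(7) with literal brackets and font escapes:
> 
> 	.SH SYNOPSIS
> 	.B pdfseparate
> 	[\fB\-f\fR \fIfirst\fR]
> 	[\fB\-l\fR \fIlast\fR]
> 	.I input name-pattern
> 
> while in mdoc(7) the same SYNOPSIS only names the parts:
> 
> 	.Sh SYNOPSIS
> 	.Nm pdfseparate
> 	.Op Fl f Ar first
> 	.Op Fl l Ar last
> 	.Ar input
> 	.Ar name-pattern
> 
> The brackets, the dashes and the fonts are then supplied by the macros.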
> 
> In a not-too-far-fetched analogy, this is like the difference
> in having "<b>stuff</b>" in your html code and properly tagging
> various classes of information as such, to be assigned a given
> presentation in a CSS.
> 
> 
> Both languages have been well supported for over a decade.
> 
> On most Linux systems, the rendering is done by groff(1),
> which understands "groff -man" and "groff -mdoc"
> (among other things). Most Linux distributions
> have their system manpages written in man(7).
> 
> On the *BSD family of systems, the manpages are gradually being
> rewritten into mdoc(7), and the rendering is gradually being taken over
> by mandoc(1), a replacement for groff. It supports both man(7) and mdoc(7)
> too (among other things). On OpenBSD, for example, the system manpages
> have all been rewritten into mdoc(7). A lot of third-party software
> uses man(7), so part of the porting/packaging process is to either
> check that the upstream manpages render well with mandoc,
> or to use groff.
> 
> My main motivation here is to ease this, and I think the cleanest
> way is to use the semantic markup. The port of poppler,
> by the way, renders very well with mandoc -man, so there is no
> need to use groff. However, I think that this is independently
> an improvement of the poppler-utils manpages as such.

Personally I see it as a rewrite just for the sake of rewriting. Sure, what we 
have may be ugly, but it works, and yes, making stuff semantic is nicer from a 
theoretical point of view, but I'm not sure it's worth the hassle, to be honest.

On the other hand, if your patches gave the same output as the old ones I 
wouldn't really care, but they give me different output. As an example:

For some reason man now thinks pdfseparate is some kind of BSD tool and says
  BSD General Commands Manual
instead of
  General Commands Manual

Cheers,
  Albert


> 
> Below please find another example,
> pdfseparate.1 rewritten into mdoc(7).
> 
> > > Please let me know if there is any interest in this,
> > > I am willing to do the work.
> > > 
> > > 	Your happy user
> 
>  		Jan
> 
> 
> .Dd November 10, 2014
> .Dt PDFSEPARATE 1
> .Os
> .Sh NAME
> .Nm pdfseparate
> .Nd extract pages from a PDF document
> .Sh SYNOPSIS
> .Nm pdfseparate
> .Op Fl h
> .Op Fl v
> .Op Fl f Ar first
> .Op Fl l Ar last
> .Ar input
> .Ar name-pattern
> .Sh DESCRIPTION
> .Nm
> extracts individual pages from a PDF document.
> The input document must not be encrypted.
> .Pp
> The pages extracted from
> .Ar input
> are saved in individual output files named like
> .Ar name-pattern .
> The
> .Ar name-pattern
> must contain a
> .Dq %d
> placeholder if more than one page is to be extracted.
> The
> .Dq %d
> will be replaced by the original page number.
> .Pp
> The options are as follows:
> .Pp
> .Bl -tag -width 8n -compact
> .It Fl f Ar first
> The first page to extract (start of input by default).
> .It Fl l Ar last
> The last page to extract (end of input by default).
> .It Fl h
> Print usage information.
> .It Fl v
> Print copyright and version information.
> .El
> .Sh EXAMPLES
> .Dl $ pdfseparate file.pdf file-%d.pdf
> .Pp
> extracts all pages from
> .Pa file.pdf .
> If
> .Pa file.pdf
> has 3 pages, the resulting files will be named
> .Pa file-1.pdf ,
> .Pa file-2.pdf
> and
> .Pa file-3.pdf .
> .Sh SEE ALSO
> .Xr pdfunite 1
> .Pp
> .Lk http://poppler.freedesktop.org