xdg Digest, Vol 16, Issue 34

Lauri Watts lauri at kde.org
Wed Jul 27 19:47:37 EEST 2005


On Wednesday 27 July 2005 16.54, Rodney Dawes wrote:
> On Wed, 2005-07-27 at 01:36 +0200, Timo Stülten wrote:
> > > On 7/25/05, Christopher James Lahey <clahey at ximian.com> wrote:
> > > > Does anyone have any suggestions for how to proceed here?  I want to
> > > > make the mime system detect docbook, since there is a mime type for
> > > > it.
> > > >
> > > > > Alternatively, is it possible to specify an empty namespace and
> > > > > just specify the localname?  That way any xml docs that match as
> > > > > <article> or <book> would just be labeled as docbook.
> >
> > A lot of docbook files on my system only have <chapter>s in them. They do
> > not have a DOCTYPE, nor any URI.
> > Without a proper DOCTYPE/URI, there is no clean way to recognize them by
> > content as <article> and <chapter> are not very specific to docbook I
> > think. Maybe it's better to simply use a unique file extension
> > (=".docbook")? All chapter-files on my system here already end in
> > ".docbook".
>
> Right. Without proper definition of what the XML file is, there is no
> clean way to tell what the XML file is. Therefore, it should just be
> identified as XML, if the file extension is ".xml", and the type of
> XML in the file contents is not easily identifiable. I've also seen
> the extension ".dbk" used for docbook files. So falling back to also
> using those extensions as matches seems viable to me.
>
> However, we are not Windows, and should not rely solely on file
> extensions to assume content. Whatever application is writing out
> those "incomplete" docbook files, should probably be fixed to do the
> right thing, and write out a correct DOCTYPE and have proper reference
> to the DTD or namespaces being used, so that we can improve the accuracy
> of content type detection.

They are probably not incomplete, but rather system entities (partial documents
in separate files, meant to be included in another document). Putting a doctype
in them would render the parent document invalid, which is a far more serious
problem than any issue with file type detection.

I don't believe there is any reliable way to distinguish docbook files as
anything but generic XML or SGML.  The common elements are also likely common
to other DTDs, and the very docbook-specific ones are not used often enough to
rely on.
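
To make the fallback order discussed in this thread concrete, here is a rough
sketch in Python. It is not how any existing mime system does it. The function
name, the regular expression and the MIME type strings are assumptions made up
for illustration. It trusts a DOCTYPE whose public identifier mentions DocBook,
then falls back to the ".docbook" and ".dbk" extensions mentioned above, and
otherwise reports plain XML. It deliberately does not look at element names
like <chapter> or <article>.

```python
import re
from pathlib import Path

# Assumed type names, for illustration only.
DOCBOOK_MIME = "application/docbook+xml"
GENERIC_XML_MIME = "application/xml"

# Extensions mentioned in the thread as being used for docbook files.
DOCBOOK_EXTENSIONS = {".docbook", ".dbk"}

# Matches a DOCTYPE whose public identifier mentions DocBook, e.g.
# <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN" ...>
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+\w+\s+PUBLIC\s+"[^"]*DocBook[^"]*"', re.IGNORECASE
)


def guess_type(filename: str, head: str) -> str:
    """Guess a type from the file name and the first chunk of its content."""
    # 1. A DOCTYPE naming DocBook is the only content-based evidence used.
    if DOCTYPE_RE.search(head):
        return DOCBOOK_MIME
    # 2. No DOCTYPE (e.g. an included fragment): fall back to the extension.
    if Path(filename).suffix.lower() in DOCBOOK_EXTENSIONS:
        return DOCBOOK_MIME
    # 3. Otherwise it is just XML as far as we can tell.
    return GENERIC_XML_MIME
```

With this, a bare <chapter> fragment named "intro.docbook" is still labelled as
docbook by its extension, while the same content in "notes.xml" stays generic
XML, and no doctype has to be added to the fragment itself.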

Regards
-- 
Lauri Watts
KDE Documentation: http://docs.kde.org
KDE on FreeBSD: http://freebsd.kde.org