[PROPOSAL] Desktop Bookmarks

Tue May 10 16:09:07 EEST 2005

On Tue, 2005-05-10 at 08:56 -0400, Daniel Veillard wrote:
> > By now, the XML standards nest has grown to the point where it becomes
> > a necessity.
> 
>   XML-1.0 is a standard; except for the GNOMEs, nearly everybody in the IT
> world would never even suggest pushing a subset implementation in a
> serious software stack. The only example I can think of is SOAP forbidding
> DOCTYPE, but this was pushed through for security reasons, and even then it
> remains very controversial.
>   The problem is not the support for all XML related "specs"; it's to implement
> properly the minimal core and get something which won't screw up users in
> the long term.

No disagreement here.

> > If the subset you need is included in the XML subset called
> > GMarkup, then using the GMarkup parser is a lightweight solution. 
> > If not, using a real XML parser will not add much weight either.
> 
>   How do you know what in XML-1.0 will work and what won't work in GMarkup?
> http://developer.gnome.org/doc/API/2.0/glib/glib-Simple-XML-Subset-Parser.html
>   "the parser may accept documents that an XML parser would not"
> 
> which is typical of why you should never implement something like GMarkup
> from the point of view of a library framework. The problem is broken user data,
> I know what it means, libxml version 1 used to have some incompatibilities
> with the spec and fixing user data and user stack was a big pain, never again !

First of all, the sentence you cited is ambiguous: it does not say
what "accepting a document" means (probably "parsing without error"),
nor what a "real XML parser" is (is it, e.g., validating?).

>   "However, invalid XML documents are not considered valid GMarkup documents."
> 
> the person who wrote this doesn't even understand the difference, in the
> language defined by XML, between the concept of well-formedness (i.e. the
> instance is correct from an XML-1.0 syntactic point of view) and the concept
> of validity (i.e. the instance is also correct with respect to a set of extra
> rules provided by the document DTD or Schemas). This means someone coming
> from the XML world is likely to misunderstand the documentation. I assume
> what was really meant was
> 
>   "However, not well-formed XML documents are not considered correct
>    GMarkup documents"
> 
>  Which could probably be challenged with some not so twisted case.
> The documentation doesn't say what happens in case of errors, nor what really
> happens if the document has an internal subset.

Sloppy language, sure. But then an introductory section of a software
manual is not a standards document. If you file a bug about this, I'll
have it fixed in no time.
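
For readers who do not have the XML terminology at hand, the distinction
drawn above can be sketched in a few lines. This is only an illustration,
not GMarkup code: it uses Python's standard-library parser (built on expat,
which is non-validating) as a stand-in for "an XML parser", and the
bookmark element names and the helper `is_well_formed` are made up for the
example.

```python
import xml.etree.ElementTree as ET

# Well-formed, but NOT valid: the internal DTD subset only allows
# <bookmark> children with a required href attribute, yet the body
# contains a <folder> element instead.
WELL_FORMED_BUT_INVALID = """<?xml version="1.0"?>
<!DOCTYPE bookmarks [
  <!ELEMENT bookmarks (bookmark*)>
  <!ELEMENT bookmark EMPTY>
  <!ATTLIST bookmark href CDATA #REQUIRED>
]>
<bookmarks><folder/></bookmarks>"""

# Not well-formed: <bookmark> is never closed.
NOT_WELL_FORMED = "<bookmarks><bookmark></bookmarks>"


def is_well_formed(text):
    """Hypothetical helper: True if a non-validating parser accepts text."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed(WELL_FORMED_BUT_INVALID))
    print(is_well_formed(NOT_WELL_FORMED))
```

A non-validating parser accepts the first document even though it violates
its own DTD, and rejects the second one outright. Telling the first case
apart would need a validating parser, which the Python standard library
does not provide.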

>   Something like GMarkup makes some sense for parsing small chunks like
> menu descriptions which are embedded in a program. It makes no sense for
> user data, because the lack of a specification and behaviour description,
> or even of a formal definition of the accepted syntax, makes it a risk for
> the processing of user data. It will make it a risk for users' bookmarks; for
> example, XML-1.0 has some rules about attributes and CR/LF normalization that
> I doubt GMarkup would comply with, making GMarkup-based processing of the
> bookmarks generate different information than if the same subset was parsed
> with a conformant XML parser.

I have heard you say that before, and I have gone and fixed CR/LF
handling in GMarkup. Please provide examples of CR/LF mishandling in
GMarkup before repeating that claim.
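
To make the request concrete, here is a minimal sketch of the kind of test
case I mean. It shows what an XML-1.0 parser is expected to report for
CR/LF in content and in attribute values (sections 2.11 and 3.3.3 of the
spec), using Python's expat-based standard-library parser as the reference.
It says nothing about GMarkup itself; the `bookmark` element, the `title`
attribute and the helper `parse_bookmark` are assumptions made up for the
example. Feeding the same inputs to GMarkup and comparing the results would
show whether there is a real discrepancy.

```python
import xml.etree.ElementTree as ET


def parse_bookmark(xml_text):
    """Hypothetical helper: return (title attribute, text content)."""
    element = ET.fromstring(xml_text)
    return element.get("title"), element.text


# Literal CR LF in an attribute value and in element content.
# Line endings are normalized to LF first; in attribute values the
# resulting literal LF is then replaced by a space.
crlf_title, crlf_text = parse_bookmark(
    '<bookmark title="a\r\nb">line1\r\nline2</bookmark>')

# A lone CR counts as a line ending too.
cr_title, cr_text = parse_bookmark(
    '<bookmark title="a\rb">line1\rline2</bookmark>')

# A character reference is not literal whitespace, so the LF it
# denotes survives attribute-value normalization.
ref_title, _ = parse_bookmark('<bookmark title="a&#10;b">x</bookmark>')

if __name__ == "__main__":
    for label, value in [
        ("CRLF attribute", crlf_title),
        ("CRLF content", crlf_text),
        ("CR attribute", cr_title),
        ("CR content", cr_text),
        ("&#10; attribute", ref_title),
    ]:
        print(label, repr(value))
```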

Matthias