[ghns] Checking uploads for validity

Josef Spillner spillner at kde.org
Fri Feb 29 00:23:18 PST 2008


Hello,

The current XML Schema for GHNS uploads is too strict, because it enforces a 
certain order of elements. Here are my findings about what can be done about 
it. Spoiler: not much, I'm afraid :(

Basically, XML Schema lets us check two things about the child elements in
an XML file: cardinality and order. With xs:sequence, the cardinality can be
controlled (good), but the order is fixed (not good). With xs:all, the order
doesn't matter (good), but for whatever reason each element's cardinality is
restricted to 0 or 1. This is an unnecessary design weakness of XML Schema in
my eyes.
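
To make the missing combination concrete, here is a minimal sketch in Python
of the check that neither xs:sequence nor xs:all expresses on its own:
per-element occurrence ranges with the order ignored. The element names and
ranges in RULES are made up for illustration and are not the real GHNS
upload format.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical rules: tag -> (min, max); max=None means unbounded.
RULES = {
    "name": (1, 1),
    "author": (1, 1),
    "summary": (0, None),
    "preview": (0, 3),
}

def check_children(parent, rules):
    """Return a list of problems (empty if valid). Child order is ignored."""
    counts = Counter(child.tag for child in parent)
    problems = []
    # Anything not mentioned in the rules is rejected.
    for tag in counts:
        if tag not in rules:
            problems.append(f"unexpected element <{tag}>")
    # Every rule is checked against the observed count, whatever the order.
    for tag, (lo, hi) in rules.items():
        n = counts[tag]
        if n < lo:
            problems.append(f"<{tag}> occurs {n} times, at least {lo} required")
        elif hi is not None and n > hi:
            problems.append(f"<{tag}> occurs {n} times, at most {hi} allowed")
    return problems

doc = ET.fromstring("<stuff><summary/><author/><name/><summary/></stuff>")
print(check_children(doc, RULES))
```

This is only a statement of the requirement in code, not a replacement for a
declarative schema.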

I've tried to hack around the issue by creating my own schema language which
can be transformed into proper XML Schema. The transformation builds up an
automaton which counts the occurrences of every element whose number is
limited. With 4 elements of limited occurrence and 2 unlimited ones, this
already generates dozens of groups. Despite the size (940 lines of XSD), it
would work for our use case as a workaround.
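
The growth in the number of groups can be illustrated with a small Python
sketch. It is one plausible reading of such a counting automaton, not the
actual generator: each state records how often every limited element has
been seen so far, and each state would correspond to one generated group.
The element names and upper bounds are assumptions, and minimum occurrences
(which would decide the accepting states) are left out.

```python
from itertools import product

# Assumed upper bounds for the limited elements (hypothetical names).
LIMITED = {"name": 1, "author": 1, "licence": 1, "preview": 2}
# Elements that may occur any number of times (hypothetical names).
UNLIMITED = ["summary", "content"]

def build_automaton(limited, unlimited):
    """Map each state (tuple of counts so far) to its allowed transitions."""
    tags = list(limited)
    states = {}
    for counts in product(*(range(limited[t] + 1) for t in tags)):
        moves = {}
        for i, tag in enumerate(tags):
            # A limited element is only allowed while its bound is not reached.
            if counts[i] < limited[tag]:
                moves[tag] = counts[:i] + (counts[i] + 1,) + counts[i + 1:]
        for tag in unlimited:
            # Unlimited elements are not counted, so the state stays the same.
            moves[tag] = counts
        states[counts] = moves
    return states

automaton = build_automaton(LIMITED, UNLIMITED)
print(len(automaton))
```

The number of states is the product of (bound + 1) over the limited
elements, so it grows quickly with each additional limited element, while
unlimited elements add transitions but no states.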

There's a catch though: some of the groups are recursive, which seems to be
forbidden by the specification, and xmllint, for one, detects them. Two
interesting findings from my experiments:
- xmllint can be tricked into not detecting recursive groups by placing a type
definition somewhere in between
- the normative XSD file for XML Schema itself also contains a recursion,
which is another weakness (and inconsistency!) of the specification.

I still would like to use GHNS as a showcase for good XML technologies. If XML 
Schema decides not to participate in this contest, then I'd like to hear 
about alternatives, especially with concrete syntax examples of how to 
achieve schemas for order-independent elements with predefined occurrence 
ranges for each of them.

Josef

