Exploring PSI-MI XML Collections using DescribeX
XML is widely used by the bioinformatics community. They claim the flexibility of XML introduces heterogeneity. I'd prefer to think that it allows latent heterogeneity to be exposed.
DescribeX builds XML structural summaries - labelled graphs of paths (XPath roots?) - basically, suffix-tree compression on the document structure. MM - is the popularity of a node (element path type) correlated with the importance of that node?
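To make the idea of a structural summary concrete, here is a minimal sketch in Python. It is not DescribeX's own algorithm or API; it simply collapses every element onto its root-to-node label path and counts how often each path occurs, which gives a crude "popularity" per path type. The function name `path_summary` and the sample document are made up for illustration.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def path_summary(xml_text):
    """Count how often each root-to-element label path occurs.

    Illustrative only: a simple label-path summary, not DescribeX itself.
    """
    counts = Counter()

    def walk(element, prefix):
        path = prefix + "/" + element.tag
        counts[path] += 1
        for child in element:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return counts


# Hypothetical, heavily simplified document (no namespaces, no content).
SAMPLE = """
<entrySet>
  <entry>
    <interactionList>
      <interaction><names/></interaction>
      <interaction><names/></interaction>
    </interactionList>
  </entry>
</entrySet>
"""

summary = path_summary(SAMPLE)
for path, n in sorted(summary.items()):
    print(n, path)
```

Two `interaction` elements collapse into a single summary node with a count of 2, which is the compression effect described above. Whether that count says anything about importance is exactly the open question.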
An XSD schema doesn't constrain how you use the format - you can often encode the same stuff in different ways while still being compliant.
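A rough sketch of how that kind of divergence shows up structurally: the two fragments below carry the same label in different places, and comparing their sets of label paths exposes the difference. Both fragments are invented for illustration, and I am not claiming either one validates against the real PSI-MI schema; `label_paths` is a hypothetical helper, not part of any tool.

```python
import xml.etree.ElementTree as ET


def label_paths(xml_text):
    """Return the set of element and attribute label paths in a document."""
    paths = set()

    def walk(element, prefix):
        path = prefix + "/" + element.tag
        paths.add(path)
        for name in element.attrib:
            paths.add(path + "/@" + name)
        for child in element:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


# Hypothetical fragments: the same label stored in two different ways.
DOC_A = "<interactor><names><shortLabel>p53</shortLabel></names></interactor>"
DOC_B = (
    "<interactor><attributeList>"
    '<attribute name="shortLabel">p53</attribute>'
    "</attributeList></interactor>"
)

only_a = label_paths(DOC_A) - label_paths(DOC_B)
only_b = label_paths(DOC_B) - label_paths(DOC_A)
print(sorted(only_a))
print(sorted(only_b))
```

The paths that appear in only one of the two sets are where the encodings disagree, even though the content is the same.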
All human-driven. It is basically semantic zooming and data mining of document structures; it doesn't tackle document content at all. I'm sure it's interesting to people managing XML formats (e.g. PSI), but it is of no discernible use to me.