Jusletter IT

What’s in an Interchange Standard for Legislative XML?

  • Author: Radboud Winkels
  • Region: Netherlands
  • Field of law: Legislative informatics
  • Collection: Festschrift Erich Schweighofer
  • Citation: Radboud Winkels, What’s in an Interchange Standard for Legislative XML?, in: Jusletter IT 22 February 2011
To make efficient and effective use of all sources of law electronically available in Europe, with all their different (XML) formats, we need an open interchange standard. Such a standard will enable public administrations to better serve their citizens and organisations, and to link legal information from various levels of authority and from different countries and languages. Moreover, it will enable companies active in the field of legal knowledge systems to design methods and tools that support a much larger market, and it will protect customers of such companies from vendor lock-in. Such an interchange standard should obviously be jurisdiction and language independent. Furthermore, it should refrain from describing elements that are not specific to legal documents, and from describing elements that require interpretation of the content of these documents. Finally, the standard should allow for easy (external) linking of separate knowledge models of the content to the original sources of law. We claim that MetaLex, which was specifically designed with these requirements in mind, meets them best.

Table of Contents

  • 1. Introduction
  • 2. An Example
  • 3. Design Decisions
  • 3.1. The Customer is King
  • 3.2. Isomorphism
  • 4. Conclusions and Discussion
  • 5. References

1. Introduction

[1]
In recent decades, sources of law have become available electronically in all kinds of formats, increasingly in XML. Unfortunately, every provider of information designs and uses its own DTD or schema, and most of these are proprietary as well. This inhibits linking or cross-referencing among sources of law from different providers, or even within the sources of a single provider by anyone other than the owner. In addition, most of these formats originate from commercial publishers, are geared towards presentation and publication needs, and are language and jurisdiction specific. Even internationally operating publishers have separate formats for the countries or jurisdictions in which they are active. The Publications Office of the EU has experience in publishing legislation and directives in all EU languages and has recently moved to XML, but its FORMEX V4 is also heavily aimed at publication and presentation (fn. 1). Other international attempts to arrive at some agreement on describing legal sources have not led to useful results yet (fn. 2).
[2]
Therefore we decided some years ago to design an Open Interchange Standard for legal documents that should be language and jurisdiction independent and not specifically aimed at publication or presentation: MetaLex, first published in September 2001 (fn. 3).
[3]
In previous publications about the MetaLex XML schemas for legislation (Boer et al. 2002, Boer et al. 2004, Winkels et al. 2003) we have repeatedly sketched a method for embedding and linking knowledge representations of the normative meaning of legislation, encoded in the Resource Description Framework (RDF) standard (fn. 4). We never proposed a standard for representing norms in RDF, however, because we felt that there is too little consensus in AI & Law and adjacent fields of study on how to represent norms to define a schema that is acceptable to most of its potential users. Moreover, the claim that the MetaLex XML encoding is independent of jurisdiction is difficult to reconcile with embedding legal interpretation in the standard. We limited the MetaLex XML schema to what we feel is common-ground functionality for all computer applications that use XML representations of legislation, irrespective of language and jurisdiction. It should not and does not contain elements that serve general presentation purposes only; those can be taken care of by other XML definitions designed for that purpose (e.g. WordML). At the other extreme, it should not and does not contain elements that require interpretation of the content of the source of law. MetaLex should capture the essence of legal documents (the grey area in Figure 1). Therefore, we decided to standardize only those metadata that pertain strictly to the legislation as a document and not to the interpretation of its contents. Even the few attributes that define when legislation is in force turn out to be very complex and bring us back to the interpretation problem we were trying to avoid. There are of course a number of requirements on legislative XML that have little to do with legislation as such. Any rule of thumb that works for document mark-up in general, like separation of presentation and structure, is important for legislative XML as well.
Other requirements depend on the envisioned context of use. Legislative XML intended for automatic consolidation will contain more detailed structural mark-up than legislative XML primarily intended for text search engines, which present matching texts at the level of detail that the user is supposed to read as a self-contained text, i.e. without anaphoric references to previous blocks of text.
Figure 1: The aim of MetaLex XML is to capture the essence of legal texts as far as it is concerned with the text per se: the grey area

2. An Example

[4]
The workshops on legislative XML are mostly concerned with requirements that are typical for legislation and semi-legislation, and do not occur elsewhere in the same form. Assigning an unambiguous identity to pieces of legislation for the purpose of reference, and distinguishing different versions of legislation are for instance subjects that have been discussed at length at the workshops.
[5]
Version management is a good example of how legislation differs from other documents. There are two fundamental reasons why version management is more demanding in this field than in other fields:

No automatic progress:

the latest version of a document, whether it is the price list of a shop or a scientific textbook, is usually the best one for current use. In the case of the scientific textbook the underlying assumption is one of progress. This assumption characterizes most knowledge-intensive fields: mankind continually learns new things. It is natural that a reference should point to the latest version, unless the reference states otherwise. In Law things are a little more complex: the rule of law demands that the law is predictable, and change itself is undesirable even if the new regime is an improvement over the old one. Doctrines like stare decisis (to stand by things decided) and jurisprudence constante reflect this bias against change. To balance the requirements of predictability and progress, legal systems usually employ migration regimes that respect old situations that were legal at the time. In areas of law that move relatively fast but have a big impact on people’s lives, like taxation, many regimes may still be operational at a given time. A capital insurance policy that is thirty years old can for instance still be covered by the rules in force at that time. This means that a government that moves to publishing on the internet cannot limit itself to the current corpus of legislation: previous versions have to be covered too. It also means, for instance, that the user of a search engine has to be able to decide in some way which regime applies to his case.

Backwards compatibility:

the way legislation is produced and made accessible is itself subject to the bias against change we noted. Legislation derives its very legitimacy from compliance with higher legislation that restricts the ways in which it can come about. Most practices in legislative document management date from days when even paper was a valuable thing (fn. 5) and computers did not exist. Common practices include publishing modifying acts instead of changing and republishing a consolidated act, a prohibition on renumbering articles and changing titles, and complex rules for determining temporal validity that depend on a number of magical dates. A version management policy for legislative XML cannot deviate from the version management policy already embedded in the legal system. This makes it hard to find common ground between legislators, and similarity of needs will not immediately translate into a standard for legislative XML that can be used in multiple jurisdictions. Even Content Management Systems cannot be copied unchanged by another jurisdiction unless it also wants to adopt part of the legal system they are based on.
[6]
These two reasons are specific instances of a more general consideration on the character of legislative text. Legislation defies the assumption of progress because it is a normative, or prescriptive, text. It tells us what should be the case, and not what is the case. A better theory of what is the case also changes our views of events in the past, which can be re-evaluated against the new theory. New legislation on the other hand usually has no bearing on what should have happened in the past. What happened in the past should be judged against legislation of the past.
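The requirement that past events be judged against the legislation of the past can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each version of a provision carries a single in-force interval. The class, the field names and the example dates are hypothetical and not part of MetaLex, and real temporal validity is considerably more involved, as noted above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Version:
    """One version of a provision; all field names are illustrative."""
    text: str
    in_force_from: date
    in_force_until: Optional[date]  # None means still in force


def version_applicable_on(versions: List[Version], event_date: date) -> Optional[Version]:
    """Return the version in force on event_date, not simply the latest one."""
    for v in versions:
        started = v.in_force_from <= event_date
        not_ended = v.in_force_until is None or event_date < v.in_force_until
        if started and not_ended:
            return v
    return None


# Invented example data: two successive regimes for the same provision.
history = [
    Version("Old regime", date(1980, 1, 1), date(2001, 1, 1)),
    Version("New regime", date(2001, 1, 1), None),
]

# An event from 1985 is judged against the regime of 1985.
print(version_applicable_on(history, date(1985, 6, 1)).text)
```

A lookup of this kind presupposes that old versions remain available and individually identifiable, which is exactly why the current corpus alone is not enough.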
[7]
The same argument about the difference between normal sources of knowledge and sources of law like legislation can be extended to jurisdictions. Legislation has no bearing on events that happened outside the jurisdiction that accepts the legislation as valid, complies with it, and enforces it.

3. Design Decisions

[8]
The question is to what extent it is possible to separate the document from the more abstract role of legislation in a legal system. In thinking about legislative XML it is useful to distinguish legislation as a source of law from legislation as a document and from legislation as a normative system. By legislation as a source of law we mean those properties and parts that have a special meaning to the legal system, separate from the properties and parts of the document, and separate from the interpretation of what the legislation means or how it should be applied. A good example of the distinction between the document and the source of law is the difference in structure. The structure of legislation tends to deviate from normal document templates in a number of ways. The document may have more structural elements than the legislation: in some languages or jurisdictions a paragraph break without caption may occur inside a longer provision for purposes of readability, but a legal professional would never cite individual paragraphs inside a single provision. In Law, the full provision or a specific sentence would be the proper parts to refer to. That is why in MetaLex paragraph breaks inside a provision are not marked up while sentences are, whereas in normal document mark-up languages it is the other way around. For presentation purposes it is still important to allow the paragraph break, and this is accommodated by allowing other elements from XHTML or WordML to be mixed with MetaLex.
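The idea of mixing structural mark-up with presentation elements from another vocabulary can be sketched with XML namespaces. The Python fragment below is only an illustration: the namespace `http://example.org/legal-structure` and the element names `provision` and `sentence` are invented for this example and are not taken from the MetaLex schema. Only the XHTML namespace URI is real.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and element names, chosen for illustration only;
# they are not taken from the MetaLex schema.
LEGAL_NS = "http://example.org/legal-structure"
XHTML_NS = "http://www.w3.org/1999/xhtml"

DOCUMENT = f"""
<provision xmlns="{LEGAL_NS}" xmlns:h="{XHTML_NS}" id="art1">
  <sentence id="art1.s1">The first sentence of the provision.</sentence>
  <h:br/>
  <sentence id="art1.s2">The second sentence, after a visual break.</sentence>
</provision>
"""

root = ET.fromstring(DOCUMENT.strip())


def citable_parts(provision):
    """Return ids of the legally citable parts, skipping presentation elements."""
    ids = [provision.get("id")]
    for child in provision:
        # Only elements in the legal namespace count as citable structure.
        if child.tag.startswith("{" + LEGAL_NS + "}"):
            ids.append(child.get("id"))
    return ids


print(citable_parts(root))
```

The presentational break stays in the document for rendering, but an application that resolves legal references can ignore everything outside the legal namespace.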
[9]
The distinction between the source of law itself and the normative system that the interpretation of the source of law seems to imply is more complicated. The date of enactment of a source of law is conceptually a simple thing, but each jurisdiction has its own set of instruments for manipulating its meaning, including retroaction, delayed action, un-enacting, juridical nullification, or the so-called opportunity principle. Our guiding principle is to try to remove legal interpretation from the schema, precisely because semantically unfounded interpretive information is a barrier to information sharing, both in general and between jurisdictions (fn. 6). MetaLex XML only standardizes structure and designation of identity in legislation. Legislative XML as produced by a publications office will generally steer clear of adding information that can be construed as an officially sanctioned interpretation of the normative contents of legislation. Large user organizations, like a tax administration, may want to add such interpretations in the form of a knowledge representation that serves as the input to expert systems used inside the organization. Generally speaking they will not want to embed it in the legislative XML, because that creates unnecessary maintenance whenever the XML documents are updated by the producer.
[10]
For such purposes there is a standardized method for linking knowledge representations of legislation in the Web Ontology Language (OWL) to the legislation as a (MetaLex) document. The problem is that the way normative statements are represented, and the expressiveness of such a representation, are very closely linked to document metadata (like the date of enactment, or the type of legislation) and the way document elements function as a backing for the normative statements. The date of enactment, for instance, plays a pivotal role in the legal principle lex posterior derogat legi priori (later law derogates earlier law): this is exactly the reason why legislation can manipulate the date of enactment of other legislation, and why this date of enactment is not simply the same as the time of publication. In other documents, dates are rarely very important.
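Why the date of enactment cannot be replaced by the time of publication can be shown with a small Python sketch. It is a deliberately naive illustration: the record type, its field names and the example dates are invented, and the function applies lex posterior in isolation, ignoring every other rule a legal system uses to resolve conflicts.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Act:
    """Illustrative record; field names are assumptions, not MetaLex metadata."""
    name: str
    published: date
    enacted: date  # may be set or shifted by other legislation


def lex_posterior(a: Act, b: Act) -> Act:
    """Return the act that prevails under lex posterior: the one enacted later.

    If both acts share the same date of enactment, b is returned; a real
    system would need a further rule for that case.
    """
    return a if a.enacted > b.enacted else b


# Invented data: Act A is published later but enacted retroactively,
# so its date of enactment lies before that of Act B.
act_a = Act("Act A", published=date(2004, 3, 1), enacted=date(2003, 1, 1))
act_b = Act("Act B", published=date(2003, 9, 1), enacted=date(2003, 10, 1))

prevailing = lex_posterior(act_a, act_b)
latest_published = max((act_a, act_b), key=lambda act: act.published)
print(prevailing.name, latest_published.name)
```

Ordering by publication date would pick a different act here, which is the practical reason the two dates have to be kept apart in the metadata.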

3.1. The Customer is King

[11]
The MetaLex schema is intended as an enabling technology for the development of legal knowledge-based systems. This type of system is increasingly used, but the development of an international market for legal knowledge-based applications is held back by the boundaries between jurisdictions. Part of the problem lies in the sources of law, because correct reference to legal sources as a warrant for legal reasoning is a shared characteristic of legal systems everywhere. As long as each type of source of law has to be handled in a slightly different way, because of slightly different traditions, economies of scale will not develop. This is an issue large institutional users of legislation are aware of, but legislators fail to take action beyond setting a standard for their own jurisdiction.

3.2. Isomorphism

[12]
In Computer Science & Law it is generally accepted that knowledge representations that embody statutes should be isomorphic to the statute, in the sense of the word introduced by Bench-Capon (Bench-Capon & Coenen, 1992). The required granularity of this isomorphism depends on the purpose of the system. The primary purpose of the relation between source of law and knowledge fragment is as a warrant of the knowledge fragment. A secondary purpose of isomorphism between knowledge representation and source of law is maintenance. To Bench-Capon this use is even the most important one. The project in the Dutch Tax and Customs Administration described in Boer et al. (2004) is a good example of this use of isomorphism. If a statute changes and an organization uses many systems that are based on the statute, each system has to be adapted to reflect the new rules. Tax law tends to change quite often and makes extensive use of temporary migratory regimes. This means that systems are modified all the time. We have proposed a model for version management of sources of law based on MetaLex (Boer et al. 2004). Isomorphic reference requires that the legal structure is correct for referencing, that references are versioned, and that it is possible to construct composite references to multiple targets.
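The maintenance use of isomorphism can be sketched as follows. The Python fragment is a minimal illustration under stated assumptions, not the model of Boer et al. (2004): the element identifiers, version labels and rule names are invented, and the knowledge fragments are kept outside the legislative document and merely point into it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Reference:
    """A versioned pointer to one legally relevant part of a source of law."""
    element_id: str  # hypothetical identifier of a provision or sentence
    version: str     # hypothetical version label


@dataclass(frozen=True)
class KnowledgeFragment:
    """A rule in some knowledge base, warranted by one or more references."""
    name: str
    warrants: Tuple[Reference, ...]  # composite reference to multiple targets


def fragments_to_review(fragments, changed_element_id, new_version):
    """List fragments whose warrant cites an older version of a changed element."""
    return [
        f.name
        for f in fragments
        if any(r.element_id == changed_element_id and r.version != new_version
               for r in f.warrants)
    ]


# Invented example data: two rules with their warrants.
fragments = [
    KnowledgeFragment("deduction_rule", (Reference("art3.s1", "v1"),
                                         Reference("art7", "v1"))),
    KnowledgeFragment("exemption_rule", (Reference("art5", "v1"),)),
]

# Article 7 receives a new version; only rules that cite it need attention.
print(fragments_to_review(fragments, "art7", "v2"))
```

Because each reference names both a target and a version, a change in the source of law can be traced to the affected rules without inspecting the rest of the knowledge base.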

4. Conclusions and Discussion

[13]
We have argued that we need an interchange XML standard for describing legal documents in order to make efficient and effective use of all the legal data electronically available nowadays. Such a standard should be language and jurisdiction independent, but law specific. It should capture the essence of legal documents, ignoring characteristics of documents in general and elements that come from interpreting the content of the documents. Furthermore, it should enable external knowledge models about (the content of) these legal documents to link to the original sources at the right level of granularity, i.e. legally relevant subparts. We have further argued that MetaLex meets all these requirements and is therefore a good candidate for such an interchange standard. Other initiatives have arisen in Europe in recent years to specify legal documents, notably legislation, in XML and make them accessible through the web. We already mentioned the FORMEX definition of the EU Publications Office. In Italy, the Norme-in-Rete (NIR) project defined several DTDs for Italian legislation (from very strict and prescriptive to loose) and identifiers through URNs (fn. 7). In Denmark the government is working on LexDania, and the Swiss and Austrians are also busy trying to provide better access to their sources of law with the use of XML. Most of these initiatives are however very much tailored to the national situation, i.e. language and jurisdiction specific. Only FORMEX and the Swiss have to cater for many languages. Moreover, all approaches mix presentation elements and legal text elements in their definitions (fn. 7). The NIR project also considers mixing content descriptions into its XML (a classification and representation of provisions).

5. References

T.J.M. Bench-Capon, and F.P. Coenen. Isomorphism and Legal Knowledge Based Systems. AI and Law, Vol 1, No 1, pp. 65-86 (1992).
A. Boer, R. Hoekstra, R. Winkels, T. van Engers, and F. Willaert. MetaLex: Legislation in XML. In: T. Bench-Capon, A. Daskalopulu, and R. Winkels, editors, Legal Knowledge and Information Systems (Jurix 2002), pages 1–10, IOS Press, Amsterdam (2002).
A. Boer, R. Winkels, T. van Engers, and E. de Maat. A content management system based on an event-based model of version management information in legislation. In: T. Gordon, editor, Legal Knowledge and Information Systems. Jurix 2004: The Seventeenth Annual Conference. pages 19–28, IOS Press, Amsterdam (2004).
R. Winkels, A. Boer, and R. Hoekstra. MetaLex: An XML Standard for Legal Documents. In Proceedings of the XML Europe Conference, London (UK) (2003).



Radboud Winkels, Associate Professor, Leibniz Center for Law, University of Amsterdam, PO Box 1030, 1000 BA Amsterdam, Netherlands, R.G.F.Winkels@uva.nl, http://leibnizcenter.org/~winkels/


  1. http://formex.publications.eu.int.
  2. LegalXML’s main activities revolve around court transactions and intelligence work; the European counterpart LeXML has so far led to one DTD for German case law.
  3. In 2006 we started a new attempt at an international open interchange standard for sources of law, through CEN, the European Normalisation Institute, with input from the previous MetaLex, Norme-in-Rete and Akoma Ntoso. This resulted in two CEN Workshop Agreements. More information at: www.metalex.eu.
  4. www.w3.org/RDF.
  5. Or reading skills for that matter; the mandatory opening clause of acts in the Netherlands contains the phrase «To all who read this or have it read to them we send greetings».
  6. The harsh reality we have to live with is that metadata is usually not added by trained legal knowledge engineers, or lawyers.
  7. www.normeinrete.it.