EpiData XML File Format Specification

This document contains the specification for the EpiData XML File format (version 1).

The number of alphabets and character sets in use on the supported platforms (Linux, Mac, Windows) across countries is so large that data and documentation need to be saved in a uniform way. One option would be to use a generally available and documented format such as the ODF standard, Stata binary files, or the Data Documentation Initiative (DDI) format. Another would be to maintain the well-known REC + CHK file formats used in EpiData (see specification) and add some new specifications. For more discussion of formatted data file structures, see here or consult the DDI format specifications.

The main requirements for the data format are:

  • speed of reading and writing.
  • uniformity of specification across operating systems and countries.
  • possibility of fixing data file errors in a standard text file editor.
  • minimal overhead due to general data format specification requirements.
  • no need for the end user to consider where a given file came from.

Following some experimentation in mid-2009 and a review of the DDI and ODF standards, it was decided, because of speed and overhead issues, to create a simplified, EpiData-specific XML data file structure. Other data formats will be supported through export and/or import functionality.


The purpose of this wiki page is not to explain in detail how the XML schema works or how it is constructed. An introduction to the use of XML Schemas (also known as .xsd files) can be found at W3Schools, and the specification for XML Schema files can be found on the W3C website.

This page is intended to explain some of the constructs that cannot be specified in the schema language. These include logical constraints (also known as co-occurrence constraints) as well as text within elements that must follow a special pattern.
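As a rough illustration of what such rules look like in practice, the sketch below checks one co-occurrence constraint and one text pattern in application code, after ordinary schema validation would have run. The element names (`Fields`, `Field`, `Default`), the attributes (`id`, `type`, `decimals`) and both rules are hypothetical and are not taken from the EpiData schema. They only show the kind of check meant here.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical document: element and attribute names are invented for
# illustration and are NOT taken from the EpiData schema.
SAMPLE = """\
<Fields>
  <Field id="V1" type="float" decimals="2"><Default>12.50</Default></Field>
  <Field id="V2" type="float"><Default>abc</Default></Field>
  <Field id="V3" type="string"/>
</Fields>
"""

# Assumed text pattern for a float default: digits, a dot, digits.
FLOAT_PATTERN = re.compile(r"\d+\.\d+")


def check_fields(xml_text):
    """Return a list of (field id, message) for every rule violation."""
    problems = []
    root = ET.fromstring(xml_text)
    for field in root.iter("Field"):
        field_id = field.get("id")
        if field.get("type") != "float":
            continue
        # Co-occurrence constraint (assumed): type="float" requires "decimals".
        if field.get("decimals") is None:
            problems.append((field_id, "float field without decimals"))
        # Pattern constraint (assumed) on the text of the Default element.
        default = field.find("Default")
        if default is not None and not FLOAT_PATTERN.fullmatch(default.text or ""):
            problems.append((field_id, "default does not match float pattern"))
    return problems


if __name__ == "__main__":
    for field_id, message in check_fields(SAMPLE):
        print(field_id, message)
```

In this sketch only field V2 is reported, once for the missing attribute and once for the malformed default. A real reader or writer would apply whatever rules this page actually lists for each element.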

The full documentation for our schema file can be found here. It is an autogenerated set of HTML pages produced with the program oXygen XML and is continuously being updated with better descriptions etc.

techdocs/xmlformat/specification.txt · Last modified: 2011/12/14 14:02 by torsten.bonde.christiansen