IVOA
International Virtual Observatory Alliance


VOTable Format Definition
Version 1.2

IVOA Proposed Recommendation 2009-07-10

This version:

Latest version:
http://www.ivoa.net/Documents/latest/VOT.html

Previous version(s):
http://www.ivoa.net/Documents/cover/VOT-20040811.html   V1.2 Working Draft (2009-06-13)
http://www.ivoa.net/Documents/cover/VOT-20040811.html   V1.1 (2004-08-11)
http://www.ivoa.net/Documents/PR/VOTable/VOTable-20031017.html   V1.0 (2002-04-15)

Editor(s):
François Ochsenbein

Author(s):
    François Ochsenbein   Observatoire Astronomique de Strasbourg, France
    Roy Williams   California Institute of Technology, USA
with contributions from:
    Clive Davenhall   University of Edinburgh, UK
    Daniel Durand   Canadian Astronomy Data Centre, Canada
    Pierre Fernique   Observatoire Astronomique de Strasbourg, France
    David Giaretta   Rutherford Appleton Laboratory, UK
    Robert Hanisch   Space Telescope Science Institute, USA
    Tom McGlynn   NASA Goddard Space Flight Center, USA
    Alex Szalay   Johns Hopkins University, USA
    Mark B. Taylor   Physics, Bristol University, UK
    Andreas Wicenec   European Southern Observatory, Germany


Abstract

This document describes the structures making up version 1.2 of the VOTable standard, which supersedes version 1.1 of 11 August 2004. The differences between versions 1.1 and 1.2 are summarized in the last section.

The main part of this document describes the adopted part of the VOTable standard; it is followed by appendices presenting extensions which have been proposed and/or discussed, but which are not part of the standard.

Status of this document

This is an IVOA Proposed Recommendation made available for public review. It is appropriate to reference this document only as a recommended standard that is under review and which may be changed before it is accepted as a full recommendation.

Comments on this document should be sent to votable@ivoa.net, a mailing list with a public archive.

A list of current IVOA Recommendations and other technical documents can be found at
http://ivoa.net/Documents/

Acknowledgments

This document is based on the W3C documentation standards, but has been adapted for the IVOA.


1  Introduction

The VOTable format is an XML standard for the interchange of data represented as a set of tables. In this context, a table is an unordered set of rows, each of a uniform structure, as specified in the table description (the table metadata). Each row in a table is a sequence of table cells, and each of these contains either a primitive data type, or an array of such primitives. VOTable is derived from the Astrores format [1], itself modeled on the FITS Table format [2]; VOTable was designed to be close to the FITS Binary Table format.

1.1  Why VOTable?

Astronomers have always been at the forefront of developments in information technology, and funding agencies across the world have recognized this by supporting the Virtual Observatory movement, in the hopes that other sciences and business can follow their lead in making online data both interoperable and scalable.

VOTable is designed as a flexible storage and exchange format for tabular data, with particular emphasis on astronomical tables.

Interoperability is encouraged through the use of standards (XML). The XML fabric allows applications to validate an input document easily, and facilitates transformations through XSLT (Extensible Stylesheet Language Transformations) engines.

Grid Computing

VOTable has built-in features for big-data and Grid computing. It allows metadata and data to be stored separately, with the remote data linked. Processes can then use metadata to "get ready" for their input data, or to organize third-party or parallel transfers of the data. Remote data allow the metadata to be sent in email and referenced in documents without pulling the whole dataset with it: just as we are used to the idea of sending a pointer to a document (URL) in place of the document, so we can now send metadata-rich pointers to data tables in place of the tables themselves. The remote data is referenced with the URL syntax protocol://location, meaning that arbitrarily complex protocols are allowed.

When we are working with very large tables in a distributed-computing environment ("the Grid"), the data stream between processors, with flows being filtered, joined, and cached in different geographic locations. It would be very difficult if the number of rows of the table were required in the header: we would need to stream the whole table into a cache, compute the number of rows, then stream it again for the computation. In the Grid-data environment, the component in short supply is not the computers, but rather these very large caches. Furthermore, these remote data streams may be created dynamically by another process or cached in temporary storage: for this reason VOTable can express that remote data may not be available after a certain time (the expires attribute). Data on the net may require authentication for access, so VOTable allows expression of password or other identity information (the rights attribute).

Data Storage: Flexible and Efficient

The data part in a VOTable may be represented using one of three different formats: TABLEDATA, FITS and BINARY. TABLEDATA is a pure XML format so that small tables can be easily handled in their entirety by XML tools. The FITS binary table format is well-known to astronomers, and VOTable can be used either to encapsulate such a file, or to re-encode the metadata; unfortunately it is difficult to stream FITS, since the dataset size is required in the header (NAXIS2 keyword), and FITS requires a specification up front of the maximum size of its variable-length arrays. The BINARY format is supported for efficiency and ease of programming: no FITS library is required, and the streaming paradigm is supported.

We hope that VOTable can be used in different ways, as a data storage and transport format, and also as a way to store metadata alone (table structure only). In the latter case, we can imagine a VOTable structure being sent to a server, which can then open a high-bandwidth connection to receive the actual data, using the previously-digested structure as a way to interpret the stream of bytes from the data socket. VOTable can be used for small numbers of small records (pure XML tables), or for large numbers of simple records (streaming data), or it can be used for small numbers of larger objects. In the latter case, there will be software to spread large data blocks among multiple processors on the Grid. Currently the most complex structure that can be in a VOTable Cell is a multidimensional array.

1.2  XML Conventions

VOTable is constructed with XML (eXtensible Markup Language), a powerful standard for structured data throughout the Internet industries. It derives from SGML, a standard used in the publishing industry and for technical documentation for many years. XML consists of elements and payload, where an element consists of a start tag (the part in angle brackets), the payload, and an end tag (with angle brackets and a slash). Elements can contain other elements. Elements can also bear attributes (keyword-value combinations).

The payload may be in two forms: parsed or unparsed character data. Examples are:

<text>Fran&#231;ois</text>
<text><![CDATA[ a & (b <= c) ]]></text>

In the first example, the sequence &#231; is interpreted as part of the ISO/IEC 10646 character set (Unicode), and translates to an accented character, so that the text is "François". The second example uses the special CDATA sequence so that the characters <, >, and & can be used without interpretation; in this case, any ASCII characters are allowed except the terminating sequence ]]>. For more information, see any book on XML.
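
As an illustration, the two payload forms above can be checked with any XML parser. The minimal sketch below uses Python's standard xml.etree.ElementTree module; the choice of parser and the variable names are assumptions made for this example, not something the VOTable standard prescribes.

```python
import xml.etree.ElementTree as ET

# Parsed character data: the parser expands the reference &#231;.
parsed = ET.fromstring("<text>Fran&#231;ois</text>")

# Unparsed character data: the CDATA payload is passed through as-is,
# so & and < need no escaping inside it.
unparsed = ET.fromstring("<text><![CDATA[ a & (b <= c) ]]></text>")

name = parsed.text          # the accented name, as a Unicode string
expression = unparsed.text  # the raw expression, surrounding blanks included
```

After parsing, both payloads are ordinary strings; the application never sees the character reference or the CDATA delimiters.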

1.3  Syntax policy

Following the general XML rule, element and attribute names are case-sensitive and have to be used with the specified capitalisation. For VOTable, we have adopted the convention that element names are spelled in uppercase and attribute names in lowercase (with an exception for the ID attribute). Element and attribute names are further distinguished in this paper by being typed with a fixed-width font.

2  Data Model

In this section we define the data model of a VOTable, and in the next sections its syntax when expressed as XML. The data model of VOTable can be expressed as:

 VOTable = hierarchy of Metadata + associated TableData, arranged as a set of Tables
 Metadata = Parameters + Infos + Descriptions + Links + Fields + Groups
 Table = list of Fields + TableData
 TableData = stream of Rows
 Row = list of Cells
 Cell = Primitive
     or variable-length list of Primitives
     or multidimensional array of Primitives
 Primitive = integer, character, float, floatComplex, etc (see table of primitives below).

Metadata is divided into that which concerns the table itself (parameters), and the definitions of the fields (or column attributes) of the table. Each FIELD represents the metadata that can be found at the top of the column in a paper version of the table: in the example introduced in the section below, the first FIELD has its name attribute set to "RA". The Field can be thought of as a class definition, and the table cells below it are the instances of that class.

A parameter (PARAM) is similar to a FIELD, except that it has a value attribute. Parameters can be seen as "constant columns", containing for instance FITS keywords or any other information pertaining to the table itself or its environment, such as the Telescope parameter in the example of section 3.1.

An informative parameter (INFO) is a restricted form of the PARAM -- it is always understood as a string (i.e. datatype="char" and arraysize="*" are implied).

The ordered list of Fields at the top of the table thus provides a template for a Row object (also called a record). The template allows interpretation of the data in the Row. The record is a set of Cells, with the number and order of Cells the same for each Row, and the same as the number of Fields defined in the Metadata.

In VOTable, there is generally no advance specification of the number of rows in the table: this is to allow streaming of large tables, as discussed above. However, if the number of rows is known, it may be specified in a dedicated nrows attribute.
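
Because the row count need not be announced, a reader can consume rows as they arrive. The following is a minimal sketch of such a reader, assuming the TABLEDATA serialisation and the version 1.2 namespace used in the example of section 3.1; the function name iter_rows and the sample document are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Namespace of VOTable 1.2 documents, in ElementTree's {uri}tag notation.
NS = "{http://www.ivoa.net/xml/VOTable/v1.2}"

def iter_rows(stream):
    """Yield each TR as a list of TD strings, without knowing the row count."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == NS + "TR":
            yield [td.text for td in elem.findall(NS + "TD")]
            elem.clear()  # release the cells of the row just consumed

# Hypothetical two-row document, used only to exercise the reader.
SAMPLE = b"""<VOTABLE version="1.2" xmlns="http://www.ivoa.net/xml/VOTable/v1.2">
<RESOURCE><TABLE>
<FIELD name="RA" datatype="float"/>
<DATA><TABLEDATA>
<TR><TD>010.68</TD></TR>
<TR><TD>287.43</TD></TR>
</TABLEDATA></DATA>
</TABLE></RESOURCE></VOTABLE>"""

rows = list(iter_rows(io.BytesIO(SAMPLE)))
```

The reader never needs the total number of rows; when the nrows attribute is present it can serve as a hint, for instance to pre-allocate storage.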

From Version 1.1, columns may be logically grouped, so that it is possible to define table substructures made of column associations. Such an association is declared as a GROUP, which typically contains column references (FIELDref) and associated parameters (PARAM).

2.1  Primitives

datatype          Meaning              FITS   Bytes
"boolean"         Logical              "L"    1
"bit"             Bit                  "X"    *
"unsignedByte"    Byte (0 to 255)      "B"    1
"short"           Short Integer        "I"    2
"int"             Integer              "J"    4
"long"            Long integer         "K"    8
"char"            ASCII Character      "A"    1
"unicodeChar"     Unicode Character           2
"float"           Floating point       "E"    4
"double"          Double               "D"    8
"floatComplex"    Float Complex        "C"    8
"doubleComplex"   Double Complex       "M"    16

Each Cell is composed of Primitives, each of which is a datatype of fixed-length binary representation, as listed in the accompanying table. Cells may consist of a single Primitive (this is the default), or of an array (possibly multidimensional) of Primitives (see the next section).

Except for the Bit type, each primitive has the fixed length in bytes given in the table. Bit scalars and arrays are stored in the minimum number of bytes feasible (so that b bits take the integer part of (b+7)/8 bytes). These primitives are described in more detail in section 6.
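
The byte counts above can be written down directly. The sketch below is illustrative only: the dictionary copies the Bytes column of the table of primitives, and the helper names bit_array_bytes and cell_bytes are hypothetical.

```python
# Fixed sizes in bytes, copied from the table of primitives.
PRIMITIVE_BYTES = {
    "boolean": 1, "unsignedByte": 1, "short": 2, "int": 4, "long": 8,
    "char": 1, "unicodeChar": 2, "float": 4, "double": 8,
    "floatComplex": 8, "doubleComplex": 16,
}

def bit_array_bytes(nbits):
    """Bytes needed for nbits bits: the integer part of (nbits + 7) / 8."""
    return (nbits + 7) // 8

def cell_bytes(datatype, count=1):
    """Storage for a cell holding count primitives of the given datatype."""
    if datatype == "bit":
        return bit_array_bytes(count)
    return PRIMITIVE_BYTES[datatype] * count
```

For example, a cell of 9 bits occupies 2 bytes, and a cell of 3 doubles occupies 24 bytes.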

VOTables support two kinds of characters: ASCII 1-byte characters and Unicode (UCS-2) 2-byte characters. Unicode is a way to represent characters that is an alternative to ASCII. It uses two bytes per character instead of one, it is strongly supported by XML tools, and it can handle a large variety of international alphabets. Therefore VOTable supports not only ASCII strings (datatype="char"), but also Unicode (datatype="unicodeChar").

Note that strings are not a primitive type: strings are represented in VOTable as an array of characters.

2.2  Columns as Arrays

A table cell can contain an array of a given primitive type, with a fixed or variable number of elements; the array may even be multidimensional. For instance, the position of a point in a 3D space can be defined by the following:

<FIELD ID="point_3D" datatype="double" arraysize="3"/>

and each cell corresponding to that definition must contain exactly 3 numbers. An asterisk (*) may be appended to indicate a variable number of elements in the array, as in:

<FIELD ID="values" datatype="int" arraysize="100*"/>

where it is specified that each cell corresponding to that definition contains 0 to 100 integer numbers. The number may be omitted to specify an unbounded array (in practice up to ≈2×10⁹ elements).

A table cell can also contain a multidimensional array of a given primitive type. This is specified by a sequence of dimensions separated by the x character, with the first dimension changing fastest; as in the case of a simple array, the last dimension may be variable in length. As an example, the following definition declares a table cell which may contain a set of up to 10 images, each of 64x64 bytes:

<FIELD ID="thumbs" datatype="unsignedByte" arraysize="64x64x10*"/>

Strings, which are defined as a set of characters, can therefore be represented in VOTable as a fixed- or variable-length array of characters:

<FIELD name="unboundedString" datatype="char" arraysize="*"/>

A 1D array of strings can be represented as a 2D array of characters, but given the logic above, it is possible to define a variable-length array of fixed-length strings, but not a fixed-length array of variable-length strings. A convention to express an array of variable-length strings was proposed (see the appendix) but is not part of this standard.
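
The arraysize syntax described in this section is simple enough to decode in a few lines. The following sketch is one possible reading of the rules above, not a reference implementation; the function name parse_arraysize and its return convention are assumptions of this example.

```python
def parse_arraysize(arraysize):
    """Split an arraysize string into (dims, variable).

    dims lists the dimensions, fastest-varying first. When variable is
    True, the last entry is the upper bound of the varying dimension,
    or None if no bound was given.
    """
    variable = arraysize.endswith("*")
    body = arraysize[:-1] if variable else arraysize
    # An empty piece (as in "*" or "64x64x*") means "no bound given".
    dims = [int(part) if part else None for part in body.split("x")]
    return dims, variable
```

With this convention, "3" gives ([3], False), "100*" gives ([100], True), "*" gives ([None], True), and "64x64x10*" gives ([64, 64, 10], True).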

2.3  Compatibility with FITS Binary Tables

VOTable is closely compatible with the FITS Binary Table format. Henceforth, we shall abbreviate "FITS Binary Table and its Conventions" simply by the word "FITS". Given a FITS file that represents a binary table, the header may be converted to VOTable, with a pointer to the original file, or with the original file included directly in VOTable. Since the original file is still present, it is clear that no data has been lost. A PARAM element can be used to hold any FITS keyword with its value and comment string.

We might ask two more significant questions, about how much of the FITS header and data can be represented in VOTable. The answer is that there is considerable overlap.

For instance, the recommended formatting for display of the data is expressed by the non-mandatory TDISP keyword: for example F12.4 means that 12 characters are to be used, with 4 decimal places. This is rendered in VOTable by the attributes width and precision which, combined with datatype, are semantically identical to the TDISP keyword.
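
As a rough illustration of this correspondence, the sketch below extracts a width and a number of decimal places from simple TDISP values such as F12.4 or I5. It is an assumption-laden simplification: only the plain "letters, width, optional .decimals" form is handled, and the function name tdisp_to_width_precision is hypothetical.

```python
import re

# Simple TDISP forms only: format letters, a width, optional ".decimals".
_TDISP = re.compile(r"[A-Z]+(\d+)(?:\.(\d+))?")

def tdisp_to_width_precision(tdisp):
    """Return (width, precision) for a simple TDISP value.

    precision is None when the TDISP value carries no decimal count.
    """
    match = _TDISP.fullmatch(tdisp.strip().upper())
    if match is None:
        raise ValueError("unsupported TDISP value: %r" % tdisp)
    width = int(match.group(1))
    decimals = match.group(2)
    return width, (int(decimals) if decimals is not None else None)
```

Here F12.4 yields a width of 12 and a precision of 4, while I5 yields a width of 5 and no precision.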

What can FITS do but not VOTable?

FITS has complex semantics, with many conventions (see e.g. the Registry of FITS Conventions [11]) which have been developed mainly to cope with the increasing complexity of astronomical instrumentation. In the framework of the Virtual Observatory, this complexity is described by means of data models, and from version 1.1 VOTable can refer to these data models by means of the utype attribute described in section 4.6.

What can VOTable do but not FITS?

VOTable supports the separation of data from metadata, the streaming of tables, and other ideas from modern distributed computing. It bridges two ways to express structured data: XML and FITS. It tries (through the UCD - see below) to express formally the semantic content of a parameter or field. It has the hierarchy and flexibility of XML: using GROUP elements introduced in version 1.1, columns in a VOTable can be grouped in arbitrarily complex hierarchies; and the ID attribute can be used in XML to enable what are essentially pointers. FITS does not handle Unicode (extended alphabet) characters.

It should be noted that the transformation of FITS to VOTable is meant to be reversible: any FITS table can be converted to a VOTable without loss of information, and the resulting VOTable can be converted back to a FITS table also without loss of information. However, it is possible to create new VOTables which cannot be converted to FITS tables without loss of information.

3  The VOTable Document Structure

The overall VOTable document structure is described and controlled by its XML Schema referenced at the top. That means that documents claiming to represent VOTables must include the reference to the VOTable schema, and pass through W3C XML Schema validators without error; notice that the validation is a necessary, but not sufficient, condition for correctness. The XML Schema of this version 1.2 is included in Appendix B, and is illustrated in section 7.

An example is used here to illustrate the components of a VOTable document described in the following sections. Basically, a VOTable document consists of a single all-containing element called VOTABLE, which contains descriptive elements and global definitions (DESCRIPTION, GROUP, PARAM, INFO), followed by one or more RESOURCE elements. Each RESOURCE element contains one or more TABLE elements, and possibly other RESOURCE elements.

The TABLE element, the actual heart of VOTable, contains a description of the columns and parameters (described in the next section) followed by the data values (described in the following section).

3.1  Example

This simple example of a VOTable document lists 3 galaxies with their position, velocity and error, and their estimated distance. It contains a reference to the Space-Time Coordinate data model (STC, A. Rots [9]) implicitly used to specify the system of coordinates used to locate the observed galaxies in the sky: this is an essential difference from the previous versions of VOTable which made use of a COOSYS element for this specification.

<?xml version="1.0"?>
<VOTABLE version="1.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xmlns="http://www.ivoa.net/xml/VOTable/v1.2" 
 xmlns:stc="http://www.ivoa.net/xml/STC/v1.30" >
  <RESOURCE name="myFavouriteGalaxies">
    <TABLE name="results">
      <DESCRIPTION>Velocities and Distance estimations</DESCRIPTION>
      <GROUP ID="J2000" utype="stc:AstroCoords">
        <PARAM datatype="char" arraysize="*" ucd="pos.frame" name="cooframe"
             utype="stc:AstroCoords.coord_system_id" value="UTC-ICRS-TOPO" />
        <FIELDref ref="col1"/>
        <FIELDref ref="col2"/>
      </GROUP>
      <PARAM name="Telescope" datatype="float" ucd="phys.size;instr.tel" 
             unit="m" value="3.6"/>
      <FIELD name="RA"   ID="col1" ucd="pos.eq.ra;meta.main" ref="J2000" 
             utype="stc:AstroCoords.Position2D.Value2.C1"
             datatype="float" width="6" precision="2" unit="deg"/>
      <FIELD name="Dec"  ID="col2" ucd="pos.eq.dec;meta.main" ref="J2000" 
             utype="stc:AstroCoords.Position2D.Value2.C2"
             datatype="float" width="6" precision="2" unit="deg"/>
      <FIELD name="Name" ID="col3" ucd="meta.id;meta.main" 
             datatype="char" arraysize="8*"/>
      <FIELD name="RVel" ID="col4" ucd="spect.dopplerVeloc" datatype="int"
             width="5" unit="km/s"/>
      <FIELD name="e_RVel" ID="col5" ucd="stat.error;spect.dopplerVeloc" 
             datatype="int" width="3" unit="km/s"/>
      <FIELD name="R" ID="col6" ucd="pos.distance;pos.heliocentric" 
             datatype="float" width="4" precision="1" unit="Mpc">
        <DESCRIPTION>Distance of Galaxy, assuming H=75km/s/Mpc</DESCRIPTION>
      </FIELD>
      <DATA>
        <TABLEDATA>
        <TR>
          <TD>010.68</TD><TD>+41.27</TD><TD>N  224</TD><TD>-297</TD><TD>5</TD><TD>0.7</TD>
        </TR>
        <TR>
          <TD>287.43</TD><TD>-63.85</TD><TD>N 6744</TD><TD>839</TD><TD>6</TD><TD>10.4</TD>
        </TR>
        <TR>
          <TD>023.48</TD><TD>+30.66</TD><TD>N  598</TD><TD>-182</TD><TD>3</TD><TD>0.7</TD>
        </TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>

This simple VOTABLE document shows a single RESOURCE made of a single TABLE; the table is made of 6 columns, each described by a FIELD, and has one additional PARAM parameter (the Telescope). The actual rows are listed in the DATA part of the table, here in XML format (introduced by TABLEDATA); each cell is marked by a TD element, and the cells follow the same order as their FIELD descriptions: RA, Dec, Name, RVel, e_RVel, R.
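
To show how an application might consume such a document, the sketch below reads a trimmed copy of the example (three columns, two rows) with Python's standard xml.etree.ElementTree module. The trimming, the variable names and the choice of parser are assumptions of this illustration; cell values are kept as strings, leaving conversion according to datatype to the caller.

```python
import xml.etree.ElementTree as ET

# Prefix "v" is local to this script; it maps to the VOTable 1.2 namespace.
NS = {"v": "http://www.ivoa.net/xml/VOTable/v1.2"}

# Trimmed copy of the example document (fewer columns and rows).
DOC = """<VOTABLE version="1.2" xmlns="http://www.ivoa.net/xml/VOTable/v1.2">
  <RESOURCE name="myFavouriteGalaxies">
    <TABLE name="results">
      <PARAM name="Telescope" datatype="float" unit="m" value="3.6"/>
      <FIELD name="RA" ID="col1" datatype="float" unit="deg"/>
      <FIELD name="Dec" ID="col2" datatype="float" unit="deg"/>
      <FIELD name="Name" ID="col3" datatype="char" arraysize="8*"/>
      <DATA>
        <TABLEDATA>
          <TR><TD>010.68</TD><TD>+41.27</TD><TD>N  224</TD></TR>
          <TR><TD>287.43</TD><TD>-63.85</TD><TD>N 6744</TD></TR>
        </TABLEDATA>
      </DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""

root = ET.fromstring(DOC)
table = root.find("v:RESOURCE/v:TABLE", NS)

# PARAMs carry their value in an attribute; FIELDs only describe columns.
params = {p.get("name"): p.get("value") for p in table.findall("v:PARAM", NS)}
names = [f.get("name") for f in table.findall("v:FIELD", NS)]

# Cells follow the order of the FIELD declarations, so zip pairs them up.
rows = [
    dict(zip(names, (td.text for td in tr.findall("v:TD", NS))))
    for tr in table.findall("v:DATA/v:TABLEDATA/v:TR", NS)
]
```

Each row becomes a dictionary keyed by the FIELD names, which relies on the rule that the cells of a row appear in the same order as the FIELD declarations.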

3.2  name, ID and ref attributes

Most of the elements defined by VOTable, such as a RESOURCE, a TABLE, a PARAM or a FIELD, may or must bear names. The content of the name attribute is defined as the XML token type, that is a string of characters in which whitespace is not significant (no leading or trailing spaces, no multiple spaces): name="NVSS flux(1.4GHz)" is therefore a valid name.

The ID and ref attributes are defined as XML types ID and IDREF respectively. It means that the contents of ID is an identifier which must be unique throughout a VOTable document, and that the contents of the ref attribute represents a reference to an identifier which must exist in the VOTable document. In other terms, if ref="myStar" is found in one element, there must exist an element in the same document with the ID="myStar" attribute. The XML standard moreover specifies that an ID type is a string beginning with a letter or underscore (_), followed by a sequence of letters, digits, or any of the punctuation characters . (dot), - (dash) or _ (underscore), but not the : (colon). Therefore ID="1" is not valid, but ID="_1" or ID="ref.1" are both valid.
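
The two constraints above (well-formed identifiers, and references that resolve) can be checked mechanically. The sketch below is a simplified illustration: the regular expression covers ASCII letters only, whereas XML accepts a wider range of letters, and the function names and sample document are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# ASCII-only approximation of the rule stated above: a letter or underscore,
# then letters, digits, dots, dashes or underscores (no colon).
ID_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")

def is_valid_id(value):
    """True if value looks like a legal ID under the simplified rule."""
    return ID_PATTERN.fullmatch(value) is not None

def dangling_refs(root):
    """Return the sorted ref values that match no ID in the document."""
    ids = {e.get("ID") for e in root.iter() if e.get("ID") is not None}
    refs = {e.get("ref") for e in root.iter() if e.get("ref") is not None}
    return sorted(refs - ids)

# Hypothetical fragment in which ref="col9" points at nothing.
DOC = """<VOTABLE><RESOURCE><TABLE>
<GROUP ID="J2000"><FIELDref ref="col1"/><FIELDref ref="col9"/></GROUP>
<FIELD ID="col1" name="RA" datatype="float" ref="J2000"/>
</TABLE></RESOURCE></VOTABLE>"""

missing = dangling_refs(ET.fromstring(DOC))
```

A validating XML parser performs both checks itself when the schema is applied; the sketch only makes the rules explicit.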

The ID attribute is therefore required in the elements which have to be referenced, but elements having an ID attribute need not be referenced. In VOTable 1.2, it is moreover recommended to place the element bearing an ID attribute before any reference to it whenever possible. While the ID attribute has to be unique in a VOTable document, the name attribute need not be. It is however recommended, as a good practice, to assign unique names within a TABLE element. This recommendation means that, between a TABLE and the corresponding /TABLE tags, name attributes of FIELD, PARAM and optional GROUP elements should be all different.

3.3  VOTABLE element

The VOTABLE element may contain definitions consisting of a DESCRIPTION, followed by any mixture of parameters and informative notes, possibly structured in groups. These elements represent values which are meaningful over all tables included in a VOTABLE document -- definitions specific to a RESOURCE (section 3.4) or a TABLE (section 3.6) are better placed within their most appropriate element.

Note that version 1.0 of VOTable required the usage of a DEFINITIONS element holding the VOTable global definitions -- this usage has been deprecated since version 1.1.

Space and Time coordinates

An essential difference from version 1.1 of VOTable concerns the way version 1.2 describes the coordinate system: the dedicated COOSYS element defined in VOTable 1.0 is deprecated in this version (1.2) in favor of a more generic facility for referring to external data models.

The coordinates -- space and time, and possibly the spectral and redshift parameters -- are described in the STC model (A. Rots, see [9]), which specifies the various components and systems used in Astronomy to locate events in time and space with a high accuracy.

From version 1.2, VOTable suggests making use of the GROUP element and the utype attribute (section 4.6) to describe, with all the required accuracy, the coordinate systems used in the data conveyed in a VOTable. A dedicated note on Referencing STC in VOTable [8] describes in more detail how to express the coordinate components.

3.4  RESOURCE element

A VOTable document contains one or more RESOURCE elements, each of these providing a description and the data values of some logically independent data structure.

Each RESOURCE may include the descriptive element DESCRIPTION, followed by a mixture of INFO, GROUP and PARAM elements; it may also contain LINK elements to provide URL-type pointers that give further information.

The main component of a RESOURCE is typically one or more TABLE elements - in other terms a RESOURCE is basically a set of related tables. The RESOURCE is recursive (it can contain other RESOURCE elements), which means that the set of tables making up a RESOURCE may become a tree structure.
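
Since RESOURCE elements nest, an application typically walks them recursively. The sketch below lists every TABLE together with the chain of enclosing RESOURCE names; the function name walk_tables and the resource and table names in the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of VOTable 1.2 documents, in ElementTree's {uri}tag notation.
NS = "{http://www.ivoa.net/xml/VOTable/v1.2}"

def walk_tables(resource, path=()):
    """Yield (resource name path, table name) for every TABLE below resource."""
    here = path + (resource.get("name"),)
    for table in resource.findall(NS + "TABLE"):
        yield here, table.get("name")
    for child in resource.findall(NS + "RESOURCE"):
        yield from walk_tables(child, here)  # descend into nested resources

# Hypothetical document with one nested RESOURCE.
DOC = """<VOTABLE version="1.2" xmlns="http://www.ivoa.net/xml/VOTable/v1.2">
<RESOURCE name="survey">
  <TABLE name="sources"/>
  <RESOURCE name="calibration">
    <TABLE name="standards"/>
    <TABLE name="filters"/>
  </RESOURCE>
</RESOURCE>
</VOTABLE>"""

root = ET.fromstring(DOC)
tables = [
    entry
    for resource in root.findall(NS + "RESOURCE")
    for entry in walk_tables(resource)
]
```

The resulting paths make the tree structure explicit: the tables of the nested resource are reported under ("survey", "calibration").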

A RESOURCE may have one or both of the name or ID attributes (see section 3.2); it may also be qualified by type="meta", meaning that the resource is descriptive only, i.e. does not contain any actual data: no DATA element should exist in any of its sub-elements. A RESOURCE without this attribute may however have no DATA sub-element. Finally, the RESOURCE element may have a utype attribute to link the element to some external data model (introduced in version 1.1, see section 4.6).

3.5  LINK element

The role of the LINK element is to provide pointers to other documents or data servers on the Internet through a URI. In VOTable, the LINK element may be part of a RESOURCE, TABLE, GROUP, FIELD or PARAM elements. The href attribute of the LINK element can comprise any arbitrary protocol, for example "http://server/file" or "bizarre://server/file". VOTable parsers are not required to understand arbitrary protocols, but are required to understand the following three common protocols: "file:", "http:" and "ftp:". A GLU reference [5] is an additional high-level protocol introduced by a "glu:" value of the href attribute: this way of referencing a GLU is preferred to the gref attribute defined in the original version of VOTable. The gref attribute is deprecated since version 1.1.
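
A parser can separate the protocols it must understand from the others by inspecting the scheme of the href value. The sketch below uses Python's standard urllib.parse module; the function names are hypothetical, and how an application reacts to an unknown protocol is left open here, as it is in the text above.

```python
from urllib.parse import urlsplit

# The three protocols every VOTable parser is required to understand.
REQUIRED_SCHEMES = {"file", "http", "ftp"}

def link_scheme(href):
    """Return the protocol part of an href, in lowercase."""
    return urlsplit(href).scheme.lower()

def must_be_understood(href):
    """True if the href uses one of the three required protocols."""
    return link_scheme(href) in REQUIRED_SCHEMES
```

For instance, "http://server/file" uses a required protocol, whereas "bizarre://server/file" is legal in a LINK but need not be understood by every parser.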

In the Astrores format, from which VOTable is derived, there are additional semantics for the LINK element; the href attribute is used as a template for creating URLs. This behavior is explained in Appendix A.1, and it represents a possible extension of VOTable.

In addition to the referencing href attribute and to the naming name and ID attributes (see section 3.2), the LINK element may announce the mime type of the data it references with a content-type attribute (e.g. content-type="image/fits"), and specify the role of the link by a content-role attribute (e.g. content-role="doc" for access to documentation).

3.6  TABLE element

The TABLE element represents the basic data structure in VOTable; it is made of a description of the table structure (the metadata) essentially in the form of PARAM and FIELD elements (detailed in the next section), followed by the values of the described fields in a DATA element (detailed in the section below).

The TABLE element is always contained in a RESOURCE element: in other terms any TABLE element has a single parent made of the RESOURCE element in which the table is embedded.

The TABLE element contains a DESCRIPTION element for descriptive remarks, followed by a mixed collection of PARAM, FIELD or GROUP elements which describe a parameter (constant column), a field (column) or a group of columns respectively. PARAM and FIELD elements are detailed in the next section, and the GROUP element is presented in the following section.

Furthermore the TABLE element may contain LINK elements that provide URL-type pointers, exactly like the LINK elements existing within a RESOURCE element (see section 3.5).

The last element included in a TABLE is the optional DATA element (see below): a table without any actual data is quite valid, and is typically used to supply a complete description of an existing resource e.g. for query purposes.

The TABLE element may have the naming attributes name and/or ID (see section 3.2). A TABLE may also have a ref attribute referencing the ID of another table previously described, which is interpreted as defining a table having a structure identical to the one referenced: this facility avoids a repetition of the definition of tables which may be present many times in a VOTable document. It is recommended however that the ref attribute references an empty table (i.e. a table without a DATA part), which avoids any ambiguity about the referencing.

Finally, the TABLE element may have a utype and ucd attribute to specify the table semantics, similarly to the FIELD and PARAM elements (see section 4.1).

4  FIELDs and PARAMeters

The atoms of the table structure are represented by FIELD and PARAM elements, where FIELD represents the description of an actual table column, while PARAM supplies a value attached to the table, like the Telescope in the example of section 3.1. A PARAM may be viewed as a FIELD which keeps a constant value over all the rows of a table, and the only difference in the set of attributes of the two elements is the existence of a value attribute in a PARAM which does not exist in a FIELD.

The FIELD elements describe the actual columns of the table; the order in which the FIELDs are declared is important, as this order must be the same one as the order of the columns in the data part.

A FIELD or PARAM element may have several sub-elements, including the informational DESCRIPTION and LINK elements (the possibility of several descriptions and titles was proposed, see the appendix on additional descriptions); it may also include a VALUES element that can express limits and ranges of the values that the corresponding cell can contain, such as minimum (MIN), maximum (MAX), or enumeration of possible values (OPTION).

4.1  Summary of attributes

The valid attributes of a FIELD or PARAM are:

In addition, in the PARAM element only:

4.2  Numerical Accuracy

The VOTable format is meant for transferring, storing, and processing tabular data, and is not intended for presentation purposes: therefore (in contrast to Astrores) we generally avoid giving rules on presentation, such as formatting. Inevitably however at least some of the data will have to be presented, either as actual tables or in forms, graphs, etc. Two attributes were retained for this purpose: width, the number of characters to be used for the value, and precision, the number of decimal places (see the TDISP correspondence in section 2.3).

The existence and presentation of the special null value of a field (when the actual value of the field is unknown) is another aspect of the numerical accuracy, which is part of the VALUES sub-element (see below).

4.3  Extended datatype xtype

The xtype attribute was added to extend the basic datatype primitives listed in the table of section 2.1. Those primitives represent the storage units that are valid in any of the VOTable serialisations, and therefore correspond exactly to the FITS definitions.

In order to avoid possible name collisions in the contents of the xtype attribute, it is recommended to use a namespace prefix: an example could be xtype="obj:position" where obj represents the namespace (which should be specified by an xmlns definition) and position a name specific to the obj context.

There are however 2 values of xtype which have a particular importance in the context of the IVOA:

4.4  Units

The quantities in a column of the table may be expressed in some physical unit, which is specified by the unit attribute of the FIELD. The syntax of the unit string is defined in reference [3]; it is basically written as a string without blanks or spaces, where the symbols . or * indicate a multiplication, / stands for the division, and no special symbol is required for a power. Examples are unit="m2" for m², unit="cm-2.s-1.keV-1" for cm⁻² s⁻¹ keV⁻¹, or unit="erg/s" for erg s⁻¹. Reference [3] also provides the list of valid symbols, which is essentially restricted to the Système International (SI) conventions, plus a few astronomical extensions concerning units used for time, angular, distance and energy measurements.
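
The flavour of this syntax can be conveyed by a small decoder. The sketch below handles only the constructs named above (products, a division, and integer powers written without a symbol); it assumes that / applies to the single factor that follows it, ignores everything else defined in reference [3], and does not check symbols against the list of valid units. The function name parse_unit is hypothetical.

```python
import re

# One factor: a unit symbol followed by an optional signed integer power.
_FACTOR = re.compile(r"([A-Za-z]+)([+-]?\d+)?")

def parse_unit(unit):
    """Decode a simple unit string into a list of (symbol, power) pairs."""
    factors = []
    sign = 1
    # Keep the operators in the split result so they can be interpreted.
    for token in re.split(r"([./*])", unit):
        if token == "/":
            sign = -1          # assumption: divides the next factor only
        elif token in (".", "*"):
            sign = 1
        else:
            match = _FACTOR.fullmatch(token)
            if match is None:
                raise ValueError("cannot parse unit factor: %r" % token)
            power = int(match.group(2) or 1)
            factors.append((match.group(1), sign * power))
            sign = 1
    return factors
```

Under these assumptions, "cm-2.s-1.keV-1" decodes to cm with power -2, s with power -1 and keV with power -1, and "erg/s" decodes to erg with power 1 and s with power -1.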

4.5  Unified Content Descriptors

The Unified Content Descriptors (UCD) can be viewed as a hierarchical glossary of the scientific meanings of the data contained in the astronomical tables. The initial version was created at CDS, but the UCD definition is currently evolving [4].

A few typical examples taken from the original UCD design:

"phot.mag;em.opt.B" Blue magnitude
"src.orbital.eccentricity" Orbital eccentricity
"time.period;stat.median" Median Value of the Period
"instr.det.qe" Detector's Quantum Efficiency

4.6  The utype attribute

In many contexts, it is important to specify that FIELDs or PARAMeters do convey the values defined in an external data model. For instance, it can be fundamental for an application to be aware that a given FIELD expresses the surface brightness measured with a specific filter and within a 12x6arcsec elliptical aperture. None of the other name, ID or ucd attributes can fill this role, and the utype (usage-specific or unique type) attribute has been introduced in VOTable 1.1 to fill this gap. By extension, most elements may refer to some external data model, and the utype attribute is legal also in RESOURCE, TABLE and GROUP elements.

In order to avoid name collisions, the data model identification should be introduced following the XML namespace conventions, as utype="datamodel_identifier:role_identifier". The mapping of "datamodel_identifier" to an xml-type attribute is recommended, by means of the xmlns convention which specifies the URI of the data model quoted, as done in the example of section 3.1.

The utype attribute is especially useful to specify the spatial and temporal coordinates present in the table when it contains astronomical events: these parameters are essential to most applications which process multi-wavelength data. Within the IVOA, the spatial and temporal frames are described in the STC data model (see Rots [9]), and it is expected that this STC-referencing replaces the usage of the COOSYS defined in the version 1.0 of VOTable.

The example given above (see section 3.1) illustrates the recommended way of linking a VOTable document to the STC model. Other examples and details are presented in the dedicated note "Referencing STC in VOTable" [8].

4.7  VALUES element

The VALUES element of the FIELD is designed to hold subsidiary information about the domain of the data. For instance, in the example (section 3.1) we could rewrite the RA field definition as:

      <FIELD name="RA" ID="col1" ucd="pos.eq.ra;meta.main" ref="J2000" 
             utype="stc:AstroCoords.Position2D.Value2.C1"
             datatype="float" width="6" precision="2" unit="deg">
        <VALUES ID="RAdomain">
          <MIN value="0"/>
          <MAX value="360" inclusive="no"/>
        </VALUES>
      </FIELD>
The scope of the domain described by the VALUES element (and by its MIN, MAX and OPTION sub-elements) can be qualified by type="actual", if it is valid only for the data enclosed in the parent TABLE; the default type="legal" qualification specifies the generic domain of valid values, as in the RAdomain in the example above where the interval [0,360[ is specified.

The VALUES element may contain MIN and MAX elements, and it may contain OPTION elements; the latter may themselves contain more OPTION elements, so that a hierarchy of keyword-value pairs can be associated with each field. Note that only a single MIN / MAX pair is possible, whereas many OPTION elements may be used to qualify the domain described by the VALUES element. The domain may therefore be defined as a single interval, or as a set of individual values. Although the schema does not forbid using all three MIN, MAX and OPTION sub-elements simultaneously, such usage is considered bad practice and is discouraged.

All three MIN, MAX and OPTION sub-elements store their value corresponding to the minimum, maximum, or "special value" in a value attribute. MIN and MAX elements can have an inclusive attribute to specify whether the value quoted belongs or not to the domain, and the OPTION element can have a name attribute to qualify the "special" quoted value.

The VALUES element may also have a null attribute to define a non-standard value that is used to specify "non-existent data", for example null="-32768". When this value is found in the corresponding data, it is assumed that no data exists for that table cell; the parser may choose to use this also when unparsable data is found, and the null value will be substituted instead. Section 6 indicates the default null values for each of the primitive data types when the TABLEDATA data representation is being used. Some of the primitive data types have one or more default null values defined (for the "char", "float" and "double" types, an empty cell may be used). Other types ("boolean", "unsignedByte", "short", and "int") have no default null value defined, and thus, when they are needed, they must be defined explicitly via the VALUES element.

For the FITS and BINARY data representations, the NaN (not-a-number) patterns are recommended to represent floating-point null values. The null convention is therefore only necessary for primitive types that do not have a natural null value: long, int, short, and byte datatypes.

Finally the ref attribute of a VALUES element can be used to avoid a repetition of the domain definition, by referring to a previously defined VALUES element having the referenced ID attribute. When specified, the ref attribute defines completely the domain without any other element or attribute, as e.g. <VALUES ref="RAdomain"/>
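The interval semantics of MIN, MAX and inclusive can be sketched as follows in Python, using a reduced copy of the RA field shown above. The helper name in_domain is hypothetical, and the sketch assumes that inclusive is taken as "yes" when the attribute is absent; OPTION elements, the null attribute and ref resolution are not covered.

```python
import xml.etree.ElementTree as ET

# Reduced copy of the RA FIELD of this section (some attributes omitted).
FIELD_XML = """
<FIELD name="RA" ID="col1" datatype="float" unit="deg">
  <VALUES ID="RAdomain">
    <MIN value="0"/>
    <MAX value="360" inclusive="no"/>
  </VALUES>
</FIELD>
"""

def in_domain(values, x):
    """Return True if x lies within the MIN/MAX interval of a VALUES element."""
    lo, hi = values.find("MIN"), values.find("MAX")
    if lo is not None:
        bound = float(lo.get("value"))
        # Assumption: 'inclusive' is treated as "yes" when not given.
        if x < bound or (x == bound and lo.get("inclusive", "yes") == "no"):
            return False
    if hi is not None:
        bound = float(hi.get("value"))
        if x > bound or (x == bound and hi.get("inclusive", "yes") == "no"):
            return False
    return True

ra_domain = ET.fromstring(FIELD_XML).find("VALUES")
```

For the RAdomain above, the value 0 is accepted and the value 360 is rejected, which corresponds to the interval [0,360[.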

4.8  INFO element

The INFO element is a PARAM element restricted to strings (i.e. datatype="char" and arraysize="*" are implied). It must also have a name attribute, and may have the other attributes allowed in a PARAM: ID, ref, unit, ucd and utype. Like the PARAM element, INFO may include the sub-elements DESCRIPTION, VALUES and LINK.

INFO is meant to convey informative details about the generation of the VOTable document. It may be present at the beginning or end of a VOTABLE or RESOURCE element, or at the end of a TABLE. Typical usages of INFO include error reports, or explanations about choices made by the data processing system which generates the VOTable document.

4.9  GROUPing FIELDs and PARAMeters

The GROUP element was introduced in VOTable 1.1, to group together a set of FIELDs which are logically connected, like a value and its error. However, in order to avoid any confusion with the first version of VOTable which did not know the GROUP, all FIELDs are always defined outside any group, and the GROUP designates its member fields via FIELDref elements. A simple example of a group made of the velocity and its error, based on the example of section 3.1, can be the following:

    <GROUP name="Velocity">
      <DESCRIPTION>Velocity and its error</DESCRIPTION>
      <FIELDref ref="col4"/>
      <FIELDref ref="col5"/>
    </GROUP>

The GROUP element can have the name, ID, ucd, utype and ref attributes. It can include a DESCRIPTION, and any mixture of FIELDreferences, PARAMeters, PARAMreferences and other GROUPs. PARAMref is a logical definition of a parameter by reference to a PARAM element defined elsewhere in the parent TABLE or RESOURCE, in the same way as FIELDref refers to a FIELD element defined elsewhere in the parent TABLE. The recursivity of the GROUP element enables the definition of arbitrarily complex structures.

The possibility of adding PARAMeters in groups introduces also a possibility of associating parameter(s) to describe accurately the context of the data stored in the table: for instance, it is possible to associate the actual frequency of a radio survey with the following declaration:

    <FIELD name="Flux" ID="col4" ucd="phot.flux;em.radio.200-400MHz" 
           datatype="float" width="6" precision="1" unit="mJy"/>
    <FIELD name="e_Flux" ID="col5" datatype="float" width="4" precision="1"
           ucd="stat.error;phot.flux;em.radio.200-400MHz" unit="mJy"/>
    <GROUP name="Flux" ucd="phot.flux;em.radio.200-400MHz">
      <DESCRIPTION>Flux measured at 352MHz</DESCRIPTION>
      <PARAM name="Freq" ucd="em.freq" unit="MHz" datatype="float" 
             value="352"/>
      <FIELDref ref="col4"/>
      <FIELDref ref="col5"/>
    </GROUP>

Similarly, the GROUP can be used to associate several parameters to one or several FIELDs: a filter may for instance be characterized by the central wavelength and the FWHM of its transmission curve; or several parameters of an instrument setup may be detailed.
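A reader of such a document has to resolve each FIELDref back to the FIELD carrying the matching ID. The Python sketch below does this for a reduced copy of the radio-survey declaration above; the helper name group_members is hypothetical, and nested GROUPs and PARAMref elements are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Reduced copy of the Flux example (ucd, width and precision omitted).
TABLE_XML = """
<TABLE>
  <FIELD name="Flux" ID="col4" datatype="float" unit="mJy"/>
  <FIELD name="e_Flux" ID="col5" datatype="float" unit="mJy"/>
  <GROUP name="Flux">
    <PARAM name="Freq" ucd="em.freq" unit="MHz" datatype="float" value="352"/>
    <FIELDref ref="col4"/>
    <FIELDref ref="col5"/>
  </GROUP>
</TABLE>
"""

def group_members(table, group):
    """Return the names of the FIELDs referenced by a GROUP and its PARAM values."""
    # Index the FIELDs of the parent TABLE by their ID attribute.
    fields_by_id = {f.get("ID"): f for f in table.iter("FIELD")}
    names = [fields_by_id[r.get("ref")].get("name")
             for r in group.findall("FIELDref")]
    params = {p.get("name"): p.get("value") for p in group.findall("PARAM")}
    return names, params

table = ET.fromstring(TABLE_XML)
names, params = group_members(table, table.find("GROUP"))
```

Here names lists the two member columns (Flux and e_Flux), and params associates the frequency 352 with them.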

4.10  The relational context

With a simple naming convention, the GROUP element may also specify some properties of the tables included in a VOTable document when a TABLE is viewed as a relation (part of a relational database):

Similar conventions could well be added for the existence of indexes, unique values, etc...

5  Data Content

While the bulk of the metadata of a VOTable document is in the FIELD elements, the data content of the table is in a single DATA element. The data is organized in "reading" order, so that the content of each row appears in the same order as the order of the FIELD definitions.

Each DATA part of the VOTable document can be viewed as a stream coming out of a pipeline. The abstract table is first serialized by one of several methods, then it may be encoded for compression or other reasons. The result may be embedded in the XML file (local data), or it may be remote data.

The figure shows how the abstract table is rendered into the VOTable document. First the data is serialized, either as XML, a FITS binary table, or the VOTable Binary format. This data stream may then be encoded, perhaps for compression or to convert binary to text. Finally, the data stream may be put in a remote file with a URL-type pointer in the VOTable document; or the table data may be embedded in the VOTable.

The serialization elements and their attributes are described in the next sections.

5.1  TABLEDATA Serialization

The TABLEDATA element is a way to build the table in pure XML, and has the advantage that XML tools can manipulate and present the table data directly. The TABLEDATA element contains TR elements, which in turn contain TD elements, i.e. the same conventions as the familiar HTML ones. The number of TD elements in each TR element must be equal to the number of FIELD elements declaring the table. An example is contained in section 3.1, enclosed between the <TABLEDATA> and </TABLEDATA> delimiters.

Each item in the TD tag contains a value which must be compatible with the datatype attribute of the corresponding FIELD definition. If the value is the same as the null value for that field, then the item is assumed to contain no data. Valid representations of values in a cell, depending on their datatype, are detailed in the complete description of datatypes.

If a cell contains an array of numbers or a complex number, it should be encoded as multiple numbers separated by whitespace. However in the case of character and Unicode strings (declared in the corresponding FIELD as an array of char or unicodeChar datatype), no separator should exist. Here is an example of a table with two rows that has arrays in the table cells:

<TABLE>
  <FIELD ID="aString" datatype="char" arraysize="10"/>
  <FIELD ID="Floats" datatype="float" arraysize="3"/>
  <FIELD ID="varComplex" datatype="floatComplex" arraysize="*"/>
  <DATA><TABLEDATA>
  <TR>
   <TD>Apple</TD><TD>1.62 4.56 3.44</TD>
   <TD>67 1.57  4 3.14  77 -1.57</TD>
  </TR><TR>
   <TD>Orange</TD><TD>2.33 4.66 9.53</TD>
   <TD>39 0  46 3.14</TD>
  </TR>
  </TABLEDATA></DATA>
</TABLE>
The first entry is a fixed-length array of 10 characters; since the value being presented (Apple) has 5 characters, this is padded with trailing blanks. The second cell is an array of three floats. The last cell contains a variable array of complex numbers, each complex number being represented by its real part followed by at least a blank and its imaginary part - hence 6 numbers for 3 complex numbers, or 4 numbers for 2 complex numbers.

Special attention should be paid to the significance of white space in a table cell (the term white space designates the characters space [x20], tab [x09], newline [x0a] and carriage-return [x0d]): while for numeric datatypes the amount of white space does not matter (the elements of an array of numbers may for instance be written on several lines), white space is significant for the "char" and "unicodeChar" datatypes; for instance, <TD>Apple</TD> and <TD> Apple</TD> are not identical.
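The rules of this section for the content of a TD element can be sketched in Python as follows. The function name parse_cell and its is_array flag are hypothetical, only a few datatypes are handled, and null or empty cells are not covered.

```python
def parse_cell(text, datatype, is_array=False):
    """Convert the text of a TD element according to the FIELD datatype."""
    if datatype in ("char", "unicodeChar"):
        # White space is significant for strings: return the text unchanged.
        return text
    # For numbers, any run of white space (including newlines) separates items.
    tokens = text.split()
    if datatype in ("floatComplex", "doubleComplex"):
        numbers = [float(t) for t in tokens]
        if len(numbers) % 2:
            raise ValueError("complex values need real/imaginary pairs")
        # Each complex number is its real part followed by its imaginary part.
        values = [complex(real, imag)
                  for real, imag in zip(numbers[0::2], numbers[1::2])]
    elif datatype in ("float", "double"):
        values = [float(t) for t in tokens]
    else:
        raise ValueError("datatype not handled in this sketch: " + datatype)
    return values if is_array else values[0]
```

Applied to the last cell of the second row of the example, the four numbers 39 0 46 3.14 give two complex values, whereas the string cells are returned exactly as written.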

5.2  FITS Serialization

The FITS format for binary tables [2] is in widespread use in astronomy, and its structure has a major influence on the VOTable specification. Metadata is stored in a header section, followed by the data. The metadata is substantially equivalent to the metadata of the VOTable format. One important difference is that VOTable does not require specification of the number of rows in the table, an important freedom if the table is being created dynamically from a stream.

The VOTable specification does not define the behavior of parsers with respect to this doubling of the metadata. A parser may ignore the FITS metadata, or it may compare it with the VOTable metadata for consistency, or other possibilities.

The following code shows a fragment that might have been created by a FITS-to-VOTable converter. Each FITS keyword has been converted to a PARAM, and the data itself is remotely stored and gzipped at an ftp site:

<RESOURCE>
<PARAM name="EPOCH" datatype="float" value="1999.987">
<DESCRIPTION>Original Epoch of the coordinates</DESCRIPTION>
</PARAM>
<PARAM name="TELESCOP" datatype="char" arraysize="*" value="VTel" />
<INFO name="HISTORY">
<DESCRIPTION>The very first Virtual Telescope observation made in 2002</DESCRIPTION>
</INFO>
<TABLE>
<FIELD  (insert field metadata here) />
<DATA><FITS extnum="2">
<STREAM encoding="gzip" href="ftp://archive.cacr.caltech.edu/myfile.fit.gz"/>
</FITS></DATA>
</TABLE>
</RESOURCE>

The FITS file may contain many data objects (known as extensions, numbered from 1 up, the main header being numbered 0), and the extnum attribute allows the VOTable to point to one of these.

5.3  BINARY Serialization

The binary format is intended to be easy to read by parsers, so that additional libraries are not required. It is just a sequence of bytes, the length of each sequence corresponding to the datatype and arraysize attributes of the FIELD elements in the metadata. The binary format consists of a sequence of records, with no header bytes, no alignment considerations, no block sizes. The order of the bytes in multi-byte primitives (e.g. integers, floating-point numbers) is Most Significant Byte first, i.e. it follows the FITS convention.

Table cells may contain arrays of primitive types, each of which may be of fixed or variable length. In the former case, the number of bytes is the same for each instance of the item, as specified by the arraysize attribute of the FIELD. If all the fields have a fixed arraysize, then each record of the binary format has the same length (the sum of arraysize times the length in bytes of the corresponding datatype).

Variable-length arrays of primitives are preceded by a 4-byte integer containing the number of items of the array. The parser can then compute the number of bytes taken by the variable-length array by multiplying the size of the primitive by the number of items. The way the stream of bytes is arranged for the data of the example in section 5 is illustrated in Figure 2.
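The record layout can be sketched in Python with the standard struct module, here for the three fields of the TABLEDATA example (char[10], float[3] and a variable-length floatComplex array). The function name read_record is hypothetical, the item count is read as a signed integer by assumption, and the sample values differ from those of the example so that they are exactly representable as 4-byte floats.

```python
import struct

def read_record(buf, offset=0):
    """Decode one record laid out as char[10], float[3], floatComplex[*]."""
    # Fixed-length part: 10 characters, then three 4-byte floats.
    # '>' selects Most Significant Byte first, without alignment padding.
    name, f0, f1, f2 = struct.unpack_from(">10s3f", buf, offset)
    offset += struct.calcsize(">10s3f")
    # Variable-length part: a 4-byte item count, then that many items.
    (count,) = struct.unpack_from(">i", buf, offset)
    offset += 4
    parts = struct.unpack_from(">%df" % (2 * count), buf, offset)
    offset += 8 * count  # one floatComplex item is two 4-byte floats
    values = [complex(parts[i], parts[i + 1]) for i in range(0, len(parts), 2)]
    return (name.decode("ascii"), [f0, f1, f2], values), offset

# Build a sample record with the same layout, then decode it again.
sample = struct.pack(">10s3fi4f", b"Orange".ljust(10),
                     2.5, 4.5, 9.5, 2, 39.0, 0.0, 46.0, 3.0)
record, next_offset = read_record(sample)
```

The returned offset points to the start of the next record, which is how a reader steps through a stream whose records do not all have the same length.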

 

5.4  Data Encoding

As a result of the serialization, the table has been converted to a byte stream, either text or binary. If the TABLEDATA serialization is used, then the table is represented as XML tags directly embedded in the document, and conventional tools can be used to encode the entire XML document. However, VOTable also provides limited encoding of its own. A VOTable document may point to a remote data resource that is compressed; rather than decompressing before sending on the wire, it can be dynamically decoded by the VOTable reader. We might also use the encoding facilities to convert a binary file to text (through base64 encoding), so that binary data can be used in the XML document.

In this version (1.2) of VOTable, it is not possible to encode individual columns of the table: the whole table must be encoded in the same way. The possibility of encoding selected table cells is however being examined for future versions of VOTable (see appendix below).

In order to use an encoding of the data, it must be enclosed in a STREAM element, whose attributes define the nature of the encoding. The encoding attribute is a string that should indicate to the parser how to undo the encoding that has been applied. Parsers should understand and interpret at least the following values:

The default value of the encoding attribute is the null string, meaning that no encoding has been applied. In future releases, we might allow more complex strings in the encoding attribute, allowing combinations of encoding filters and a way for the parser to find the software needed for the decoding.

5.5  Remote Data

If the encoding of the data produces text, or if the serialization is naturally text-based, then it can be directly embedded into the XML document, as for instance:

<DATA><BINARY>
<STREAM encoding="base64">
AAAAAj/yVZiDGSSUwFZ6ypR4yGkADwAcQV0euAAIAAJBmMzNwZWZmkGle4tBR3jVQT9ocwAA
························
</STREAM>
</BINARY></DATA>
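On the reading side, undoing the encoding amounts to a dispatch on the value of the encoding attribute. The Python sketch below covers the two values used in the examples of this document, "base64" and "gzip", plus the empty default; the function name decode_stream is hypothetical and the payload is arbitrary test data, not the content of the STREAM shown above.

```python
import base64
import gzip

def decode_stream(data, encoding=""):
    """Undo the encoding named by the encoding attribute of a STREAM."""
    if encoding == "":
        return data  # default: no encoding has been applied
    if encoding == "base64":
        # Remove the line breaks and blanks of the XML text before decoding.
        return base64.b64decode(b"".join(data.split()))
    if encoding == "gzip":
        return gzip.decompress(data)
    raise ValueError("unknown encoding: %r" % encoding)

# Round trip on arbitrary bytes standing in for a serialized table.
payload = bytes(range(16))
embedded = base64.encodebytes(payload)  # base64 text, with line breaks
restored = decode_stream(embedded, "base64")
```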

However, if the data is very large, it may be preferable to keep the data separate from the metadata. The href attribute of the STREAM element, if present, provides the location of the data in a URL-type syntax, for example:

<STREAM href="ftp://server.com/mydata.dat"/>

<STREAM href="ftp://server.com/mydata.dat" expires="2004-02-29T23:59:59"/>

<STREAM href="httpg://server.com/mydata.dat" actuate="onLoad"/>

<STREAM href="file:///usr/home/me/mydata.dat"/>

The examples use the well-known anonymous ftp and http protocols; "httpg" is an example of Grid-based access to data, and "file" is a reference to a local file. VOTable parsers are not required to understand arbitrary protocols, but are required to understand the three common protocols "file:", "http:" and "ftp:".

There are further attributes of the STREAM element that may be useful. The expires attribute indicates the expiration time of the data: this is useful when data are dynamically created and stored on some staging disk where files only persist for a specified lifetime and are then automatically deleted. The expires attribute expresses when a remote resource ceases to be valid, and is expressed in Universal Time in the same way as the FITS specification [2], itself conforming to the ISO 8601 standard.

The rights attribute expresses authentication information that may be necessary to access the remote resource. If the VOTable document is suitably encrypted, this attribute could be used to store a password.

The actuate attribute is borrowed from the XML Xlink specification, expressing when the remote link should be actuated. The default is "onRequest", meaning that the data is only fetched when explicitly requested (like a link on an HTML page), and the "onLoad" value means that data should be fetched as soon as possible (like an embedded image on an HTML page).

6  Definitions of Primitive Datatypes

This section describes the primitives summarized in the table of primitives and their representations in the BINARY and in the TABLEDATA serializations (see section 5). In the following, the term "hexadigit" designates the ASCII digits "0" to "9", or the ASCII lower- or upper-case letters "a" to "f" (i.e. a digit in a hexadecimal representation of a number).

7  A simplified view of the VOTable 1.2 Schema

The XML Schema [7] defining a VOTable 1.2 document is available from http://www.ivoa.net/xml/VOTable/v1.2. In this section we illustrate this XML Schema by a set of boxes describing the structure of a VOTable, and the list of attributes of each VOTable element.

7.1  Element Hierarchy

The hierarchy of the elements existing in VOTable-1.2 is illustrated below; it uses the following conventions:

<VOTABLE>
  <DESCRIPTION>
  <COOSYS>···
  <INFO>···
  <PARAM>···
  <GROUP>···
  <RESOURCE>···
  <INFO>···
</VOTABLE>
<RESOURCE>
  <DESCRIPTION>
  <INFO>···
  <COOSYS>···
  <GROUP>···
  <PARAM>···
  <LINK>···
  <TABLE>···
  <RESOURCE>···
  <INFO>···
</RESOURCE>
<TABLE>
  <DESCRIPTION>
  <FIELD>···
  <PARAM>···
  <GROUP>···
  <LINK>···
  <DATA>
  <INFO>···
</TABLE>
<FIELD>
  <DESCRIPTION>
  <VALUES>
  <LINK>···
</FIELD>
 
<PARAM>
  <DESCRIPTION>
  <VALUES>
  <LINK>···
</PARAM>
<DATA>
  <TABLEDATA>
    <TR>···
      <TD>···
  <BINARY>
    <STREAM>
  <FITS>
    <STREAM>
  <INFO>···
</DATA>
<GROUP>
  <DESCRIPTION>
  <FIELDref>···(t)
  <PARAM>···
  <PARAMref>···
  <GROUP>···
</GROUP>
(t) only within <TABLE>
<INFO>
  <DESCRIPTION>
  <VALUES>
  <LINK>···
</INFO>
<VALUES>
  <MIN>
  <MAX>
  <OPTION>···
    <OPTION>···
</VALUES>
 
 

7.2  Attribute summary

The list of the attributes is summarized in the table below, with the following conventions:

Element    Attributes
VOTABLE    ID, version
RESOURCE   ID, name, type, utype
TABLE      ID, name, ucd, utype, ref, nrows
INFO       ID, name, value, xtype, ref, unit, ucd, utype
STREAM     type, href, actuate, encoding, expires, rights
FITS       extnum
TR         ID
TD         encoding
GROUP      ID, name, ref, ucd, utype
PARAM      ID, unit, datatype, precision, width, xtype, ref, name, ucd, utype, arraysize, value
FIELD      ID, unit, datatype, precision, width, xtype, ref, name, ucd, utype, arraysize, type
FIELDref   ref, ucd, utype
PARAMref   ref, ucd, utype
MIN        value, inclusive
MAX        value, inclusive
OPTION     name, value
VALUES     ID, type, null, ref
LINK       ID, content-role, content-type, title, value, href, action

7.3  Mime type

Finally, a VOTable document should be introduced by a mime type (Multipurpose Internet Mail Extensions, defined in RFC 2046): associating a mime type with a document enables the data consumer (an application or a web browser) to launch the desired application (e.g. a visualisation tool).

In the HTTP protocol, the mime type is the value specified by the Content-Type: line. The recommended mime-type describing a VOTable document is application/x-votable+xml: the x- prefix indicates an experimental type, and is required for non-registered media types; and the +xml suffix (defined by RFC 3023 section 7) indicates that the type describes a specialization of XML.

However the text/xml mime type is acceptable for services delivering data which are expected to be visualized by humans in a browser; this mime type would preferably be associated with an XSL style sheet, for a presentation of well-formatted tables. It is expected that a few typical XSL style sheets will be accessible from the IVOA site.

8  Differences between versions 1.1 and 1.2

The differences between version 1.2 of VOTable and the preceding version 1.1 are:

  • the COOSYS is deprecated, in favor of a reference to the Space-Time Coordinate (STC) data model (see utype and the IVOA note Referencing STC in VOTable[8])
  • the xtype attribute was added (see section 4.3)
  • the INFO element is made similar to the PARAM element, but with datatype="char" and arraysize="*" (i.e. it is a string); it
    • may have the attributes utype, ucd, ref, unit
    • accepts the sub-elements DESCRIPTION, VALUES and LINK
  • the INFO element may occur before the closing tags /DATA, /TABLE, /RESOURCE and /VOTABLE (enabling post-operational diagnostics)
  • the FIELDref and PARAMref elements may have utype and ucd attributes
  • naming conventions of GROUP elements which specify some properties of a relational schema (see section 4.10)
  • clarification of the recommended and acceptable mime types (section 7.3)
  • clarification of arrays in cells (section 2.2)
  • detailed and clarified the conventions and recommendations concerning name, ID and ref attributes
  • appendix A7 was a proposition for additional utype attributes in groups and tables; it is now included in VOTable1.2;
  • appendix A7 contains a new proposal (May/June 2009) for multiple descriptions and titles.

9  References

[1] Accomazzi et al., Describing Astronomical Catalogues and Query Results with XML
http://cds.u-strasbg.fr/doc/astrores.htx

[2] FITS: Flexible Image Transport Specification, specifically the Binary Tables Extension
http://fits.gsfc.nasa.gov/

[3] Standards for Astronomical Catalogues: Units, CDS Strasbourg
http://cdsarc.u-strasbg.fr/doc/catstd-3.2.htx
See also Section 4 in Greisen and Calabretta 2002, A&A 395, 1061; and the IAU Recommendations concerning Units from the IAU Style Manual by G.A. Wilkins (1989) available at http://www.iau.org/science/publications/proceedings_rules/units/

[4] Unified Content Descriptors
http://cds.u-strasbg.fr/doc/UCD.htx (UCD1)
http://www.ivoa.net/twiki/bin/view/IVOA/IvoaUCD

[5] GLU: Générateur de Liens Uniformes, CDS Strasbourg
http://simbad.u-strasbg.fr/glu/glu.htx

[6] ASU: Astronomical Server URL, CDS Strasbourg
http://cds.u-strasbg.fr/doc/asu.html

[7] XML Schema: W3C Document
http://www.w3.org/XML/Schema

[8] Referencing STC in VOTable
http://ivoa.net/Documents/latest/VOTableSTC.html

[9] Arnold Rots, Space-Time Coordinate Metadata for the Virtual Observatory (v1.30)
http://ivoa.net/Documents/latest/STC.html

[10] Arnold Rots, STC-S: Space-Time Coordinate (STC) Metadata Linear String Implementation
http://www.ivoa.net/Documents/latest/STC-S.html

[11] Registry of FITS conventions
http://fits.gsfc.nasa.gov/fits_registry.html


   Appendices

A  Possible VOTable extensions

The definitions enclosed in this appendix are not part of VOTable 1.2, but are considered as candidates for VOTable improvements.

A.1  VOTable LINK substitutions

The LINK element in Astrores [1] contains a mechanism for string substitution, which is a powerful way of defining a link to external data which adapts to each record contained in the table DATA.

When a LINK element appears within a RESOURCE or a TABLE element, extra functionality is implied: the href attribute may not be a simple link, but instead a template for a link. If, in the example of myFavouriteGalaxies, we add the link

  <LINK href="http://ivoa.net/lookup?Galaxy=${Name}&amp;RA=${RA}&amp;DE=${DE}"/>

a substitution filter is applied in the context of a particular row. For the first row of the table, the substitution would result in the URL

   http://ivoa.net/lookup?Galaxy=N%20224&RA=010.68&DE=%2b41.27

Whenever the pattern ${...} is found in the original link, the part in the braces is compared with the set of ID (preferably) or name attributes of the fields of the table. If a match is found, then the value from that field of the selected row is used in place of the ${...}. If no match is found, no substitution is made. Thus the parser makes available to the calling application a value of the href attribute that depends on which row of the table has been selected. Another way to think of it is that there is not a single link associated with the table, but rather an implicitly defined new column of the table. This mechanism can be used to connect each row of the table to further information resources.
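The substitution filter can be sketched in Python as follows. The function name expand_link is hypothetical, as are the (ID, name) pairs, which were chosen to match the template above. Percent-encoding the substituted values is an assumption suggested by the resulting URL shown above; Python writes the escapes in upper case (%2B rather than %2b), which is equivalent.

```python
import re
from urllib.parse import quote

def expand_link(href, fields, row):
    """Expand ${...} patterns in a LINK href for one table row.

    fields: list of (ID, name) pairs, one per FIELD, in column order.
    row:    list of cell values as text, in the same order.
    """
    def lookup(match):
        key = match.group(1)
        # IDs are tried first (column 0 of each pair), then names.
        for column in (0, 1):
            for index, field in enumerate(fields):
                if field[column] == key:
                    return quote(row[index], safe="")
        return match.group(0)  # no match: leave the pattern untouched
    return re.sub(r"\$\{([^}]*)\}", lookup, href)

# Hypothetical field identifiers and the values of the first row.
fields = [("col1", "RA"), ("col2", "DE"), ("col3", "Name")]
row = ["010.68", "+41.27", "N 224"]
url = expand_link("http://ivoa.net/lookup?Galaxy=${Name}&RA=${RA}&DE=${DE}",
                  fields, row)
```

The href is given here as the parser sees it, i.e. after the &amp; entities of the XML document have been turned back into & characters.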

The purpose of the link is defined by the content-role attribute. The allowed values are "query" (see query mechanism), "hints" for information for use by the application, and "doc" for human-readable documentation. The column names invoked in the pattern of the href attribute of the LINK element should exist in the document to generate meaningful links. In the common case where the VOTable was generated from a query of a database and contains only some of the columns in that database, it might be necessary to include columns additional to those requested in order to ensure that the LINKs in the VOTable are operational. Such a FIELD included "by necessity" is marked with the attribute type="hidden". The primary key of a relational table is a typical example of a FIELD which would carry the type="hidden" attribute.

A.2  VOTable Query Extension

The metadata part included in a RESOURCE contains all the details necessary to create a form for querying the resource. The addition of a link having the action attribute can turn VOTable into a powerful query interface.

In Astrores [1], the details on the input parameters available in queries are described by the PARAM and FIELD elements, and the syntax used to generate the actual query is described in the ASU [6] protocol: the FIELD or PARAM elements are paired in the form name=value, where name is the contents of the name attribute of a FIELD or PARAM, and value represents a constraint written with the ASU conventions (e.g. "<8" or "12.0..12.5" which denotes a range of values). Such pairs are appended to the action specified in the LINK element contained in the RESOURCE, separated by the ampersand (&) symbol, in a way quite similar to the HTML syntax used to describe a FORM.

A special type="no_query" attribute of the PARAM or FIELD elements marks the fields which are not part of the form, i.e. are ignored in the collection of name=value pairs.

The following is an example of a transformation of the VOTable in the example into a form interface:

<?xml version="1.0"?>
<VOTABLE version="1.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xmlns="http://www.ivoa.net/xml/VOTable/v1.2" 
 xmlns:stc="http://www.ivoa.net/xml/STC/v1.30" >
  <RESOURCE name="myFavouriteGalaxies" type="meta">
    <TABLE name="results">
      <DESCRIPTION>Velocities and Distance estimations</DESCRIPTION>
      <GROUP ID="J2000" utype="stc:AstroCoords">
        <PARAM datatype="char" arraysize="*" ucd="pos.frame" name="cooframe"
             utype="stc:AstroCoords.coord_system_id" value="UTC-ICRS-TOPO" />
        <FIELDref ref="col1"/>
        <FIELDref ref="col2"/>
      </GROUP>
      <PARAM name="-out.max" ucd="meta.number" datatype="int" value="50">
        <DESCRIPTION>Maximal number of records to retrieve</DESCRIPTION>
      </PARAM>
      <LINK  content-role="query" action="myQuery?-source=myGalaxies&amp;" />
      <FIELD name="RA"   ID="col1" ucd="pos.eq.ra;meta.main" ref="J2000" 
             utype="stc:AstroCoords.Position2D.Value2.C1"
             datatype="float" width="6" precision="2" unit="deg"/>
      <FIELD name="Dec"  ID="col2" ucd="pos.eq.dec;meta.main" ref="J2000" 
             utype="stc:AstroCoords.Position2D.Value2.C2"
             datatype="float" width="6" precision="2" unit="deg"/>
      <FIELD name="Name" ID="col3" ucd="meta.id;meta.main" 
             datatype="char" arraysize="8*"/>
      <FIELD name="RVel" ID="col4" ucd="spect.dopplerVeloc" datatype="int"
             width="5" unit="km/s"/>
      <FIELD name="e_RVel" ID="col5" ucd="stat.error;spect.dopplerVeloc" 
             datatype="int" width="3" unit="km/s"/>
      <FIELD name="R" ID="col6" ucd="pos.distance;pos.heliocentric" 
             datatype="float" width="4" precision="1" unit="Mpc">
        <DESCRIPTION>Distance of Galaxy, assuming H=75km/s/Mpc</DESCRIPTION>
      </FIELD>
    </TABLE>
  </RESOURCE>
</VOTABLE>

Note that the RESOURCE displaying the parameters accessible for a query has the type="meta" attribute; it is also assumed that only one LINK having the content-role="query" attribute together with an action attribute exists within the current RESOURCE. The PARAM with name="-out.max" has been added in this example to control the size of the result.

A valid query generated by this VOTable could be:

  myQuery?-source=myGalaxies&-out.max=50&R=10..100
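The way such a query is assembled can be sketched in Python as follows. The function name build_query and its arguments are hypothetical, and no escaping of the constraint values is attempted.

```python
def build_query(action, constraints, skipped=()):
    """Append name=value pairs to the action of the query LINK.

    constraints: list of (name, value) pairs, value being an ASU constraint.
    skipped:     names of PARAMs or FIELDs declared with type="no_query".
    """
    pairs = ["%s=%s" % (name, value)
             for name, value in constraints if name not in skipped]
    # The action of the example already ends with '&', so the first
    # pair is appended directly and the others are joined with '&'.
    return action + "&".join(pairs)

query = build_query("myQuery?-source=myGalaxies&",
                    [("-out.max", "50"), ("R", "10..100")])
```

The action string is the value of the attribute after XML parsing, where the &amp; entity has become a plain & character.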

A.3  Arrays of variable-length strings

Following the FITS conventions, strings are defined as arrays of characters. This definition raises problems for the definition of arrays of strings, which have then to be defined as 2D-arrays of characters - but in this case only the slowest-varying dimension (i.e. the number of strings) can be variable. This limitation becomes severe when a table column contains a set of remarks, each being made of a variable number of characters as occurs in practice.

FITS invented the Substring Array convention (defined in an appendix, i.e. not officially approved), which defines a separator character used to denote the end of a string and the beginning of the next one. In this convention (rA:SSTRw/ccc) the total size of the character array is specified by r, w defines the maximum length of one string, and ccc defines the separator character as its ASCII equivalent value. The possible values for the separator include the space and any printable character, but exclude the control characters.

Such arrays of variable-length strings are frequently useful, e.g. to enumerate a list of properties of an observed source, each property being represented by a variable-length string. A convention similar to the FITS one could be introduced in VOTable in the arraysize attribute, using the letter s followed by the separator character; an example would be arraysize="100s," indicating a string of up to 100 characters in which the comma separates the elements of the array.
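
A reader supporting this convention could split a cell as in the following Python sketch. It is only an illustration under simplifying assumptions: the function name split_strings is hypothetical, the separator is taken to be exactly one character, and multi-dimensional sizes are not handled.

```python
import re

# Simplified reading of the proposed arraysize form: an optional maximum
# length, an optional "*", then "s" followed by one separator character.
# Multi-dimensional sizes such as "12x23x*" are deliberately not handled.
ARRAYSIZE = re.compile(r"(?P<size>[0-9]*)[*]?(?:s(?P<sep>.))?")

def split_strings(value, arraysize):
    """Split one cell into its strings according to the proposed arraysize."""
    m = ARRAYSIZE.fullmatch(arraysize)
    if m is None:
        raise ValueError("unsupported arraysize: %r" % arraysize)
    if m.group("size") and len(value) > int(m.group("size")):
        raise ValueError("cell longer than the declared maximum")
    sep = m.group("sep")
    # Without a separator the cell is a single string.
    return value.split(sep) if sep is not None else [value]

props = split_strings("variable,binary,X-ray source", "100s,")
print(props)
```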

A.4  FIELDs as data pointers

Rather than requiring that all data described in the set of FIELDs are contained in a single stream which follows the metadata part, it would be possible to let the FIELD act as a pointer to the actual data, either in the form of a URI or of a reference to a component of a multipart document.

Each component of the data described by a FIELD may effectively have different requirements: while text data or small lists of numbers are quite efficiently represented in pure XML, long lists like spectra or images perform poorly when converted to XML. The method available to gain efficiency is to use a binary representation of the whole data stream by means of the STREAM element - at the price of delivering data in a format that is not human-readable at all.

The following options would allow more flexibility in the way the various FIELDs can be accessed:

Note that the LINK is not required - a FIELD declared with type="location" and containing no LINK element is assumed to contain URIs.

An example of a table describing a set of spectra could look like the following:

<TABLE name="SpectroLog">
  <FIELD name="Target" ucd="meta.id" datatype="char" arraysize="30*"/>
  <FIELD name="Instr" ucd="instr.setup" datatype="char" arraysize="5*"/>
  <FIELD name="Dur" ucd="time.expo" datatype="int" width="5" unit="s"/>
  <FIELD name="Spectrum" ucd="meta.ref.url" datatype="float" arraysize="*"
         unit="mW/m2/nm" type="location">
    <DESCRIPTION>Spectrum absolutely calibrated</DESCRIPTION>
    <LINK type="location" 
        href="http://ivoa.spectr/server?obsno="/>
  </FIELD>
  <DATA><TABLEDATA>
    <TR><TD>NGC6543</TD><TD>SWS06</TD><TD>2028</TD><TD>01301903</TD></TR>
    <TR><TD>NGC6543</TD><TD>SWS07</TD><TD>2544</TD><TD>01302004</TD></TR>
  </TABLEDATA></DATA>
</TABLE>

The reading program therefore has to retrieve the data for the first row by resolving the URI http://ivoa.spectr/server?obsno=01301903
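
The URI construction can be sketched in Python as follows. This is a minimal illustration, assuming a table without XML namespace and reduced to two columns; the function name location_uris is hypothetical.

```python
import xml.etree.ElementTree as ET

# Reduced version of the SpectroLog example (two columns only).
TABLE_XML = """<TABLE name="SpectroLog">
  <FIELD name="Target" datatype="char" arraysize="30*"/>
  <FIELD name="Spectrum" datatype="float" arraysize="*" type="location">
    <LINK type="location" href="http://ivoa.spectr/server?obsno="/>
  </FIELD>
  <DATA><TABLEDATA>
    <TR><TD>NGC6543</TD><TD>01301903</TD></TR>
    <TR><TD>NGC6543</TD><TD>01302004</TD></TR>
  </TABLEDATA></DATA>
</TABLE>"""

def location_uris(table):
    """Return one URI per cell of every FIELD declared type="location"."""
    fields = table.findall("FIELD")
    uris = []
    for row in table.iter("TR"):
        for field, cell in zip(fields, row.findall("TD")):
            if field.get("type") != "location":
                continue
            link = field.find("LINK[@type='location']")
            # Without a LINK, the cell itself is assumed to hold the URI.
            prefix = link.get("href", "") if link is not None else ""
            uris.append(prefix + (cell.text or "").strip())
    return uris

uris = location_uris(ET.fromstring(TABLE_XML))
print(uris[0])
```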

The same method could also be applied directly to Content-IDs, which designate elements of a multipart message, using the protocol prefix cid: [RFC2111].

Note that the VOTable LINK substitution proposed in Appendix A serves a similar purpose: it generates a pointer which can incorporate in its address components from the DATA part of the VOTable.

A.5  Encoding individual table cells

Accessing binary data significantly improves efficiency in both storage and CPU usage, especially compared with the XML-encoded data stream. But binary data cannot be included in the same stream as the metadata description unless a dedicated coding filter is applied to convert the binary data into an ASCII representation. Base64 is the most commonly used filter for this conversion: 3 bytes of data are coded as 4 ASCII characters, which implies a storage overhead of 33% and some (small) computing time for the reverse transformation.

In order to keep the full VOTable document in a unique stream, VOTable 1.0 introduced the encoding attribute in the STREAM element, meaning that the data, stored as binary records, are converted into some ASCII representation compatible with the XML definitions. One drawback of this method is that the entire data contents become non human-readable. The addition of the encoding attribute in the TD element allows the data server to decide, at the cell level, whether it is more efficient to distribute the data as binary-encoded or as edited values. The result may look like the following:

<TABLE name="SpectroLog">
  <FIELD name="Target" ucd="meta.id" datatype="char" arraysize="30*"/>
  <FIELD name="Instr" ucd="instr.setup" datatype="char" arraysize="5*"/>
  <FIELD name="Dur" ucd="time.expo" datatype="int" width="5" unit="s"/>
  <FIELD name="Spectrum" ucd="phot.flux;em.opt" datatype="float" arraysize="*"
         unit="mW/m2/nm" precision="E3"/>
  <DATA><TABLEDATA>
    <TR><TD>NGC6543</TD><TD>SWS06</TD><TD>2028</TD><TD encoding="base64">
    QJKPXECHvndAgMScQHul40CSLQ5ArocrQLxiTkC3XClAq0OWQKQIMUCblYFAh753QGij10BT
    Em9ARKwIQExqf0BqbphAieuFQJS0OUCJWBBAhcrBQJMzM0CmRaJAuRaHQLWZmkCyhytAunbJ
    QLN87kC26XlA1KwIQOu+d0DsWh1A5an8QN0m6UDOVgRAxO2RQM9Lx0Din75A3o9cQMPfO0C/
    dLxAvUeuQKN87kCXQ5ZAjFodQH0vG0B/jVBAgaHLQI7Ag0CiyLRAqBBiQLaXjUDYcrBA8p++
    QPcKPUDg7ZFAwcKPQLafvkDDlYFA1T99QM2BBkCs3S9AjLxqQISDEkCO6XlAmlYEQKibpkC5
    wo9AvKPXQLGBBkCs9cNAuGp/QL0euEC4crBAuR64QL6PXEDOTdNA2987QN9T+EDoMSdA8mZm
    QOZumEDDZFpAmmZmQGlYEEBa4UhAivGqQLel40Dgan9A4WBCQLNcKUCIKPZAk1P4QNWRaEEP
    kWhBKaHLQTkOVkFEan9BUWBCQVyfvg==
    </TD></TR>
  </TABLEDATA></DATA>
</TABLE>

When decoded, the content of the last column is the binary representation of the spectrum, as defined in the BINARY serialization; no length prefix is required here, the total length of the array being implicitly defined by the length of the encoded text.
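
A reader could recover the numbers from such a cell as in the Python sketch below. It assumes that each float is stored as a 4-byte big-endian IEEE value, as in the BINARY serialization; the function name decode_float_cell is hypothetical.

```python
import base64
import struct

def decode_float_cell(text):
    """Decode a base64-encoded TD into a tuple of 4-byte big-endian floats."""
    # Remove the line breaks and indentation used to lay out the XML.
    raw = base64.b64decode("".join(text.split()))
    count, remainder = divmod(len(raw), 4)
    if remainder:
        raise ValueError("decoded length is not a multiple of 4 bytes")
    # The array length follows from the decoded size: no prefix is read.
    return struct.unpack(">%df" % count, raw)

# First 16 characters of the encoded spectrum shown above.
values = decode_float_cell("QJKPXECHvndAgMSc")
print(len(values), values[0])
```

The 16 characters decode to 12 bytes, i.e. three floats, which matches the 4-to-3 ratio of base64.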

A.6  Very large arrays

The BINARY serialization of variable-length arrays (section 5.3) uses a 4-byte prefix containing the number of items of the array. This convention imposes an absolute maximum of 2^31-1 elements. This limit could be lifted with a new arrayprefix attribute.
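
The origin of the limit can be seen in a short Python sketch of the current convention. It assumes an array of datatype int stored big-endian and a prefix read as a signed 4-byte integer, which is consistent with the limit quoted above; the function names are hypothetical.

```python
import struct

MAX_ITEMS = 2**31 - 1  # largest count representable in a signed 4-byte prefix

def pack_int_array(items):
    """Serialize a variable-length int array: 4-byte count, then the items."""
    if len(items) > MAX_ITEMS:
        raise OverflowError("array too long for a 4-byte prefix")
    return struct.pack(">i", len(items)) + struct.pack(">%di" % len(items), *items)

def unpack_int_array(buf):
    """Read the count prefix, then that many 4-byte integers."""
    (count,) = struct.unpack_from(">i", buf, 0)
    return list(struct.unpack_from(">%di" % count, buf, 4))

packed = pack_int_array([10, 20, 30])
print(len(packed), unpack_int_array(packed))
```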

A.7  Additional descriptions and titles

The same table may be used in several contexts; a wish was expressed, for instance, to include in TABLE and FIELD descriptions and titles (captions) in a form suitable for publication (LaTeX), in addition to the ASCII-only descriptions currently acceptable. The following example illustrates this extension:

<TABLE name="Model_A">
  <DESCRIPTION>Star luminosities in Model A</DESCRIPTION>
  <DESCRIPTION context="latex">$L(T_{eff})$ in Model {\bf A}</DESCRIPTION>
  <FIELD name="Teff" datatype="float" unit="K" ucd="phys.temperature.effective">
     <DESCRIPTION>Effective temperature</DESCRIPTION>
     <TITLE context="latex">$T_{eff}$</TITLE>
  </FIELD>
  <FIELD name="Lum" datatype="float" unit="Lsun" ucd="phys.luminosity">
     <DESCRIPTION>Corresponding luminosity in Model A</DESCRIPTION>
     <DESCRIPTION context="latex">$L(T_{eff})$</DESCRIPTION>
     <TITLE context="latex">$L/L_\odot$</TITLE>
  </FIELD>
</TABLE>

In practice this extension would mean that, wherever a DESCRIPTION element is currently acceptable, a set of DESCRIPTION and TITLE elements would become acceptable, each with an additional optional context attribute. The new TITLE element would make explicit the column header of a field or parameter, or supply a caption for a table or a set of tables (resource) in addition to its description.

Providing descriptions in several languages would be another obvious advantage of this extension.
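
An application could select the text suited to its output as in the Python sketch below. The fallback rule (use the element without context when the requested context is absent) is an assumption of this example, not something the proposal specifies, and the function name pick_text is hypothetical.

```python
import xml.etree.ElementTree as ET

# One FIELD of the Model_A example; a raw string keeps the LaTeX backslash.
FIELD_XML = r"""<FIELD name="Lum" datatype="float" unit="Lsun" ucd="phys.luminosity">
  <DESCRIPTION>Corresponding luminosity in Model A</DESCRIPTION>
  <DESCRIPTION context="latex">$L(T_{eff})$</DESCRIPTION>
  <TITLE context="latex">$L/L_\odot$</TITLE>
</FIELD>"""

def pick_text(parent, tag, context=None):
    """Return the text of the `tag` child matching `context`.

    Falls back to the child without a context attribute, or None.
    """
    fallback = None
    for element in parent.findall(tag):
        if element.get("context") == context:
            return element.text
        if element.get("context") is None:
            fallback = element.text
    return fallback

field = ET.fromstring(FIELD_XML)
print(pick_text(field, "DESCRIPTION", "latex"))
```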

A.8  A new XMLDATA serialization

In order to facilitate the use of standard XML query tools which usually require each parameter to have its own individual tag, the XMLDATA serialization introduces the designation of each FIELD by a dedicated tag. An example could look like the following:

<TABLE name="Messier">
  <FIELD name="Number" ID="M" ucd="meta.record" datatype="int" >
    <DESCRIPTION>Messier Number</DESCRIPTION>
  </FIELD>
  <FIELD name="R.A.2000" ID="RA" ucd="pos.eq.ra;meta.main" ref="J2000" 
         unit="deg" datatype="float" width="5" precision="1" />
  <FIELD name="Dec.2000" ID="DE" ucd="pos.eq.dec;meta.main" ref="J2000" 
         unit="deg" datatype="float" width="5" precision="1" />
  <FIELD name="Name" ID="N" ucd="meta.id" datatype="char" arraysize="*">
    <DESCRIPTION>Common name used to designate the Messier object</DESCRIPTION>
  </FIELD>
  <FIELD ID="T" name="Classification" datatype="char" arraysize="10*" 
         ucd="src.class">
     <DESCRIPTION>Classification (galaxy, globular cluster, etc)</DESCRIPTION>
  </FIELD>
  <DATA><XMLDATA>
    <TR>
      <M>3</M>
      <RA>205.5</RA>
      <DE>+28.4</DE>
      <N/>
      <T>Globular Cluster</T>
    </TR>
    <TR>
      <M>31</M>
      <RA>010.7</RA>
      <DE>+41.3</DE>
      <N>Andromeda Galaxy</N>
      <T>Galaxy</T>
    </TR>
  </XMLDATA></DATA>
</TABLE>

The full document would need an XML-Schema definition of the tags M, RA, DE, N and T; these being derived directly from the ID attribute of the FIELD element, their definition can be generated automatically from the set of FIELD definitions.
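
The benefit for generic XML tools can be illustrated with a short Python sketch that selects rows by the content of a column, using only the path syntax of the standard library. The data are those of the example above, without XML namespace; the variable names are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

XMLDATA = """<XMLDATA>
  <TR><M>3</M><RA>205.5</RA><DE>+28.4</DE><N/><T>Globular Cluster</T></TR>
  <TR><M>31</M><RA>010.7</RA><DE>+41.3</DE><N>Andromeda Galaxy</N><T>Galaxy</T></TR>
</XMLDATA>"""

data = ET.fromstring(XMLDATA)

# Each column has its own tag, so a path predicate can address it directly.
galaxies = data.findall("TR[T='Galaxy']")
numbers = [int(row.findtext("M")) for row in galaxies]
print(numbers)
```

With the TABLEDATA serialization the same selection requires counting TD elements to find the right column.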

B  The VOTable/v1.2 XML Schema

The XML Schema corresponding to this version of VOTable (version 1.2) is available from http://www.ivoa.net/xml/VOTable/v1.2. It is also included here for reference.

<?xml version="1.0" encoding="UTF-8"?>
<!--W3C Schema for VOTable  = Virtual Observatory Tabular Format
.Version 1.0 : 15-Apr-2002
.Version 1.09: 23-Jan-2004 Version 1.09
.Version 1.09: 30-Jan-2004 Version 1.091
.Version 1.09: 22-Mar-2004 Version 1.092
.Version 1.094: 02-Jun-2004 GROUP does not contain FIELD
.Version 1.1 :  10-Jun-2004 remove the complexContent
.Version 1.11: GL: 23-May-2006 remove most root elements, use name= type= iso ref= structure
.Version 1.11: GL: 29-Aug-2006 review and added comments (prefixed by GL) 
              before sending to Francois Ochsenbein
.Version 1.12: FO: Preliminary Version 1.2
.Version 1.18: FO: Tested (jax) version 1.2
.Version 1.19: FO: Completed INFO attributes
.Version 1.20: FO: Added xtype; content-role is less restrictive (May2009)
.Version 1.20: FO: PR-20090710 Cosmetics.
-->
<xs:schema 
   xmlns:xs="http://www.w3.org/2001/XMLSchema" 
   xmlns="http://www.ivoa.net/xml/VOTable/v1.2"
   targetNamespace="http://www.ivoa.net/xml/VOTable/v1.2" 
>
<xs:annotation><xs:documentation>
    VOTable1.2 is meant to serialize tabular documents in the
    context of Virtual Observatory applications. This schema
    corresponds to the VOTable document available from
    http://www.ivoa.net/Documents/latest/VOT.html
</xs:documentation></xs:annotation>

<!-- Here we define some interesting new datatypes:
     - anyTEXT   may have embedded XHTML (conforming HTML)
     - astroYear is an epoch in Besselian or Julian year, e.g. J2000
     - arrayDEF  specifies an array size e.g. 12x23x*
     - dataType  defines the acceptable datatypes
     - ucdType   defines the acceptable UCDs (UCD1+)
     - precType  defines the acceptable precisions
     - yesno     defines just the 2 alternatives
-->

<xs:complexType name="anyTEXT" mixed="true">
  <xs:sequence>
    <xs:any minOccurs="0" maxOccurs="unbounded" processContents="skip"/>
  </xs:sequence>
</xs:complexType>

<xs:simpleType  name="astroYear">
  <xs:restriction base="xs:token">
    <xs:pattern  value="[JB]?[0-9]+([.][0-9]*)?"/>
  </xs:restriction>
</xs:simpleType>

<xs:simpleType  name="ucdType">
  <xs:restriction base="xs:token">
    <xs:annotation><xs:documentation>
      Accept UCD1+
      Accept also old UCD1 (but not / + %) including SIAP convention (with :)
    </xs:documentation></xs:annotation>
    <xs:pattern  value="[A-Za-z0-9_.:;\-]*"/><!-- UCD1 use also / + % -->
  </xs:restriction>
</xs:simpleType>

<xs:simpleType  name="arrayDEF">
  <xs:restriction base="xs:token">
    <xs:pattern  value="([0-9]+x)*[0-9]*[*]?(s\W)?"/>
  </xs:restriction>
</xs:simpleType>

<xs:simpleType  name="encodingType">
  <xs:restriction base="xs:NMTOKEN">
    <xs:enumeration value="gzip"/>
    <xs:enumeration value="base64"/>
    <xs:enumeration value="dynamic"/>
    <xs:enumeration value="none"/>
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="dataType">
  <xs:restriction base="xs:NMTOKEN">
    <xs:enumeration value="boolean"/>
    <xs:enumeration value="bit"/>
    <xs:enumeration value="unsignedByte"/>
    <xs:enumeration value="short"/>
    <xs:enumeration value="int"/>
    <xs:enumeration value="long"/>
    <xs:enumeration value="char"/>
    <xs:enumeration value="unicodeChar"/>
    <xs:enumeration value="float"/>
    <xs:enumeration value="double"/>
    <xs:enumeration value="floatComplex"/>
    <xs:enumeration value="doubleComplex"/>
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="precType">
  <xs:restriction base="xs:token">
    <xs:pattern value="[EF]?[1-9][0-9]*"/>
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="yesno">
  <xs:restriction base="xs:NMTOKEN">
    <xs:enumeration value="yes"/>
    <xs:enumeration value="no"/>
  </xs:restriction>
</xs:simpleType>

  <xs:complexType name="Min">
    <xs:attribute name="value" type="xs:string" use="required"/>
    <xs:attribute name="inclusive" type="yesno" default="yes"/>
  </xs:complexType>
  <xs:complexType name="Max">
    <xs:attribute name="value" type="xs:string" use="required"/>
    <xs:attribute name="inclusive" type="yesno" default="yes"/>
  </xs:complexType>
  <xs:complexType name="Option">
    <xs:sequence>
      <xs:element name="OPTION" type="Option" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="name" type="xs:token"/>
    <xs:attribute name="value" type="xs:string" use="required"/>
  </xs:complexType>
  
  <!-- VALUES expresses the values that can be taken by the data 
    in a column or by a parameter
  -->
  <xs:complexType name="Values">
    <xs:sequence>
      <xs:element name="MIN" type="Min" minOccurs="0"/>
      <xs:element name="MAX" type="Max" minOccurs="0"/>
      <xs:element name="OPTION" type="Option" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="ID" type="xs:ID"/>
    <xs:attribute name="type" default="legal">
      <xs:simpleType>
        <xs:restriction base="xs:NMTOKEN">
          <xs:enumeration value="legal"/>
          <xs:enumeration value="actual"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
    <xs:attribute name="null" type="xs:token"/>
    <xs:attribute name="ref"  type="xs:IDREF"/>
    <!-- xs:attribute name="invalid" type="yesno" default="no"/ -->
  </xs:complexType>
  
  <!-- The LINK is a URL (href) or some other kind of reference (gref) -->
  <xs:complexType name="Link">
    <xs:annotation><xs:documentation> 
    content-role was previously restricted as: <![CDATA[
    <xs:attribute name="content-role">
      <xs:simpleType>
        <xs:restriction base="xs:NMTOKEN">
          <xs:enumeration value="query"/>
          <xs:enumeration value="hints"/>
          <xs:enumeration value="doc"/>
          <xs:enumeration value="location"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>]]>; is now a name token.
    </xs:documentation></xs:annotation>
    <xs:attribute name="ID" type="xs:ID"/>
    <xs:attribute name="content-role" type="xs:NMTOKEN"/>
    <xs:attribute name="content-type" type="xs:NMTOKEN"/>
    <xs:attribute name="title" type="xs:string"/>
    <xs:attribute name="value" type="xs:string"/>
    <xs:attribute name="href" type="xs:anyURI"/>
    <xs:attribute name="gref" type="xs:token"/><!-- Deprecated in V1.1 -->
    <xs:attribute name="action" type="xs:anyURI"/>
  </xs:complexType>
  
<!-- INFO is defined in Version 1.2 as a PARAM of String type 
<xs:complexType name="Info">
  <xs:complexContent>
    <xs:restriction base="Param">
      <xs:attribute name="unit" fixed=""/>
      <xs:attribute name="datatype" fixed="char"/>
      <xs:attribute name="arraysize" fixed="*"/>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>
-->
<!-- Rather than a restriction, full definition -->
<xs:complexType name="Info">
  <xs:sequence> 
    <xs:element name="DESCRIPTION" type="anyTEXT" minOccurs="0"/>
    <xs:element name="VALUES" type="Values" minOccurs="0"/>
    <xs:element name="LINK" type="Link" minOccurs="0" maxOccurs="unbounded"/> 
  </xs:sequence>
  <xs:attribute name="name" type="xs:token" use="required"/>
  <xs:attribute name="value" type="xs:string" use="required"/>
  <xs:attribute name="ID" type="xs:ID"/>
  <xs:attribute name="unit" type="xs:token"/>
  <xs:attribute name="xtype" type="xs:token"/>
  <xs:attribute name="ref" type="xs:IDREF"/>
  <xs:attribute name="ucd" type="ucdType"/>
  <xs:attribute name="utype" type="xs:string"/>
</xs:complexType>

<!-- OLD INFO definition:
<xs:complexType name="Info">
  <xs:simpleContent>
    <xs:extension base="xs:string">
      <xs:attribute name="ID" type="xs:ID"/>
      <xs:attribute name="name" type="xs:token" use="required"/>
      <xs:attribute name="value" type="xs:string" use="required"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>
-->

<!-- Expresses the coordinate system we are using --><!-- Deprecated V1.2 -->
<xs:complexType name="CoordinateSystem">
  <xs:annotation><xs:documentation>
    Deprecated in Version 1.2
  </xs:documentation></xs:annotation>
  <xs:simpleContent>
    <xs:extension base="xs:string">
      <xs:attribute name="ID" type="xs:ID" use="required"/>
      <xs:attribute name="equinox" type="astroYear"/>
      <xs:attribute name="epoch" type="astroYear"/>
      <xs:attribute name="system" default="eq_FK5">
        <xs:simpleType>
          <xs:restriction base="xs:NMTOKEN">
            <xs:enumeration value="eq_FK4"/>
            <xs:enumeration value="eq_FK5"/>
            <xs:enumeration value="ICRS"/>
            <xs:enumeration value="ecl_FK4"/>
            <xs:enumeration value="ecl_FK5"/>
            <xs:enumeration value="galactic"/>
            <xs:enumeration value="supergalactic"/>
            <xs:enumeration value="xy"/>
            <xs:enumeration value="barycentric"/>
            <xs:enumeration value="geo_app"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

<xs:complexType name="Definitions">
  <xs:annotation><xs:documentation>
    Deprecated in Version 1.1
  </xs:documentation></xs:annotation>
  <xs:choice minOccurs="0" maxOccurs="unbounded">
    <xs:element name="COOSYS" type="CoordinateSystem"/><!-- Deprecated in V1.2 -->
    <xs:element name="PARAM" type="Param"/>
  </xs:choice>
</xs:complexType>

<!-- FIELD is the definition of what is in a column of the table -->
<xs:complexType name="Field">
  <xs:sequence> <!-- minOccurs="0" maxOccurs="unbounded" -->
    <xs:element name="DESCRIPTION" type="anyTEXT" minOccurs="0"/>
    <xs:element name="VALUES" type="Values" minOccurs="0"/> <!-- maxOccurs="2" -->
    <xs:element name="LINK" type="Link" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
  <xs:attribute name="ID" type="xs:ID"/>
  <xs:attribute name="unit" type="xs:token"/>
  <xs:attribute name="datatype" type="dataType" use="required"/>
  <xs:attribute name="precision" type="precType"/>
  <xs:attribute name="width" type="xs:positiveInteger"/>
  <xs:attribute name="xtype" type="xs:token"/>
  <xs:attribute name="ref" type="xs:IDREF"/>
  <xs:attribute name="name" type="xs:token" use="required"/>
  <xs:attribute name="ucd" type="ucdType"/>
  <xs:attribute name="utype" type="xs:string"/>
  <xs:attribute name="arraysize" type="xs:string"/>
    <!-- GL: is the next deprecated element remaining 
        (is not in PARAM, but will in new model be inherited) 
    -->
  <xs:attribute name="type">
    <!-- type is not in the Version 1.1, but is kept for
         backward compatibility purposes
    -->
    <xs:simpleType>
      <xs:restriction base="xs:NMTOKEN">
        <xs:enumeration value="hidden"/>
        <xs:enumeration value="no_query"/>
        <xs:enumeration value="trigger"/>
        <xs:enumeration value="location"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:attribute>
</xs:complexType>


<!-- A PARAM is similar to a FIELD, but it also has a "value" attribute -->
<!--  GL: implemented here as a subtype as suggested we do in Kyoto. -->
<xs:complexType name="Param">
  <xs:complexContent>
    <xs:extension base="Field">
      <xs:attribute name="value" type="xs:string" use="required"/>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>


<!-- GROUP groups columns; may include descriptions, fields/params/groups -->
<xs:complexType name="Group">
  <xs:sequence>
    <xs:element name="DESCRIPTION" type="anyTEXT" minOccurs="0"/>
<!--  GL I guess I can understand the next choice element as one may (?) 
      really want to group fields and params and groups in a particular order.
-->    
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="FIELDref" type="FieldRef"/> 
      <xs:element name="PARAMref" type="ParamRef"/> 
      <xs:element name="PARAM" type="Param"/> 
      <xs:element name="GROUP" type="Group"/> 
      <!-- GL a GroupRef could remove recursion -->
    </xs:choice>
  </xs:sequence>
  <xs:attribute name="ID"   type="xs:ID"/>
  <xs:attribute name="name" type="xs:token"/>
  <xs:attribute name="ref"  type="xs:IDREF"/>
  <xs:attribute name="ucd"  type="ucdType"/>
  <xs:attribute name="utype" type="xs:string"/>
</xs:complexType>

<!-- FIELDref and PARAMref are references to FIELD or PARAM defined
     in the parent TABLE or RESOURCE -->
<!-- GL This can not be enforced in XML Schema, so why not IDREF in <Group> ?
     In particular if the UCD and utype attributes will NOT be added -->
<xs:complexType name="FieldRef">
  <xs:attribute name="ref" type="xs:IDREF" use="required"/>
  <xs:attribute name="ucd"  type="ucdType"/>
  <xs:attribute name="utype" type="xs:string"/>
</xs:complexType>

<xs:complexType name="ParamRef">
  <xs:attribute name="ref" type="xs:IDREF" use="required"/>
  <xs:attribute name="ucd"  type="ucdType"/>
  <xs:attribute name="utype" type="xs:string"/>
</xs:complexType>

<!-- DATA is the actual table data, in one of three formats -->
<!-- 
  GL in Kyoto we discussed the option of having the specific Data items 
  be subtypes of Data:
-->
<!-- 
<xs:complexType name="Data" abstract="true"/>

<xs:complexType name="TableData">
  <xs:complexContent>
    <xs:extension base="Data">
     ... etc
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
 -->
<xs:complexType name="Data">
  <xs:annotation><xs:documentation>
    Added in Version 1.2: INFO for diagnostics
  </xs:documentation></xs:annotation>
  <xs:sequence>
    <xs:choice>
      <xs:element name="TABLEDATA" type="TableData"/>
      <xs:element name="BINARY" type="Binary"/>
      <xs:element name="FITS" type="FITS"/>
    </xs:choice>
    <xs:element name="INFO" type="Info" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>

<!-- Pure XML data -->
<xs:complexType name="TableData">
  <xs:sequence>
    <xs:element name="TR" type="Tr" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>

<xs:complexType name="Td">
  <xs:simpleContent>
    <xs:extension base="xs:string">
      <!-- xs:attribute name="ref" type="xs:IDREF"/ -->
      <xs:annotation><xs:documentation>
          The 'encoding' attribute is added here to avoid
          problems of code generators which do not properly
          interpret the TR/TD structures.
          'encoding' was chosen because it appears in
          appendix A.5
      </xs:documentation></xs:annotation>
      <xs:attribute name="encoding" type="encodingType"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

<xs:complexType name="Tr">
  <xs:annotation><xs:documentation>
    The ID attribute is added here to the TR tag to avoid 
    problems of code generators which do not properly 
    interpret the TR/TD structures
  </xs:documentation></xs:annotation>
  <xs:sequence>
    <xs:element name="TD" type="Td" maxOccurs="unbounded"/>
  </xs:sequence>
  <xs:attribute name="ID" type="xs:ID"/>
</xs:complexType>

<!-- FITS file, perhaps with specification of which extension to seek to -->
<xs:complexType name="FITS">
  <xs:sequence>
    <xs:element name="STREAM" type="Stream"/>
  </xs:sequence>
  <xs:attribute name="extnum" type="xs:positiveInteger"/>
</xs:complexType>

<!-- BINARY data format -->
<xs:complexType name="Binary">
  <xs:sequence>
    <xs:element name="STREAM" type="Stream"/>
  </xs:sequence>
</xs:complexType>

<!-- STREAM can be local or remote, encoded or not -->
<xs:complexType name="Stream">
  <xs:simpleContent>
    <xs:extension base="xs:string">
      <xs:attribute name="type" default="locator">
        <xs:simpleType>
          <xs:restriction base="xs:NMTOKEN">
            <xs:enumeration value="locator"/>
            <xs:enumeration value="other"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
      <xs:attribute name="href" type="xs:anyURI"/>
      <xs:attribute name="actuate" default="onRequest">
        <xs:simpleType>
          <xs:restriction base="xs:NMTOKEN">
            <xs:enumeration value="onLoad"/>
            <xs:enumeration value="onRequest"/>
            <xs:enumeration value="other"/>
            <xs:enumeration value="none"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
      <xs:attribute name="encoding" type="encodingType" default="none"/>
      <xs:attribute name="expires" type="xs:dateTime"/>
      <xs:attribute name="rights" type="xs:token"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

<!-- A TABLE is a sequence of FIELD/PARAMs and LINKS and DESCRIPTION, 
     possibly followed by a DATA section 
-->
<xs:complexType name="Table">
  <xs:annotation><xs:documentation>
    Added in Version 1.2: INFO for diagnostics
  </xs:documentation></xs:annotation>
  <xs:sequence>
    <xs:element name="DESCRIPTION" type="anyTEXT" minOccurs="0"/>
<!-- GL: why a choice iso for example -->
<!-- 
      <xs:element name="PARAM" type="Param" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="FIELD" type="Field" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="GROUP" type="Group" minOccurs="0" maxOccurs="unbounded"/>
-->
<!-- 
  This could also enforce groups to be defined after the fields and params 
  to which they must have a reference, which is somewhat more logical
-->
    <!-- Added Version 1.2: -->
    <xs:element name="INFO" type="Info" minOccurs="0" maxOccurs="unbounded"/> 
    <xs:choice minOccurs="0" maxOccurs="unbounded"> 
      <xs:element name="FIELD" type="Field"/>
      <xs:element name="PARAM" type="Param"/>
      <xs:element name="GROUP" type="Group"/>
    </xs:choice>
    <xs:element name="LINK" type="Link" minOccurs="0" maxOccurs="unbounded"/>
    <xs:sequence minOccurs="0" maxOccurs="unbounded">  
      <xs:element name="DATA" type="Data"/>
      <xs:element name="INFO" type="Info" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:sequence>
  <xs:attribute name="ID"   type="xs:ID"/>
  <xs:attribute name="name" type="xs:token"/>
  <xs:attribute name="ref"  type="xs:IDREF"/>
  <xs:attribute name="ucd"  type="ucdType"/>
  <xs:attribute name="utype" type="xs:string"/>
  <xs:attribute name="nrows" type="xs:nonNegativeInteger"/>
</xs:complexType>

<!-- RESOURCES can contain DESCRIPTION, (INFO|PARAM|COOSYS), LINK, TABLEs -->
<xs:complexType name="Resource">
  <xs:annotation><xs:documentation>
     Added in Version 1.2: INFO for diagnostics in several places
  </xs:documentation></xs:annotation>
  <xs:sequence>
    <xs:element name="DESCRIPTION" type="anyTEXT" minOccurs="0"/>
    <xs:element name="INFO" type="Info" minOccurs="0" maxOccurs="unbounded"/>
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="COOSYS" type="CoordinateSystem"/><!-- Deprecated in V1.2 -->
      <xs:element name="GROUP" type="Group" />
      <xs:element name="PARAM" type="Param" />
    </xs:choice>
    <xs:sequence minOccurs="0" maxOccurs="unbounded">
      <xs:element name="LINK" type="Link" minOccurs="0" maxOccurs="unbounded"/>
      <xs:choice>
        <xs:element name="TABLE" type="Table" />
        <xs:element name="RESOURCE" type="Resource" />
      </xs:choice>
      <xs:element name="INFO" type="Info" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <!-- Suggested Doug Tody, to include new RESOURCE types -->
    <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
  <xs:attribute name="name" type="xs:token"/>
  <xs:attribute name="ID"   type="xs:ID"/>
  <xs:attribute name="utype" type="xs:string"/>
  <xs:attribute name="type" default="results">
    <xs:simpleType>
      <xs:restriction base="xs:NMTOKEN">
        <xs:enumeration value="results"/>
        <xs:enumeration value="meta"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:attribute>
  <!-- Suggested Doug Tody, to include new RESOURCE attributes -->
  <xs:anyAttribute namespace="##other" processContents="lax"/>
</xs:complexType>

<!-- VOTable is the root element -->
<xs:element name="VOTABLE">
<xs:complexType>
  <xs:sequence>
    <xs:element name="DESCRIPTION" type="anyTEXT" minOccurs="0"/>
    <xs:element name="DEFINITIONS" type="Definitions" minOccurs="0"/><!-- Deprecated -->
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="COOSYS" type="CoordinateSystem"/><!-- Deprecated in V1.2 -->
      <xs:element name="GROUP" type="Group" />
      <xs:element name="PARAM" type="Param" />
      <xs:element name="INFO" type="Info" minOccurs="0" maxOccurs="unbounded"/>
    </xs:choice>
    <xs:element name="RESOURCE" type="Resource" minOccurs="1" maxOccurs="unbounded"/>
    <xs:element name="INFO" type="Info" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
  <xs:attribute name="ID" type="xs:ID"/>
  <xs:attribute name="version">
     <xs:simpleType>
       <xs:restriction base="xs:NMTOKEN">
         <xs:enumeration value="1.2"/>
       </xs:restriction>
     </xs:simpleType>
   </xs:attribute>
</xs:complexType>
</xs:element>

</xs:schema>
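
The pattern facets of the simple types defined in this schema can be tried out with a short Python sketch. It is an approximation only: the patterns are copied from the schema, but Python's regular expressions are not identical to those of XML Schema (in particular \W does not cover exactly the same characters), and the helper name matches is hypothetical.

```python
import re

# Patterns copied from the simple types of the schema above.
PATTERNS = {
    "astroYear": r"[JB]?[0-9]+([.][0-9]*)?",
    "ucdType": r"[A-Za-z0-9_.:;\-]*",
    "arrayDEF": r"([0-9]+x)*[0-9]*[*]?(s\W)?",
    "precType": r"[EF]?[1-9][0-9]*",
}

def matches(type_name, value):
    """True if `value` fits the pattern of the named simple type.

    XML Schema patterns must match the whole value, hence fullmatch.
    """
    return re.fullmatch(PATTERNS[type_name], value) is not None

for type_name, value in [("astroYear", "J2000"), ("arrayDEF", "12x23x*"),
                         ("precType", "E3"), ("ucdType", "pos.eq.ra;meta.main")]:
    print(type_name, value, matches(type_name, value))
```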