VO data types
A review of the data types defined in the VO specifications.
Specifically, it looks at the relationships between types, attributes and columns with similar names in different standards, and how they relate to each other.
VODataService
The
VODataService specification defines an XML schema for describing data collections and the services that access them.
This review refers to
version 1.1 (20101202) of the specification.
The data types defined in
VODataService are intended to be used to describe the data in VO data sets and the services and protocols used to access them.
DataType
The
DataType XML element is defined in
section 3.5 (Data Parameters) of the
VODataService specification.
DataType defines the following attributes:
The
DataType arraysize
attribute is defined in
section 3.5 (Data Parameters) of the
VODataService specification.
The specification text describes the
arraysize
attribute as follows:
- "The arraysize attribute indicates the parameter is an array of values of the named type."
- "the VOTable arraysize format (vs:ArrayShape): LxMxN..., where each x-delimited positive integer is a length along a dimension of a multi-dimensional array. A single integer indicates a one dimensional array. Instead of an integer, the last length can be set to "*" which indicates a variable length."
- "The attribute's presence indicates that parameter holds an array values; the attribute's value indicates the length of the array along each dimension of the multi-dimensional array."
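The format described in the quotes above can be sketched as a small parser. This is a minimal illustration based only on the quoted text, not a reference implementation; the function name parse_arraysize and the use of None to mark a variable length are our own assumptions. It does not handle any extended forms that other specifications may allow.

```python
def parse_arraysize(value):
    """Parse an 'LxMxN...' arraysize string into a list of dimension lengths.

    A variable length ('*') is returned as None (an assumption of this sketch).
    """
    dims = []
    parts = value.strip().split("x")
    for index, part in enumerate(parts):
        is_last = index == len(parts) - 1
        if part == "*":
            # The quoted text only allows '*' in place of the last length.
            if not is_last:
                raise ValueError("'*' is only allowed as the last length")
            dims.append(None)
        elif part.isascii() and part.isdigit() and int(part) > 0:
            dims.append(int(part))
        else:
            raise ValueError(f"invalid length {part!r} in arraysize {value!r}")
    return dims
```

For example, parse_arraysize("2x3x*") gives two fixed dimensions followed by a variable one, and parse_arraysize("8") gives a one dimensional array of length 8.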
DataType delim
The
DataType delim
attribute is defined in
section 3.5 (Data Parameters) of the
VODataService specification.
The specification text describes the
delim
attribute as follows:
- "the string that is used to delimit element of an array value when arraysize is not "1""
The specification text does not define a default value for the
delim
attribute.
The specification text encourages applications to allow optional spaces before and after the delimiter (e.g. "1, 5" when delim=",").
The XML schema defines a default value as a single white space " ".
<xs:attribute name="delim" type="xs:string" default=" ">
The comments in the XML schema encourage applications to allow optional spaces before and after the delimiter (e.g. "1, 5" when delim=",").
The XML schema itself does not attempt to encode that in XML schema notation.
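One way a client might apply the delim attribute, including the tolerance for optional spaces, is sketched below. The function name split_array_value is hypothetical, and treating a run of white space as a single separator when the delimiter is the schema default is our assumption; the specification does not say this explicitly.

```python
def split_array_value(text, delim=" "):
    """Split a serialized array value into its element strings.

    The default delimiter matches the schema default of a single space.
    """
    if delim.strip() == "":
        # White space delimiter: treat any run of white space as one separator
        # (an assumption of this sketch).
        return text.split()
    # Other delimiters: allow optional spaces before and after each element.
    return [item.strip() for item in text.split(delim)]
```

With this sketch, "1, 5" and "1,5" both split into the same two elements when delim=",".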
All of the examples we have found in the VO specifications use white space as the delimiter:
- VOTable uses space as the delimiter for arrays of numeric values.
- POINT
- POLYGON
The
delim
attribute is not referred to by any of the other VO specifications.
DataType extendedType
The
DataType extendedType
attribute is defined in
section 3.5 (Data Parameters) of the
VODataService specification.
The specification text describes the
extendedType
attribute as follows:
- "The data value represented by this type can be interpreted as of a custom type identified by the value of this attribute. "
- "The name implies a particular expected format for the data value that can be parsed into a value in memory."
- " If an application does not recognize this extendedType, it should attempt to handle value assuming the type given by the element's value. "string" (or its equivalent) is a recommended default type."
- " This element may make use of the extendedSchema attribute and/or any arbitrary (qualified) attribute to refine the identification of the type. "
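The fallback behaviour described in the quotes above could be sketched as follows. Everything named here is hypothetical: the specification does not define any extendedType values, so "example:csv-ints" and both lookup tables exist only to illustrate the idea of trying the custom type first and then falling back to the base type, with "string" as the final default.

```python
# Parsers for base types given by the element's value (illustrative subset).
BASE_PARSERS = {"int": int, "double": float, "string": str}

# Parsers for custom types; "example:csv-ints" is a made-up identifier.
EXTENDED_PARSERS = {
    "example:csv-ints": lambda text: [int(item) for item in text.split(",")],
}


def parse_value(text, base_type, extended_type=None):
    """Parse text using the extended type if recognized, else the base type."""
    parser = EXTENDED_PARSERS.get(extended_type)
    if parser is None:
        # Unrecognized extendedType: fall back to the base type,
        # and to a plain string if the base type is unknown too.
        parser = BASE_PARSERS.get(base_type, str)
    return parser(text)
```

An application that does not recognize the extended type still gets a usable value, which is the behaviour the specification text asks for.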
Looking at the body of standards as a whole, we assume that the
extendedType
attribute is functionally equivalent to the
xtype attribute defined in the
something specification.
However, as far as we can tell, this is not explicitly stated anywhere, and there is no mapping defined between the extendedType and extendedSchema attribute pair defined in VODataService and the xtype attribute with a prefix defined in the something specification.
The
VODataService specification does not provide an example of how the
extendedType
attribute could be used.
The
extendedType
attribute is not referred to by any of the other VO specifications.
DataType extendedSchema
The
DataType extendedSchema
attribute is defined in
section 3.5 (Data Parameters) of the
VODataService specification.
The specification text describes the
extendedSchema
attribute as follows:
- "An identifier for the schema that the value given by the extended attribute is drawn from."
The
VODataService specification does not provide an example of how the
extendedSchema
attribute could be used.
The
extendedSchema
attribute is not referred to by any of the other VO specifications.
TableDataType
The
TableDataType XML element is defined in
section 3.5.3 (Table Column Data Types) of the
VODataService specification.
TableDataType extends
DataType.
VOTableType
The
VOTableType XML element is defined in
section 3.5.3 (Table Column Data Types) of the
VODataService specification.
VOTableType inherits the following attributes from
DataType:
VOTableType defines the following set of allowed values:
- boolean
- bit
- unsignedByte
- short
- int
- long
- char
- unicodeChar
- float
- double
- floatComplex
- doubleComplex
The specification text describes
VOTableType as follows:
- "data types that correspond to the parameter and column types defined in the VOTable schema"
The XML schema comments describe
VOTableType as follows:
- "a data type supported explicitly by the VOTable format".
The definition of
VOTableType does not provide any further details about the sizes, ranges or content of the data types. It is left to the reader to refer to the
VOTable specification for details about the data types.
Note - the bibliography reference to the VOTable specification explicitly refers to version 1.2 (20091130) of the specification; this has since been superseded by version 1.3 (20130920).
The definition of VOTableType states that string values of arbitrary length are represented by a data type of char with arraysize="*". This excludes the option of using unicodeChar as the data type with arraysize="*". It may be clearer to state explicitly that ASCII strings are represented by char with arraysize="*" and Unicode strings are represented by unicodeChar with arraysize="*".
TAPDataType
TAPDataType is an XML element defined in the
VODataService specification that describes a base class for data types defined in the
TAP ADQL specification.
TAPDataType defines the following attributes:
Note - the XML element name reflects the historical situation where the data types were originally defined in the
TAP specification. The data type definitions have since been moved to the
ADQL specification, but for compatibility reasons, the XML element name has not been changed.
TAPDataType size
The
size
attribute is defined as an attribute of the
TAPDataType XML element in the
VODataService specification.
The
VODataService specification describes the
size
attribute as follows:
- "The length of the variable-length data type."
- "In the context of TAP, this attribute is only meaning when the data type is CHAR or BINARY; see discussion below."
This restriction seems to imply that
CHAR
and
BINARY
values have an inherent
'size' property, and are not treated as arrays of values, which have a different
'arraysize' property.
In the discussion that follows, the
VODataService specification gives two examples which are equivalent:
<dataType xsi:type="vs:VOTableType" arraysize="*"> char </dataType>
and
<dataType xsi:type="vs:TAPType"> VARCHAR </dataType>
and a third example that describes a fixed length string, using the
size
rather than the
arraysize
attribute:
<dataType xsi:type="vs:TAPType" size="8" > CHAR </dataType>
However, the
VODataService specification does not explicitly explain the difference (if any) between:
<dataType xsi:type="vs:TAPType" size="8" > CHAR </dataType>
and
<dataType xsi:type="vs:TAPType" arraysize="8" > CHAR </dataType>
This distinction between CHAR, VARCHAR and BINARY values with a 'size' property, and arrays of numeric values with an 'arraysize' property, is possibly left over from previous versions of the VO specifications.
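To make the ambiguity concrete, the sketch below parses the four dataType examples with the Python standard library and reports which attributes each one carries. The wrapper element and the describe function are our own scaffolding, added only so that the xsi prefix used in the fragments is declared; this is an illustration, not part of any VO specification.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

SNIPPETS = [
    '<dataType xsi:type="vs:VOTableType" arraysize="*"> char </dataType>',
    '<dataType xsi:type="vs:TAPType"> VARCHAR </dataType>',
    '<dataType xsi:type="vs:TAPType" size="8" > CHAR </dataType>',
    '<dataType xsi:type="vs:TAPType" arraysize="8" > CHAR </dataType>',
]


def describe(snippet):
    """Return the type name and length-related attributes of a dataType fragment."""
    # Wrap the fragment so that the xsi namespace prefix is declared.
    wrapped = f'<root xmlns:xsi="{XSI}">{snippet}</root>'
    element = ET.fromstring(wrapped)[0]
    return {
        "schema_type": element.get(f"{{{XSI}}}type"),
        "name": element.text.strip(),
        "size": element.get("size"),
        "arraysize": element.get("arraysize"),
    }


for snippet in SNIPPETS:
    print(describe(snippet))
```

A client reading the last two fragments sees the same type name, CHAR, with the length 8 arriving through two different attributes, and has no stated rule for whether they mean the same thing.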
The documentation element in the XML schema for
TAPDataType describes the
size
attribute as follows:
- "This corresponds to the size Column attribute in the TAP_SCHEMA and can be used with data types that are defined with a length (CHAR, BINARY)."
This establishes a 'forward' link from
TAPDataType in the
VODataService specification to
TAP_SCHEMA.columns in the
TAP specification.
The
TAP_SCHEMA.columns table contains a
size
column. The text in the current working draft of the
TAP specification describes this column as
"retained for backwards compatibility to TAP-1.0".
The original text in version 1.0 of the
TAP specification describes the
size
column as follows:
- "The “size” gives the length of variable length datatypes, for example varchar(256);"
Neither version of the
TAP specification contains a 'backward' link between the
TAPDataType size
attribute and the
size
column in the
TAP_SCHEMA.columns table.
The
size
attribute is not referred to by any of the other VO specifications.
TAPType
The
TAPType XML element is defined in
section 3.5.3 (Table Column Data Types) of the
VODataService specification.
TAPType describes data types defined in the
TAP ADQL specification.
TAPType inherits the following attributes from
DataType:
TAPType inherits the following attributes from
TAPDataType:
TAPType defines the following set of allowed values:
- BOOLEAN
- SMALLINT
- INTEGER
- BIGINT
- REAL
- DOUBLE
- TIMESTAMP
- CHAR
- VARCHAR
- BINARY
- VARBINARY
- POINT
- REGION
- CLOB
- BLOB
Notes:
TAPType is described in
section 3.5.3 of the specification.
The XML element name reflects the historical situation where the data types were originally defined in the
TAP specification. The data type definitions have since been moved to the
ADQL specification, but for compatibility reasons, the XML element name has not been changed.
The definition of
TAPType does not provide any further details about the sizes, ranges or content of the data types.
It is left to the reader to refer to the
TAP ADQL specification for details about the data types.
The text at the end of
section 3.5.3 on Table Column Data Types refers to
a mapping between
TAP_SCHEMA
types and VOTable types in the
TAP specification.
"Note that the TAP standard [TAP] defines an explicit mapping between TAP_SCHEMA types and VOTable types."
This mapping is no longer part of the
TAP specification.
The definition of TAPType states that string values should be represented by a data type of VARCHAR, but it does not say whether this should be accompanied by a size or arraysize attribute.
VOTable
The
VOTable specification defines a common data exchange format for tabular data within the VO.
VOTableTypes
VOTableArrays
The
VOTable specification and XML schema include an
arraysize
attribute, but it does not include a
delim
attribute.
Section 2.2 of the
VOTable specification describes arrays of values using the
arraysize
attribute.
However, it does not mention anything about a delimiter.
Section 5.1 of the
VOTable specification describes the
TABLEDATA
serialization of arrays as follows:
"If a cell contains an array of numbers or a complex number, it should be encoded as multiple numbers separated by whitespace. However in the case of character and Unicode strings (declared in the corresponding FIELD as an array of char or unicodeChar datatype), no separator should exist."
It uses the following example to illustrate this:
<TABLE>
<FIELD name="aString" datatype="char" arraysize="10"/>
<FIELD name="aShort" datatype="short"/>
<FIELD name="varInts" datatype="int" arraysize="*"/>
<FIELD name="Floats" datatype="float" arraysize="3"/>
<DATA><TABLEDATA>
<TR> <TD>Apple</TD> <TD/> <TD>1 2 4 8 16</TD> <TD>1.62 4.56 3.44</TD> </TR>
<TR> <TD>Orange</TD> <TD>15</TD> <TD>23 -11 9</TD> <TD>2.33 4.66 9.53</TD> </TR>
</TABLEDATA></DATA>
</TABLE>
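The serialization rule quoted above can be sketched as a reader for this example table. This is a minimal illustration using the Python standard library; the names CONVERTERS, parse_cell and parse_table are our own, complex types are not handled, and returning None for an empty cell is an assumption of the sketch rather than something taken from the quoted text.

```python
import xml.etree.ElementTree as ET

TABLE_XML = """
<TABLE>
<FIELD name="aString" datatype="char" arraysize="10"/>
<FIELD name="aShort" datatype="short"/>
<FIELD name="varInts" datatype="int" arraysize="*"/>
<FIELD name="Floats" datatype="float" arraysize="3"/>
<DATA><TABLEDATA>
<TR> <TD>Apple</TD> <TD/> <TD>1 2 4 8 16</TD> <TD>1.62 4.56 3.44</TD> </TR>
<TR> <TD>Orange</TD> <TD>15</TD> <TD>23 -11 9</TD> <TD>2.33 4.66 9.53</TD> </TR>
</TABLEDATA></DATA>
</TABLE>
"""

# Numeric datatypes covered by this sketch (complex types are omitted).
CONVERTERS = {"short": int, "int": int, "long": int, "float": float, "double": float}


def parse_cell(text, datatype, arraysize):
    """Convert the text of one TD cell according to its FIELD declaration."""
    if text is None or text.strip() == "":
        return None  # empty cell, treated as a missing value in this sketch
    if datatype in ("char", "unicodeChar"):
        return text  # strings are arrays of characters with no separator
    convert = CONVERTERS[datatype]
    if arraysize is None:
        return convert(text)
    # Arrays of numbers are separated by white space.
    return [convert(token) for token in text.split()]


def parse_table(xml_text):
    """Return one dict per TR row, keyed by FIELD name."""
    table = ET.fromstring(xml_text.strip())
    fields = table.findall("FIELD")
    rows = []
    for tr in table.iter("TR"):
        cells = tr.findall("TD")
        rows.append({
            field.get("name"): parse_cell(
                td.text, field.get("datatype"), field.get("arraysize")
            )
            for field, td in zip(fields, cells)
        })
    return rows
```

Note that the reader never needs a delimiter attribute: white space is hard-wired for numeric arrays, and character arrays are read as a single string.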
DALI
The
DALI specification defines ...
TAP
The
TAP specification defines ...
ADQL
The
ADQL specification defines ...
xtype
The
xtype
attribute is defined in ...
The
xtype
attribute is referred to in ...
TAP_SCHEMA
The
TAP_SCHEMA
tables are defined in ...
The
TAP_SCHEMA
tables are referred to in ...
Proposed changes
Mark the VODataService size attribute as deprecated and update the documentation to reflect this.