
My understanding of UCDs

The way this came up was my question in the plenary UCD session about how we can identify columns within a table uniquely. Basically, the answer was that UCDs will not solve this problem and are not intended to do so. This page will summarise what I now understand as the purpose of UCDs and some of the implications of this.

UCDs as Data Types

A comment was made that UCDs can be considered data types, so that a column in a table has a data type of, say, POS_EQ_RA. I assume that the reasons for treating UCDs as data types are to allow:

  • operations on columns: comparison, addition, subtraction, multiplication, etc., plus specific astronomical operations
  • conversion between data types, e.g. converting between equatorial and galactic coordinates

Do we thus need (or already have) some hierarchical structure of the UCDs based on allowable operations? In normal data types, we have numerical types, subdivided into integral and floating point, subdivided again by storage size, etc.; one can add all numerical types but (generally) cannot add a number and a string (without pre-defining what such an addition will do).

Related to that: should we define the operations that can be performed on the individual data types (UCDs), the rules for those operations given specific types, and the type resulting from such operations?
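One way to picture such rules is a lookup table keyed by operation and operand types. The following is only a minimal sketch in Python: the operations listed, the function name result_type, and the result types POS_ANG_OFFSET and BOOLEAN are all made up for illustration and are not real UCDs or agreed rules.

```python
# Hypothetical rule table: (operation, left UCD, right UCD) -> resulting type.
# POS_ANG_OFFSET and BOOLEAN are invented names used only for illustration.
RULES = {
    ("subtract", "POS_EQ_RA", "POS_EQ_RA"): "POS_ANG_OFFSET",
    ("subtract", "POS_EQ_DEC", "POS_EQ_DEC"): "POS_ANG_OFFSET",
    ("compare", "POS_EQ_RA", "POS_EQ_RA"): "BOOLEAN",
}


def result_type(op, left, right):
    """Return the type produced by applying op to two UCD-typed columns.

    Raises TypeError when no rule is defined, in the same way that adding
    a number and a string fails in an ordinary type system.
    """
    try:
        return RULES[(op, left, right)]
    except KeyError:
        raise TypeError(f"{op} is not defined for {left} and {right}") from None


print(result_type("subtract", "POS_EQ_RA", "POS_EQ_RA"))
```

Under these assumed rules, subtracting one RA column from another is allowed, while adding an RA to a DEC is rejected because no rule exists for it.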

UCDs as Keywords

In this context, the UCD is part of the metadata for a table. It indicates the type of data held in the table, so having POS_EQ_RA associated with a table says that the table includes positional data in equatorial coordinates. That said, maybe the UCD for the table should be POS_EQ instead (since it is unlikely that it will have RA without DEC).

So the idea of being able to query which resources have POS_EQ* makes sense.
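Such a keyword query could be sketched with simple wildcard matching. This is only an illustration in Python: the REGISTRY dictionary, the resource names, and the function resources_with are hypothetical, and the UCDs other than POS_EQ_RA and POS_EQ_DEC are placeholders.

```python
from fnmatch import fnmatchcase

# Hypothetical registry contents: resource name -> UCDs attached to it.
REGISTRY = {
    "cat1": ["POS_EQ_RA", "POS_EQ_DEC", "PHOT_MAG"],
    "cat2": ["POS_GAL_LON", "POS_GAL_LAT"],
    "cat3": ["POS_EQ_RA", "POS_EQ_DEC"],
}


def resources_with(pattern, registry):
    """Return the sorted names of resources having a UCD that matches pattern."""
    return sorted(
        name
        for name, ucds in registry.items()
        if any(fnmatchcase(ucd, pattern) for ucd in ucds)
    )


print(resources_with("POS_EQ*", REGISTRY))
```

With this made-up registry, the pattern POS_EQ* selects cat1 and cat3 but not cat2.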

Unique Column Identification

Given that we cannot use UCDs as unique column identifiers, how do we do this?

It seems that the only possible unique identifier for a column in a table is the resourceID of the table (from the Registry) plus the columnName (for explanation of resourceID, see the discussion on this in the Registry mailing list: http://www.ivoa.net/forum/registry/0091.htm and related messages).

So, to summarise the discussion from the plenary session: a query can be sent to a table using UCDs, column names, or a mixture of both. If a UCD is included in a query, the data source can resolve it if only one column has that UCD, or if several columns have it but exactly one of them carries the MAIN modifier. Otherwise the query will fail.
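The resolution rule can be written down as a short function. This is a sketch in Python under stated assumptions: columns are represented as (column name, UCD, is_main) tuples, with the MAIN modifier modelled as a boolean flag, and the column names RAJ2000 and RAB1950 are invented for the example.

```python
def resolve_ucd(ucd, columns):
    """Resolve a UCD to a single column name, or fail.

    columns is a list of (column_name, ucd, is_main) tuples; representing
    MAIN as a boolean flag is an assumption made for this sketch.
    """
    matches = [col for col in columns if col[1] == ucd]
    if len(matches) == 1:
        # Only one column carries this UCD, so there is no ambiguity.
        return matches[0][0]
    mains = [col for col in matches if col[2]]
    if len(mains) == 1:
        # Several columns share the UCD, but exactly one is marked MAIN.
        return mains[0][0]
    raise LookupError(f"cannot resolve {ucd}: {len(matches)} candidate column(s)")


# Hypothetical table with two RA columns, one of them marked MAIN.
table = [
    ("RAJ2000", "POS_EQ_RA", True),
    ("RAB1950", "POS_EQ_RA", False),
    ("DEJ2000", "POS_EQ_DEC", False),
]
print(resolve_ucd("POS_EQ_RA", table))
print(resolve_ucd("POS_EQ_DEC", table))
```

A UCD that matches no column, or that matches several columns with none or more than one marked MAIN, raises LookupError, which corresponds to the query failing.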

Example query

A possible query structure (using xml-ised SQL) would be:

<query>
  <from>
    <resourceID asName="cat1">
      <authorityID>...</authorityID>
      <resourceKey>...</resourceKey>
    </resourceID>
    <resourceID asName="cat2">
      <authorityID>...</authorityID>
      <resourceKey>...</resourceKey>
    </resourceID>
  </from>
  <select>
    <field asName="pos-ra" ucd="POS_EQ_RA" />
    <field ucd="POS_EQ_DEC">
      <useColumn columnName="DEJ2000" inResource="cat2" />
    </field>
    <field ucd="..." />
  </select>
  <where> ... </where>
</query>

The asName attribute makes it possible to refer to an item later in the query structure. The ucd attribute is self-explanatory. The key aspect of the above query is the <useColumn ...> tag inside the field tag, which allows a column to be identified where the UCD is not unique.
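To show how a data source might read the select part of such a query, here is a rough sketch using Python's standard xml.etree.ElementTree module. The function name selected_fields and the dictionary layout it returns are assumptions made for the illustration, and the elided "..." values are left as they are.

```python
import xml.etree.ElementTree as ET

# A cut-down version of the example query; elided values stay elided.
QUERY = """
<query>
  <from>
    <resourceID asName="cat1">
      <authorityID>...</authorityID>
      <resourceKey>...</resourceKey>
    </resourceID>
    <resourceID asName="cat2">
      <authorityID>...</authorityID>
      <resourceKey>...</resourceKey>
    </resourceID>
  </from>
  <select>
    <field asName="pos-ra" ucd="POS_EQ_RA" />
    <field ucd="POS_EQ_DEC">
      <useColumn columnName="DEJ2000" inResource="cat2" />
    </field>
  </select>
</query>
"""


def selected_fields(query_xml):
    """Return one dict per <field>, including any explicit <useColumn> override."""
    root = ET.fromstring(query_xml)
    fields = []
    for field in root.findall("./select/field"):
        use = field.find("useColumn")
        fields.append({
            "ucd": field.get("ucd"),
            "as_name": field.get("asName"),
            # column and resource are None when the UCD alone must be resolved.
            "column": use.get("columnName") if use is not None else None,
            "resource": use.get("inResource") if use is not None else None,
        })
    return fields


for entry in selected_fields(QUERY):
    print(entry)
```

A field with no useColumn child leaves the column to be resolved from the UCD alone, while a field that has one names the column and resource explicitly.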

Topic revision: r9 - 2003-05-16 - TonyLinde
 