Embedding STC in VOTable, Discussion Archive

This is an archive of things that were once on STCInVOTable. See there for details. Please append newer things at the bottom.

Change 1: Reverse References

Instead of having utype and ref on FIELD, put groups into the AstroCoords group:

<GROUP ID="lltoush_coo" ref="lltoush"
     utype="stc:AstroCoords">
   <GROUP ref="alpha"
     utype="stc:AstroCoords.Position2D.Value2.C1" />
   <GROUP ref="rv"
     utype="stc:AstroCoords.Redshift.Value" />
 </GROUP>

(or use FIELDrefs that, I'm told, can now take utypes as well).
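
A minimal sketch of how a reader might resolve such reverse references, assuming Python's standard xml.etree.ElementTree and a namespace-free fragment. The TABLE wrapper, the FIELD definitions, and the function name fields_by_utype are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the FIELD elements and the TABLE wrapper are
# assumptions for illustration; only the AstroCoords group follows the proposal.
SAMPLE = """
<TABLE>
  <GROUP ID="lltoush_coo" ref="lltoush" utype="stc:AstroCoords">
    <GROUP ref="alpha" utype="stc:AstroCoords.Position2D.Value2.C1"/>
    <GROUP ref="rv" utype="stc:AstroCoords.Redshift.Value"/>
  </GROUP>
  <FIELD ID="alpha" name="raj2000" datatype="double"/>
  <FIELD ID="rv" name="radvel" datatype="double"/>
</TABLE>
"""


def fields_by_utype(table):
    """Map each STC utype inside AstroCoords groups to the FIELD it references."""
    fields = {f.get("ID"): f for f in table.iter("FIELD") if f.get("ID")}
    mapping = {}
    for coords in table.iter("GROUP"):
        if coords.get("utype") != "stc:AstroCoords":
            continue
        for child in coords:
            # Accept nested GROUPs as in the example above, or FIELDrefs.
            if child.tag in ("GROUP", "FIELDref") and child.get("ref") in fields:
                mapping[child.get("utype")] = fields[child.get("ref")]
    return mapping


mapping = fields_by_utype(ET.fromstring(SAMPLE))
for utype, field in mapping.items():
    print(utype, "->", field.get("name"))
```

With this layout a library only needs to look at groups carrying STC utypes; the FIELDs themselves are never inspected for STC information.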

Rationale:

  • Keep STC information confined to STC groups (helps libraries)
  • Don't clobber utype and ref on FIELDs to preserve them for other, less generic purposes

Impact on Functionality:

As far as I can see, none. You need one AstroCoords group per set of coordinates either way. -- MD

Comments

I agree with this proposed change. As a matter of fact, it is the way STC was intended to function in VOTable (albeit as an imported schema, not through utypes). See the examples I post at the bottom of the page.

-- ArnoldRots - 01 Dec 2009

I agree and think that FIELDref-s SHOULD be used. It is the most logical way to add extra information about a field. The utype on the field is then freed up for pointing into other, possibly more semantically meaningful models, such as "is a position of a galaxy".

-- GerardLemson - 11 Mar 2010

Change 2: Flat systems

Just have all utype/value (belonging to one coordinate system definition) params as direct children of the AstroCoordSystem group.

<GROUP ID="lltoush" utype="stc:AstroCoordSystem">
  <PARAM arraysize="*" datatype="char" value="VELOCITY"
     utype="stc:AstroCoordSystem.RedshiftFrame.value_type" />
  <PARAM arraysize="*" datatype="char" value="ICRS"
     utype="stc:AstroCoordSystem.SpaceFrame.CoordRefFrame" />
</GROUP>
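
A minimal sketch of what reading such a flat group could look like, assuming Python's standard xml.etree.ElementTree and a namespace-free fragment. The function name read_flat_system is hypothetical, and rejecting duplicate utypes is an assumption about how a reader might behave, not part of the proposal.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<GROUP ID="lltoush" utype="stc:AstroCoordSystem">
  <PARAM arraysize="*" datatype="char" value="VELOCITY"
     utype="stc:AstroCoordSystem.RedshiftFrame.value_type" />
  <PARAM arraysize="*" datatype="char" value="ICRS"
     utype="stc:AstroCoordSystem.SpaceFrame.CoordRefFrame" />
</GROUP>
"""


def read_flat_system(group):
    """Return (system id, {utype: value}) for a flat AstroCoordSystem group."""
    pairs = {}
    for param in group.findall("PARAM"):
        utype = param.get("utype")
        if utype in pairs:
            # Assumed policy: a utype may occur only once per system.
            raise ValueError("duplicate utype in system: %s" % utype)
        pairs[utype] = param.get("value")
    return group.get("ID"), pairs


system_id, system = read_flat_system(ET.fromstring(SAMPLE))
print(system_id, system)
```

Since the frame name is already part of each utype, a single dictionary per system holds everything the nested subgroups would have carried.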

Rationale

  • Flat is better than nested (try python -c 'import this').
  • The additional nesting adds no information, probably doesn't really help implementations or humans with parsing, and complicates writing.

Impact on Functionality

None that I can see. Were these groups meant as a service for humans? -- MD

Comment

I don't think this will work, except for the simplest tables. It does not allow for multiple coordinate systems, reusing coordinate systems, or using elements that contain AstroCoordSystem elements. See the CSC example that I will be posting at the bottom of the page.

-- ArnoldRots - 01 Dec 2009

Uh -- I notice I was not particularly clear. There is one group each for every AstroCoordSystem, of course. I'm just suggesting to drop the subgroups within (XFrame). For the CSC example, I can't see where that would fail, and actually, it should not in any setting, by virtue of the data model requiring zero or one of each XFrame and the frame name being a part of the utype already. -- MD 2009-12-02

Change 3a: Do not abuse xml namespace declarations

Don't pretend the stc: in the utype has anything to do with an XML namespace.

So, strike the xmlns:stc declaration on VOTABLE:

<VOTABLE version="1.2" xmlns:xsi="http://www.w3.org/2
 xmlns="http://www.ivoa.net/xml/VOTable/v1.2">

Rationale

  • While syntactically legal, declaring namespaces that are not used within the document is a dangerous practice -- XML tools can and do discard these. Also, the stc in the namespace declaration has, from an XML point of view, nothing to do with the stc in the utype attribute value since that value is not declared to hold a QName.
  • The "package name" is supposed to be the fixed thing (to keep utypes opaque). This is incompatible with XML namespaces.
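
A small sketch of the first point, using Python's standard xml.etree.ElementTree as one example of an XML tool. The stc namespace URI in the sample is a placeholder, not the real STC namespace.

```python
import xml.etree.ElementTree as ET

# The stc namespace URI below is a placeholder, not the real STC namespace.
SAMPLE = (
    '<VOTABLE version="1.2" xmlns:stc="http://example.org/stc"'
    ' xmlns="http://www.ivoa.net/xml/VOTable/v1.2">'
    '<GROUP utype="stc:AstroCoords"/>'
    '</VOTABLE>'
)

root = ET.fromstring(SAMPLE)
group = root[0]

# The attribute value is an opaque string; the parser does not resolve "stc:".
utype = group.get("utype")

# Round-tripping through ElementTree keeps only namespaces used by element
# or attribute names, so the unused xmlns:stc declaration is lost.
round_tripped = ET.tostring(root, encoding="unicode")
print(utype)
print(round_tripped)
```

After the round trip the utype still reads stc:AstroCoords, but nothing in the document says what stc was supposed to mean any more.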

Impact on Functionality

The VOTable has no way to define which version of the STC data model the utypes refer to. I would say this is desirable since versioned meanings will lead to hell either way, but see Change 3b for a fix. -- MD

Comment

I have no(t yet an) opinion on this. It does sound reasonable.

-- ArnoldRots - 01 Dec 2009

Change 3b: Define DM Version using UCDs

In every AstroCoordSystem group, declare what version of the DM you are using. We may make that optional or a strong recommendation or something like that.

The version of the AstroCoords group would be implied via its ref.

<GROUP ID="lltoush" utype="stc:AstroCoordSystem">
  <PARAM utype="stc:" value="http://.../stc-v1.30#"/>
  <PARAM arraysize="*" datatype="char" value="VELOCITY"
     utype="stc:AstroCoordSystem.RedshiftFrame.value_type" />

Rationale

  • This provides a link to the exact data model used to define the utypes.
  • Some mechanism like this will be employed by the utype group.

Impact on Functionality

  • We/someone should maintain explanations for all the utypes at the URLs resulting from glueing together the model URI and the de-packaged utype.
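
The gluing could be sketched as follows. The function name utype_doc_url is hypothetical, and so is the model URI, since the real one is elided in the example above.

```python
def utype_doc_url(model_uri, utype):
    """Glue a model URI to a utype with its package prefix removed."""
    package, sep, local = utype.partition(":")
    if not sep or not local:
        raise ValueError("utype has no package prefix: %r" % utype)
    return model_uri + local


# Hypothetical model URI; the real one is elided in the example above.
MODEL_URI = "http://example.org/stc-v1.30#"
url = utype_doc_url(MODEL_URI, "stc:AstroCoordSystem.RedshiftFrame.value_type")
print(url)
```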

Comment

Sounds reasonable; but you need a name and a datatype as well.

-- ArnoldRots - 01 Dec 2009

Change 4: Only allow string values

Define that all STC PARAMs are datatype="char" arraysize="*".

Rationale

  • As far as I can see, there are no very reliable serialization rules for param values in VOTable anyway -- MD
  • Provides the easiest way to unambiguously define the utype serialization by pointing to the STC-X schema. -- MD
  • Without this, libraries have to keep a mapping from "known utypes" to their types. This is not hard, but not very nice either. We'd have to derive the type/serialization rules from STC-X either way. -- MD

Impact on Functionality

  • It's much easier to pass STC info through correctly, e.g., if a tool only understands a subset of STC.
  • For tools knowing a certain utype, probably none; they'll have some custom way of de-/serializing their internal values anyway. -- MD
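
A sketch of the pass-through argument, assuming all values arrive as strings. The function name interpret, the second utype, and its float converter are hypothetical.

```python
def interpret(pairs, converters):
    """Convert the values a tool understands; keep the rest as strings."""
    interpreted = {}
    passed_through = {}
    for utype, literal in pairs.items():
        convert = converters.get(utype)
        if convert is None:
            passed_through[utype] = literal  # forwarded unchanged
        else:
            interpreted[utype] = convert(literal)
    return interpreted, passed_through


# The second utype and its float converter are hypothetical.
pairs = {
    "stc:AstroCoordSystem.SpaceFrame.CoordRefFrame": "ICRS",
    "stc:Example.SomeNumber": "1.5",
}
known = {"stc:Example.SomeNumber": float}
interpreted, passed_through = interpret(pairs, known)
print(interpreted, passed_through)
```

A tool that only knows a subset of STC can write the passed-through pairs back verbatim, without needing a table of types for utypes it does not understand.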

Comment

I haven't thought about the repercussions of this, yet. On the face of it, it sounds not unreasonable, but on the other hand, since the data type has to be given as a parameter, I don't see allowing more data types as much of a complication. I only wish that PARAM were more reasonable in the types it allows - particularly 'string' would be useful - and, of course, ISO-8601.

-- ArnoldRots - 01 Dec 2009

CSC Cone Search Examples

As it so happened, I had recently prepared the STC-specific stuff for a VOTable 1.1/1.2 that presents data returned by a simple cone search query to the Chandra Source Catalog. Then I modified that one to comply with Changes 1 and 3 above. There is nothing like a real life example to bring out the problems. I think it shows what is problematic about Change 2. Here are the Version 1.1 and the MD-modified Version of the example.

-- ArnoldRots - 01 Dec 2009


Revision, Draft 1

I have prepared a revision of the note, and while doing so I realized that things become quite a bit simpler if the utype-value pairs are serialized into INFOs rather than PARAMs. Otherwise, it more or less reflects the changes proposed here. You can check out the document from svn at http://svn.ari.uni-heidelberg.de/svn/gavo/stcvotable/trunk/ (read-only). For commit privileges, contact me.

-- MarkusDemleitner - 21 Jan 2010

Revision, Draft 2

After some feedback from Arnold, I've prepared a second draft. Some contentious points remaining below.

The current draft is at http://vo.ari.uni-heidelberg.de/docs/note_stc.pdf (and the (ugly) sources are still in the svn mentioned above).

-- MarkusDemleitner - 17 Feb 2010

Issues left in Draft 2

STC container

Arnold's suggestion is to give the AstroCoordSystem and the AstroCoords groups a common parent. So, instead of having

<group utype="stc:AstroCoordSystem" id="sys1"/>
[other stuff]
<group utype="stc:AstroCoords" ref="sys1"/>

you would have

<group utype="???">
<group utype="stc:AstroCoordSystem" id="sys1"/>
<group utype="stc:AstroCoords" ref="sys1"/>
</group>

Still, constructs like

<group utype="???">
<group utype="stc:AstroCoordSystem" id="sys1"/>
</group>
[other stuff]
<group utype="???">
<group utype="stc:AstroCoords" ref="sys1"/>
</group>

would be allowed.

Comment Markus: I don't really like this -- It creates an additional element for no apparent benefit since you still need to resolve the references. If, on the other hand, we'd abandon referencing completely, it would definitely be worth it, but people may resent the idea of not being able to reuse coordinate system definitions within a VOTable (though probably only a small fraction of the existing VOTables would actually suffer from not being able to do so). So, from me: Either referencing or top-level STC container.

Comment Arnold: It keeps the STC stuff neatly together and if there ever is a need to add the observer's location, it can be done. It does make it easier to interpret the information in terms of an STC metadata object, which will come in handy when we finally have an STC library. And I don't particularly care for scattered metadata.

Comment

Inventing an additional parent element doesn't look necessary to me - especially since you are still allowed to scatter the information and put the AstroCoords and AstroCoordSystem information separately. I vote for the version without a parent.

-- KristinRiebe - 10 Mar 2010

Comment Arnold: I wasn't suggesting that scattering outside the containers be allowed, so Markus's second point does not hold water.

-- ArnoldRots - 16 Mar 2010

Comment Markus: So, Arnold: You are fine with dropping the id/ref mechanism and forbidding referencing of AstroCoordSystem groups across your container groups? I'd like that a lot, and in that case I'd propose to just flatly dump all utypes belonging to that coordinate system into one group. That would really make implementations a lot easier. What do the others think? -- 2010-03-31

Epoch

Should an epoch like B1950.0 be encoded as

<info utype="stc:AstroCoords.Position.Epoch" value="B1950.0"/>

or as

<info utype="stc:AstroCoords.Position.Epoch" value="1950.0"/>
<info utype="stc:AstroCoords.Position.Epoch.whatever" value="B"/>

Comment Arnold: Epoch is a number, not a string. If there were a limited number of values, one might consider representing them with an enumerated list of strings, but that is not the case. It is a foolish hack to represent a numeric value with a string parameter; this is properly a numeric quantity with an attribute that says whether it is Julian or Besselian.

Comment Markus: Splitting that perfectly understandable literal has negligible benefits at considerable cost. Plus, the votable schema already contains an appropriate type (astroYear). So, I can see no reason to double the amount of serialization and handling effort.

Comment

I agree with Markus - aren't Astronomical epochs practically always written with a leading character? So they should be defined as type astroYear and thus no confusion with numbers/strings can occur. Besides, it looks more concise and simpler to understand.

-- KristinRiebe - 10 Mar 2010

Comment Arnold: This is a weak argument - why don't we consider everything a string and do away with all other datatypes? The fact that VOTable has a, in my opinion, poor representation for epochs does not mean that all other standards need to use the same.

-- ArnoldRots - 16 Mar 2010

Comment Markus: We have types to denote the domains of variables and define the operators available for them. The domain of astronomical epochs has been represented by literals matching [JB][0-9]+(.[0-9]+) for quite some time now. By the way, even in the operators you see that astronomical epochs just aren't floats. If you say ~AstroCoords.Position.Epoch is a float, shouldn't I be perfectly entitled to just add them? And: What is the concrete utility of splitting the value? Does it reduce implementation effort? Will it help code correctness? --- 2010-03-31
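
For reference, the two encodings under discussion carry the same information, and converting the single literal into the split form is a few lines. The following is a sketch only: the regular expression is an assumed reading of the literal syntax (a J or B prefix followed by a decimal year), not taken from the VOTable schema, and parse_epoch is a hypothetical name.

```python
import re

# Assumed literal syntax: a J or B prefix followed by a decimal year.
EPOCH_RE = re.compile(r"^([JB])([0-9]+(?:\.[0-9]*)?)$")


def parse_epoch(literal):
    """Split an epoch literal such as 'B1950.0' into (year type, year)."""
    match = EPOCH_RE.match(literal)
    if match is None:
        raise ValueError("not an epoch literal: %r" % literal)
    return match.group(1), float(match.group(2))


print(parse_epoch("B1950.0"))
```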

Referencing

Should the referencing between AstroCoords groups and AstroCoordSystem groups be done using VOTable referencing, viz.,

<group utype="stc:AstroCoordSystem" id="sys1"/>
[other stuff]
<group utype="stc:AstroCoords" ref="sys1"/>

or rather using utypes, viz.,

<group utype="stc:AstroCoordSystem">
   <info utype="stc:AstroCoordSystem.id" value="sys1"/>
</group>
[other stuff]
<group utype="stc:AstroCoords">
   <info utype="stc:AstroCoords.coord_system_id" value="sys1"/>
</group>

Comment Arnold: If STC provides a referencing mechanism to tie its components together, that should be used to do so, not a VOTable mechanism. And if you encapsulate the whole thing in an STC container (see above), it is the natural, neatly self-contained way to do it.

Comment Markus: STC doesn't really provide a referencing mechanism. There's some mechanism in STC-X but, e.g., none in STC-S. We really, really should use native referencing. Referencing is messy to get right without additional complications of having two different identifier systems (e.g., you need to get referential integrity and uniqueness right, and you need to catch cases when they are violated, and you need to tell the user that something went wrong, etc). Self-containedness is nice, but not at the cost of doubling the implementation effort in a tricky spot. So: Since we're writing VOTables, we should be using VOTable's referencing.

Comment

I vote for the first version - since VOTable's referencing system is doing a good job here, more complications (even if they could achieve self-containedness) are not necessary.

-- KristinRiebe - 10 Mar 2010

Comment Arnold: With all due respect, Markus's argument is incorrect. In STC-S the referencing is defined (implied) in the syntax and the referencing is part of the standard. Since we are connecting STC elements together, we should be using STC's referencing mechanism.

-- ArnoldRots - 16 Mar 2010

Comment Markus: The argument about a significantly increased implementation effort is hardly incorrect -- you'll have to write all the code to handle the parallel referencing, won't you? But either way: Arnold, you still have not pointed out what the use of a second, parallel referencing is. Does it prevent errors? Will it help readers or writers? Does it decrease the file size? -- 2010-03-31
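
The integrity checks mentioned in this thread (unique identifiers, no dangling references) might be sketched as below for the VOTable-referencing variant. This assumes Python's standard xml.etree.ElementTree, a namespace-free fragment written with VOTable's upper-case GROUP and ID, and a hypothetical RESOURCE wrapper; the function name check_references is also hypothetical. A utype-based scheme would need an equivalent check written against the id utypes instead.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment; "sys2" is deliberately left undefined.
SAMPLE = """
<RESOURCE>
  <GROUP utype="stc:AstroCoordSystem" ID="sys1"/>
  <GROUP utype="stc:AstroCoords" ref="sys1"/>
  <GROUP utype="stc:AstroCoords" ref="sys2"/>
</RESOURCE>
"""


def check_references(root):
    """Return a list of problems with AstroCoords -> AstroCoordSystem refs."""
    problems = []
    ids = Counter(
        g.get("ID") for g in root.iter("GROUP")
        if g.get("utype") == "stc:AstroCoordSystem" and g.get("ID"))
    for ident, count in ids.items():
        if count > 1:
            problems.append("duplicate system ID: %s" % ident)
    for g in root.iter("GROUP"):
        if g.get("utype") == "stc:AstroCoords" and g.get("ref") not in ids:
            problems.append("dangling reference: %s" % g.get("ref"))
    return problems


problems = check_references(ET.fromstring(SAMPLE))
print(problems)
```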

Time utypes

Yet another thing I'd like to change is the representation of the encoding in the time utypes. As the standard is written now, there are utypes ...TimeInstant.ISOTime, ...TimeInstant.JDTime, and ...TimeInstant.MJDTime and analogously for some other such utypes.

That is bad for at least three reasons:

  1. clients have a hard time figuring out what column a time or a time error or whatever is (since they need to check at least three utypes)
  2. we have xtype in VOTable for this purpose, and having both utype and xtype specify something then requires some rules on what to do when the attributes contradict each other or if there is some interference between them
  3. data models should IMHO not be concerned with serializations
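
To illustrate the first point, a client that wants to find "the time" currently needs something like the following sketch. Only the utype tails named above are matched, since the leading part is elided in the text; the function names and the full utype in the usage line are hypothetical.

```python
# Suffixes named in the text; the leading part of the utype is elided there,
# so only the tail is matched here.
TIME_ENCODINGS = ("ISOTime", "JDTime", "MJDTime")


def time_encoding(utype):
    """Return the encoding suffix if utype is a TimeInstant utype, else None."""
    for encoding in TIME_ENCODINGS:
        if utype.endswith("TimeInstant." + encoding):
            return encoding
    return None


def jd_to_mjd(jd):
    """MJD is defined as JD - 2400000.5."""
    return jd - 2400000.5


# Hypothetical full utype, for illustration only.
print(time_encoding("stc:AstroCoords.Time.TimeInstant.MJDTime"))
```

If the encoding were carried by xtype alone, the loop over three suffixes would collapse into a single utype comparison.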

So -- what's to be done?

  1. Keep everything as is (ugly, but probably workable; declare clashes between utype and xtype as undefined)
  2. Elide ISO/JD/MJDTime by special rule (but that's yet another special rule, and you cannot build an STC-X tree from utypes any more)
  3. Change STC-X (might be used to clean up some other ugly spots we have here, but that's a major undertaking and would delay STC-in-VOTable significantly)
  4. ???

-- MarkusDemleitner - 12 Apr 2010

Topic revision: r1 - 2010-04-20 - MarkusDemleitner
 