Embedding STC in VOTable, Discussion Archive
This is an archive of things that were once on STCInVOTable. See there
for details. Please append newer things at the bottom.
Change 1: Reverse References
Instead of having utype and ref on FIELD, put groups into the AstroCoords group:
<GROUP ID="lltoush_coo" ref="lltoush"
utype="stc:AstroCoords">
<GROUP ref="alpha"
utype="stc:AstroCoords.Position2D.Value2.C1" />
<GROUP ref="rv"
utype="stc:AstroCoords.Redshift.Value" />
</GROUP>
(or use FIELDrefs that, I'm told, can now take utypes as well).
Rationale:
- Keep STC information confined to STC groups (helps libraries)
- Don't clobber utype and ref on FIELDs to preserve them for other, less generic purposes
Impact on Functionality:
As far as I can see, none. You need one AstroCoords group per
set of coordinates either way. -- MD
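To illustrate what a consumer would do with the reversed references, here is a minimal Python sketch that reads an AstroCoords group like the one above and builds a map from utype to the referenced column ID. The function name `coords_utype_map` is hypothetical, and the sketch assumes the children are un-namespaced elements carrying `ref` and `utype`; FIELDrefs would be handled the same way.

```python
import xml.etree.ElementTree as ET

# The AstroCoords group from the proposal, as a standalone fragment.
FRAGMENT = """
<GROUP ID="lltoush_coo" ref="lltoush" utype="stc:AstroCoords">
  <GROUP ref="alpha" utype="stc:AstroCoords.Position2D.Value2.C1" />
  <GROUP ref="rv" utype="stc:AstroCoords.Redshift.Value" />
</GROUP>
"""

def coords_utype_map(group):
    """Return (system_ref, {utype: column_id}) for one AstroCoords group."""
    # Only children that carry both a utype and a ref are of interest.
    mapping = {
        child.get("utype"): child.get("ref")
        for child in group
        if child.get("utype") and child.get("ref")
    }
    return group.get("ref"), mapping

system_ref, columns = coords_utype_map(ET.fromstring(FRAGMENT))
```

All STC knowledge stays inside the group in this layout; the FIELD elements themselves are never inspected.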
Comments
I agree with this proposed change. As a matter of fact, it is the way
STC was intended to function in VOTable (albeit as an imported schema, not through utypes). See the examples I post at the bottom of the page.
--
ArnoldRots - 01 Dec 2009
I agree and think that FIELDref-s SHOULD be used. It is the most logical way to add extra information about a field. The utype on the field is then freed up for pointing into other, possibly more semantically meaningful models, such as "is a position of a galaxy".
--
GerardLemson - 11 Mar 2010
Change 2: Flat systems
Just have all utype/value (belonging to one coordinate system definition)
params as direct children of the AstroCoordSystem group.
<GROUP ID="lltoush" utype="stc:AstroCoordSystem">
<PARAM arraysize="*" datatype="char" value="VELOCITY"
utype="stc:AstroCoordSystem.RedshiftFrame.value_type" />
<PARAM arraysize="*" datatype="char" value="ICRS"
utype="stc:AstroCoordSystem.SpaceFrame.CoordRefFrame" />
</GROUP>
Rationale
- Flat is better than nested (try python -c 'import this').
- The additional nesting adds no information, probably doesn't really help implementations or humans with parsing, and complicates writing.
Impact on Functionality
None that I can see. Were these groups meant as a service for humans? -- MD
Comment
I don't think this will work, except for the simplest tables. It does not allow for multiple coordinate systems, reusing coordinate systems, or using elements that contain AstroCoordSystem elements. See the CSC example that I will be posting at the bottom of the page.
--
ArnoldRots - 01 Dec 2009
Uh -- I notice I was not particularly clear. There is one group each for every
AstroCoordSystem, of course. I'm just suggesting to drop the subgroups
within (XFrame). For the CSC example, I can't see where that would fail, and
actually, it should not in any setting, by virtue of the data model requiring
zero or one of each XFrame and the frame name being a part of the utype already. -- MD
2009-12-02
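As a rough illustration of the flat layout, the following Python sketch collects the utype/value pairs of one AstroCoordSystem group into a dictionary. The function name `read_flat_system` is hypothetical. The duplicate check reflects the assumption stated above that each frame occurs at most once per system, so every utype is unique within its group.

```python
import xml.etree.ElementTree as ET

# The flat AstroCoordSystem group from the proposal.
FLAT_SYSTEM = """
<GROUP ID="lltoush" utype="stc:AstroCoordSystem">
  <PARAM arraysize="*" datatype="char" value="VELOCITY"
    utype="stc:AstroCoordSystem.RedshiftFrame.value_type" />
  <PARAM arraysize="*" datatype="char" value="ICRS"
    utype="stc:AstroCoordSystem.SpaceFrame.CoordRefFrame" />
</GROUP>
"""

def read_flat_system(group):
    """Collect the utype/value pairs of one flat AstroCoordSystem group."""
    pairs = {}
    for param in group.findall("PARAM"):
        utype = param.get("utype")
        if utype in pairs:
            # The flat layout relies on each utype occurring at most once.
            raise ValueError(f"duplicate utype {utype!r}")
        pairs[utype] = param.get("value")
    return group.get("ID"), pairs

system_id, system = read_flat_system(ET.fromstring(FLAT_SYSTEM))
```

With nested frame subgroups, the same reader would need one extra level of traversal but would end up with the same dictionary, since the frame name is already part of each utype.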
Change 3a: Do not abuse xml namespace declarations
Don't pretend the stc: in the utype has anything to do with an XML namespace.
So, strike the xmlns:stc declaration on VOTABLE:
<VOTABLE version="1.2" xmlns:xsi="http://www.w3.org/2
xmlns="http://www.ivoa.net/xml/VOTable/v1.2">
Rationale
- While syntactically legal, declaring namespaces that are not used within the document is a dangerous practice -- XML tools can and do discard these. Also, the stc in the namespace declaration has, from an XML point of view, nothing to do with the stc in the utype attribute value, since that value is not declared to hold a QName.
- The "package name" is supposed to be the fixed thing (to keep utypes opaque). This is incompatible with XML namespaces.
Impact on Functionality
The VOTable has no way to define which version of the STC data model the
utypes refer to. I would say this is desirable since versioned meanings
will lead to hell either way, but see Change 3b for a fix. -- MD
Comment
I have no(t yet an) opinion on this. It does sound reasonable.
--
ArnoldRots - 01 Dec 2009
Change 3b: Define DM Version using UCDs
In every AstroCoordSystem group, declare what version of the DM you are
using. We may make that optional or a strong recommendation or something like
that.
The version of the AstroCoords group would be implied via its ref.
<GROUP ID="lltoush" utype="stc:AstroCoordSystem">
<PARAM utype="stc:" value="http://.../stc-v1.30#"/>
<PARAM arraysize="*" datatype="char" value="VELOCITY"
utype="stc:AstroCoordSystem.RedshiftFrame.value_type" />
</GROUP>
Rationale
- This provides a link to the exact data model used to define the utypes used.
- Some mechanism like this will be employed by the utype group.
Impact on Functionality
- We/someone should maintain explanations for all the utypes at the URLs resulting from glueing together the model URI and the de-packaged utype.
Comment
Sounds reasonable; but you need a name and a datatype as well.
--
ArnoldRots - 01 Dec 2009
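The "gluing" of model URI and de-packaged utype mentioned under Impact on Functionality could look roughly like the Python sketch below. The function name `utype_doc_url` is hypothetical, and so is the model URI: the discussion only gives an elided `http://.../stc-v1.30#`, so an example.org address stands in for it here.

```python
def utype_doc_url(model_uri, utype):
    """Glue a model URI to a utype whose package prefix has been removed."""
    _package, sep, path = utype.partition(":")
    if not sep or not path:
        raise ValueError(f"utype {utype!r} has no package prefix or no path")
    return model_uri + path

# Hypothetical model URI; the real one is not given in the discussion.
MODEL_URI = "http://example.org/stc-v1.30#"
url = utype_doc_url(MODEL_URI, "stc:AstroCoordSystem.RedshiftFrame.value_type")
```

Note that the bare `stc:` utype of the version PARAM itself has an empty path and is rejected by this sketch; it only identifies the model.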
Change 4: Only allow string values
Define that all STC PARAMs are datatype="char" arraysize="*".
Rationale
- As far as I can see, there are no very reliable serialization rules for param values in VOTable anyway -- MD
- Provides the easiest way to unambiguously define the utype serialization by pointing to the STC-X schema. -- MD
- Without this, libraries have to keep a mapping from "known utypes" to their types. This is not hard, but not very nice either. We'd have to derive the type/serialization rules from STC-X either way. -- MD
Impact on Functionality
- It's much easier to pass STC info through correctly, e.g., if a tool only understands a subset of STC.
- For tools knowing a certain utype, probably none; they'll have some custom way of de-/serializing their internal values anyway. -- MD
Comment
I haven't thought about the repercussions of this, yet. On the face of it, it sounds not unreasonable, but on the other hand, since the data type has to be given as a parameter, I don't see allowing more data types as much of a complication. I only wish that PARAM were more reasonable in the types it allows - particularly 'string' would be useful - and, of course, ISO-8601.
--
ArnoldRots - 01 Dec 2009
CSC Cone Search Examples
As it so happened, I had recently prepared the STC-specific stuff for a VOTable 1.1/1.2 that presents data returned by a simple cone search query to the Chandra Source Catalog. Then I modified that one to comply with Changes 1 and 3 above. There is nothing like a real-life example to bring out the problems.
I think it shows what is problematic about Change 2. Here are the Version 1.1 and the MD-modified Version of the example.
--
ArnoldRots - 01 Dec 2009
Revision, Draft 1
I have prepared a revision of the note, and while doing so I realized that
things become quite a bit simpler if the utype-value pairs are serialized into
INFOs rather than PARAMs. Otherwise, it more or less reflects the changes
proposed here. You can check out the document from svn at
http://svn.ari.uni-heidelberg.de/svn/gavo/stcvotable/trunk/ (read-only). For commit privileges, contact me.
--
MarkusDemleitner - 21 Jan 2010
Revision, Draft 2
After some feedback from Arnold, I've prepared a second draft. Some contentious points remaining below.
The current draft is at
http://vo.ari.uni-heidelberg.de/docs/note_stc.pdf (and the (ugly) sources are still in the svn mentioned above).
--
MarkusDemleitner - 17 Feb 2010
Issues left in Draft 2
STC container
Arnold's suggestion is to give the AstroCoordSystem and the AstroCoords groups a common parent. So, instead of having
<group utype="stc:AstroCoordSystem" id="sys1"/>
[other stuff]
<group utype="stc:AstroCoords" ref="sys1"/>
you would have
<group utype="???">
<group utype="stc:AstroCoordSystem" id="sys1"/>
<group utype="stc:AstroCoords" ref="sys1"/>
</group>
Still, constructs like
<group utype="???">
<group utype="stc:AstroCoordSystem" id="sys1"/>
</group>
[other stuff]
<group utype="???">
<group utype="stc:AstroCoords" ref="sys1"/>
</group>
would be allowed.
Comment Markus: I don't really like this -- It creates an additional element for
no apparent benefit since you still need to resolve the references. If, on the
other hand, we'd abandon referencing completely, it would definitely be worth
it, but people may resent the idea of not being able to reuse coordinate system
definitions within a VOTable (though probably only a small fraction of the
existing VOTables would actually suffer from not being able to do so). So,
from me:
Either referencing or top-level STC container.
Comment Arnold: It keeps the STC stuff neatly together and if there ever is a
need to add the observer's location, it can be done. It does make it
easier to interpret the information in terms of an STC metadata object, which
will come in handy when we finally have an STC library. And I don't
particularly care for scattered metadata.
Comment
Inventing an additional parent element doesn't look necessary to me - especially since you are still allowed to scatter the information and put the AstroCoords and AstroCoordSystem information separately. I vote for the version without a parent.
--
KristinRiebe - 10 Mar 2010
Comment Arnold: I wasn't suggesting that scattering outside the containers be allowed, so Markus's second point does not hold water.
--
ArnoldRots - 16 Mar 2010
Comment Markus: So, Arnold: You are fine with dropping the id/ref mechanism
and forbidding referencing of AstroCoordSystem groups across your container
groups? I'd like that a lot, and in that case I'd propose to
just flatly dump all utypes belonging to that coordinate system
into one group. That would really make implementations a lot easier.
What do the others think? -- 2010-03-31
Epoch
Should an epoch like B1950.0 be encoded as
<info utype="stc:AstroCoords.Position.Epoch" value="B1950.0"/>
or as
<info utype="stc:AstroCoords.Position.Epoch" value="1950.0"/>
<info utype="stc:AstroCoords.Position.Epoch.whatever" value="B"/>
Comment Arnold: Epoch is a number, not a string. If there were a limited
number of values, one might consider representing them with an enumerated list
of strings, but that is not the case. It is a foolish hack to represent a
numeric value with a string parameter; this is properly a numeric quantity with
an attribute that says whether it is Julian or Besselian.
Comment Markus: Splitting that perfectly understandable literal has negligible
benefits at considerable cost. Plus, the VOTable schema already contains an
appropriate type (astroYear). So, I can see no reason to double the amount
of serialization and handling effort.
Comment
I agree with Markus - aren't astronomical epochs practically always written with a leading character? So they should be defined as type astroYear and thus no confusion with numbers/strings can occur. Besides, it looks more concise and simpler to understand.
--
KristinRiebe - 10 Mar 2010
Comment Arnold: This is a weak argument - why don't we consider everything a string and do away with all other datatypes? The fact that VOTable has a, in my opinion, poor representation for epochs does not mean that all other standards need to use the same.
--
ArnoldRots - 16 Mar 2010
Comment Markus: We have types to denote the domains of variables and
define the operators available for them. The domain of astronomical epochs
has been represented by literals matching [JB][0-9]+(.[0-9]+) for quite some time now.
By the way, even in the operators you see that astronomical epochs just
aren't floats. If you say AstroCoords.Position.Epoch is a float, shouldn't I
be perfectly entitled to just add them? And: What is the concrete
utility of splitting the value? Does it
reduce implementation effort? Will it help code correctness? --- 2010-03-31
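Whichever encoding is chosen, a reader ends up needing a (system, year) pair. The Python sketch below shows how the single-literal form could be split on reading. The function name `parse_epoch` is hypothetical, and the regular expression is an assumption: it follows the pattern quoted in the discussion, with the dot escaped and the fractional part made optional, and is not claimed to be the VOTable schema's exact astroYear definition.

```python
import re

# Assumed pattern: one of J or B, digits, then an optional fractional part.
EPOCH_RE = re.compile(r"([JB])([0-9]+(?:\.[0-9]+)?)")

def parse_epoch(literal):
    """Split an epoch literal such as 'B1950.0' into (system, year)."""
    match = EPOCH_RE.fullmatch(literal)
    if match is None:
        raise ValueError(f"not an epoch literal: {literal!r}")
    return match.group(1), float(match.group(2))

system, year = parse_epoch("B1950.0")
```

With the split encoding, the same pair would instead be assembled from two separate info elements, which means two lookups and a consistency check rather than one regular expression.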
Referencing
Should the referencing between AstroCoords groups and AstroCoordSystem
groups be done using VOTable referencing, viz.,
<group utype="stc:AstroCoordSystem" id="sys1"/>
[other stuff]
<group utype="stc:AstroCoords" ref="sys1"/>
or rather using utypes, viz.,
<group utype="stc:AstroCoordSystem">
<info utype="stc:AstroCoordSystem.id" value="sys1"/>
</group>
[other stuff]
<group utype="stc:AstroCoords">
<info utype="stc:AstroCoords.coord_system_id" value="sys1"/>
</group>
Comment Arnold: If STC provides a referencing mechanism to tie its components
together, that should be used to do so, not a VOTable mechanism. And if you
encapsulate the whole thing in an STC container (see above), it is the natural,
neatly self-contained way to do it.
Comment Markus: STC doesn't really provide a referencing mechanism. There's
some mechanism in STC-X but, e.g., none in STC-S. We really, really should use
native referencing. Referencing is messy to get right without additional
complications of having two different identifier systems (e.g., you need to get
referential integrity and uniqueness right, and you need to catch cases when
they are violated, and you need to tell the user that something went wrong,
etc). Self-containedness is nice, but not at the cost of doubling the
implementation effort in a tricky spot. So: Since we're writing VOTables, we
should be using VOTable's referencing.
Comment
I vote for the first version - since VOTable's referencing system is doing a good job here, more complications (even if they could achieve self-containedness) are not necessary.
--
KristinRiebe - 10 Mar 2010
Comment Arnold: With all due respect, Markus's argument is incorrect. In STC-S the referencing is defined (implied) in the syntax and the referencing is part of the standard. Since we are connecting STC elements together, we should be using STC's referencing mechanism.
--
ArnoldRots - 16 Mar 2010
Comment Markus: The argument about a significantly increased implementation
effort is hardly incorrect -- you'll have to write all the code to handle
the parallel referencing, won't you? But either way: Arnold, you still
have not pointed out what the use of a second, parallel referencing is.
Does it prevent errors? Will it help readers or writers? Does it
decrease the file size? -- 2010-03-31
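The checks named in this thread (uniqueness of identifiers and referential integrity) can be sketched in Python for the id/ref variant as follows. The function name `check_references` and the wrapping `resource` element are hypothetical, and the lowercase element and attribute names simply follow the snippets above rather than the VOTable schema.

```python
import xml.etree.ElementTree as ET

# One system, one valid reference, and one dangling reference.
DOC = """
<resource>
  <group utype="stc:AstroCoordSystem" id="sys1"/>
  <group utype="stc:AstroCoords" ref="sys1"/>
  <group utype="stc:AstroCoords" ref="sys2"/>
</resource>
"""

def check_references(root):
    """Return (duplicate_ids, dangling_refs) for the STC groups under root."""
    seen, duplicates = set(), []
    for group in root.iter("group"):
        if group.get("utype") != "stc:AstroCoordSystem":
            continue
        ident = group.get("id")
        if ident in seen:
            duplicates.append(ident)
        seen.add(ident)
    # A reference is dangling if no AstroCoordSystem group declares its target.
    dangling = [
        group.get("ref")
        for group in root.iter("group")
        if group.get("utype") == "stc:AstroCoords"
        and group.get("ref") not in seen
    ]
    return duplicates, dangling

duplicates, dangling = check_references(ET.fromstring(DOC))
```

A utype-based variant would need the same two checks, with the identifiers read from info children instead of attributes; if both mechanisms were present, the code would additionally have to verify that they agree.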
Time utypes
Yet another thing I'd like to change is the representation of the encoding
in the time utypes. As the standard is written now, there are utypes
...TimeInstant.ISOTime, ...TimeInstant.JDTime, and ...TimeInstant.MJDTime
and analogously for some other such utypes.
That is bad for at least three reasons:
- clients have a hard time figuring out what column a time or a time error or whatever is (since they need to check at least three utypes)
- we have xtype in VOTable for this purpose, and having both utype and xtype specify something then requires some rules on what to do when the attributes contradict each other or if there is some interference between them
- data models should IMHO not be concerned with serializations
So -- what's to be done?
- Keep everything as is (ugly, but probably workable; declare clashes between utype and xtype as undefined)
- Elide ISO/JD/MJDTime by special rule (but that's yet another special rule, and you cannot build an STC-X tree from utypes any more)
- Change STC-X (might be used to clean up some other ugly spots we have here, but that's a major undertaking and would delay STC-in-VOTable significantly)
- ???
--
MarkusDemleitner - 12 Apr 2010
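To make the first objection concrete, here is a small Python sketch of what a client has to do today to recognize a time column: it must test for each of the three encoding-specific leaves. The function name `time_encoding` is hypothetical, and so is the utype prefix in the example call, since the discussion elides everything before TimeInstant.

```python
# The three encoding-specific leaves named in the discussion.
TIME_LEAVES = ("ISOTime", "JDTime", "MJDTime")

def time_encoding(utype):
    """Return the encoding leaf if utype ends in TimeInstant.<leaf>, else None."""
    head, _, leaf = utype.rpartition(".")
    # The element just before the leaf must be TimeInstant itself.
    if leaf in TIME_LEAVES and head.split(".")[-1] == "TimeInstant":
        return leaf
    return None

# Hypothetical utype prefix, for illustration only.
encoding = time_encoding("stc:AstroCoords.Time.TimeInstant.MJDTime")
```

If the encoding were carried by xtype instead, the client could match a single utype and read the serialization from the xtype attribute.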