Spectral v2.0: Proposed Recommendation: Request for Comments
This page contains public discussion of the Spectral 2.0 Proposed Recommendation; latest version
For the page containing round 2 pass through RFC review see:
Reference Interoperable Implementations
Spectral 2.0 has been implemented at:
Comments from the IVOA Community during RFC period: 13 May 2014 - 31 July 2014
In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment. Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document.
Comments from TCG member during the TCG Review Period: 1 August 2014 - 15 September 2014
WG chairs or vice chairs must read the Document, provide comments if any and formally indicate if they approve or not the Standard. IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.
TCG Chair & Vice Chair ( Séverin Gaudet, Matthew Graham )
2 comments:
-- SeverinGaudet - 2014-10-11
Applications Working Group ( Pierre Fernique, Tom Donaldson )
Approved -- PierreFernique - 2015-02-09
Data Access Layer Working Group ( François Bonnarel, Marco Molinaro )
Approved -- MarcoMolinaro - 2014-09-15
Data Model Working Group ( Jesus Salgado, Omar Laurino )
Approved -- JesusSalgado - 2014-09-10
Grid & Web Services Working Group ( André Schaaff, Andreas Wicenec )
Approved -- AndreSchaaff - 2014-09-30
Registry Working Group ( Markus Demleitner, Pierre Le Sidaner )
First off, apologies for coming with this so late. Many of the following comments I should have made in public RFC as an implementor; alas, I was busy elsewhere. I certainly don't want to abuse my position as TCG member to push them in; the points marked with (*), however, are I believe appropriate for a TCG review. Disregard the others as you see fit, except if the document makes another round through public RFC (which I, frankly, would prefer).
My main concern is: Implementations. The RFC page lists speclib on the client side and TSAP on the server side -- as the model is so large, I'd like to get an idea what part of the standard these actually use/support, and in what sense interoperability has been shown. I am a bit surprised that TSAP is a reference implementation, when 7.1.2 requires a 'SpatialAxis'. On the mailing list, the rationale for this was that the model was meant for observational spectra. As a baseline, I think there should be at least example serialisations using all features provided by the DM. (*)
Are there any other experiences with the data model and, in particular, its serialisations? [With my GAVO hat on: I'm offering an attempt at a partial server-side implementation until the end of the year]. Also, is there any plan to make a validator for at least the VOTable serialisation? If speclib really reads a significant portion of this, couldn't that be used to make some sort of validator? (*)
If I remember correctly, one reason this was considered relatively urgent was that it was hoped it would show how to serialise time series, in line with our current time domain priority. Now, though time series are hinted at in some places, they are not really worked out as far as I can tell -- wouldn't it be economical to add the few pages here, next to the two other concrete types (Spectrum and Photometry Point)? It'd probably be faster than starting another effort in another document.
Other general remarks:
On to more specific issues, by section header:
MCD: I'm not sure if this is the right way to respond, but I'll make comments in-line (in green), especially the (*) items. As I mentioned in Banff, there was never an attempt or requirement to 'correct' the descriptions of the metadata over the earlier Spectral document(s), though I did some cleanup and comparison against obscore at the time. The Dataset Metadata document for the Cube work has had a much more thorough scrubbing and cleanup. Many of these comments are very helpful for that effort, but I'm not sure how much time and effort should be put into this document given that: 1) the content is entirely consistent with previous versions 2) it will be revised after the cube work for consistency with that model framework.
MCD-20150206: The document will certainly be up for revision post-Cube; how long that will be, I can't say. The consensus was that this version provided enough benefit and interest from the community to move forward.
1.3 Use cases
The section is empty -- while that's regrettable, at least the headline should be removed. (*)
MCD: I will restore the use case section from the previous doc, and see if new cases should be added.
MCD-20150206: Done
2.1.2 'Dataset.DataProductSubtype'
I can see the motivation of leaving the vocabulary here open -- but without further indication of the intended use, if only by example, it's hard to figure out what one could possibly put here.
MCD-20150206: Done - enhanced description, and added examples
2.1.3 'Dataset.CalibLevel'
0 here appears to mean "not in a standard format" -- in what circumstances could that turn up? If a dataset has a way to communicate this, isn't it in a standard format already? It'd also help if the text provided indication of (let's say, an example or two for) what 3 ("Enhanced") is intended to mean.
MCD-20150206: Done - enhanced description, and added examples
2.4.3 'Curation.PublisherDID': String
I'm not quite happy with "may be an internal ID" here. Common understanding, I believe, has been that PubDIDs should be VO global and IVORNs. I'd much rather have:
The 'PublisherDID' is a VO-global identifier of the dataset as assigned by the publisher. The recommended form is ?, e.g., ivo://example.net/imageservice?2013/5/2342. Other schemes, for instance using the authority ID as basis, are allowed, too. Note that the part in front of the question mark must be resolvable in the VO Registry.
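As a rough illustration of the proposed form, the sketch below splits an identifier at the first question mark into the part that would have to resolve in the Registry and the publisher-local part. The function name `split_pubdid` is hypothetical, and the only check made is for the `ivo://` scheme; this is not a validator for IVOA identifiers.

```python
def split_pubdid(pubdid):
    """Split a dataset identifier into (registry_part, local_part).

    Assumption: everything before the first '?' is the part expected to
    resolve in the VO Registry, everything after it is publisher-local.
    local_part is None when the identifier contains no '?'.
    """
    if not pubdid.lower().startswith("ivo://"):
        raise ValueError("not an ivo:// identifier: %r" % pubdid)
    registry_part, sep, local_part = pubdid.partition("?")
    return registry_part, (local_part if sep else None)


# The example identifier is the one quoted in the proposed text.
registry_part, local_part = split_pubdid(
    "ivo://example.net/imageservice?2013/5/2342")
print(registry_part)
print(local_part)
```

A client could hand the first element to a Registry lookup and treat the second as opaque.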
After I've proceeded to 2.6., I'm now doubtful of this: Isn't this what 'DataId'.datasetID is? IMHO there should be some explanation on the relation here. (*)
MCD: I will see what I can do to help clarify the distinction.
MCD-20150206: Done - enhanced description, not quite as specifically as your text, but clearly states the values must be a valid IVOA identifier.
2.4.7 Curation.References: String
I'm fairly unhappy with keeping this so generic. The way this is written, people will dump any old string in there, glueing together different references with characters an implementation has no way of figuring out. Couldn't we write (*):
2.4.7 Curation.Reference: String [Singular!]
One or more bibliographic references associated with the dataset. Applications might use these to suggest what works to reference when a dataset is used. To allow for automatic processing, values should be either bibcodes (discernible to the client as 19-character strings beginning with four digits) or DOIs (discernible to the client by their prefix "doi:"). Freetext references are allowed but discouraged.
The containing element can occur multiple times. Do not combine multiple references into one value.
MCD: Perhaps the notation isn't clearly conveyed. The convention in this doc (and the Cube) is for attributes with multiplicity >1 to be plural. If you consider the attribute, it holds the 'references'. Each instance is a singular reference which could be described as you suggest. So, the convention used is in question, and if changed, would be done across the board. Another option would be to show the type as "String[]".
MCD-20150206: Done - The attribute name(s) are now singular, and the text states that the multiplicity is zero or more. The description rather speaks to the serialization, but hopefully clarifies things.
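The client-side distinction described in the proposed wording for Curation.Reference could look roughly like the following sketch. It applies only the two heuristics named there (19 characters beginning with four digits for a bibcode, the prefix "doi:" for a DOI). The function name, the case-insensitive prefix test and the sample values are assumptions made for illustration; the sample bibcode and DOI are made-up placeholders, not real references.

```python
import re

# Heuristic from the proposed text: four digits followed by 15 more
# characters, 19 characters in total.
_BIBCODE_RE = re.compile(r"\d{4}.{15}")


def classify_reference(value):
    """Return 'doi', 'bibcode' or 'freetext' for a single reference value."""
    value = value.strip()
    # Assumption: the "doi:" prefix is matched without regard to case.
    if value.lower().startswith("doi:"):
        return "doi"
    if _BIBCODE_RE.fullmatch(value):
        return "bibcode"
    return "freetext"


# One value per element, as the proposed text asks; all values are placeholders.
for ref in ["2014A&A...123..456X", "doi:10.1000/xyz123", "Smith et al., in prep."]:
    print(ref, "->", classify_reference(ref))
```

Because each value is classified on its own, the sketch relies on the rule that multiple references are never combined into one string.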
2.6.
There's an "IVAO" (rather than "IVOA") here. (*)
MCD: will fix.
MCD-20150206: Done
2.6.3 'DataID'.Collections
Again, I'd suggest to make this singular and make clear this element may be repeated. Alternatively, we need clear rules how different entities would be separated. (*)
MCD: Again, the attribute isn't singular, it is a collection/list of things (in this case Strings), each representing a particular Collection. I can see about clarifying the structure, but your concern is more along the lines of "how do I serialize array parameters in the VOTable", which is a different concern.
MCD-20150206: Done - similar change as with reference. Added this element to votable serialization example as well.
2.6.4 'DataID.DatasetID': URI
The relation to 'Curation.PublisherDID' should be clarified. Also, I'm not sure I'm a big fan of the text on journal-based URIs. I'd much rather have here the text proposed above on PublisherDID.
MCD: will update.
MCD-20150206: Done
2.6.5 'DataID.CreatorDID': URI
Again, I'd like to see a text similar to the one on PublisherDIDs here, except that it should be made clear that the base IVORN would be the one of the creator.
MCD: will enhance the description.
MCD-20150206: Done
2.6.7 DataID.Version
There should be an explanation for how this relates to Curation.Version.
MCD-20150206: Not Done - I'm not sure what the distinction is.
2.6.11 DataID.ObservationID: String
If this is intended to actually be an "internal" id, are there any expectations on the semantics? An example might help.
MCD: I don't believe there are any expectations/restrictions on the semantics.
2.7 DataModel
I think it would help understanding if it were mentioned here that concrete values are bound in sections 7 and 8. There, I'm confused that Prefix and Type are optional. Are there guidelines when they can/should be left out or be included? If not, I'd suggest to drop them entirely -- if an application cannot rely on them in the first place and they're not necessary for some specific task, why bother at all?
MCD: I'll look at the language. This bit has migrated quite a lot during the Cube model discussions, so is a little tricky. In the new Cube docs, this object doesn't even exist. The name MUST be present, and MUST match that specified by the particular model. The prefix MAY be specified; if not given, the default value MUST be used within the serialization. I think that's how it went. User-defined content would provide both to indicate their elements.
MCD-20150206: Done - descriptions updated
2.8.1 Derived.SNR: Double
Either provide an embedded hyperlink or say "can be obtained from [7]" at the end of the first paragraph.
MCD-20150206: Done - embedded the hyperlink.
2.9.1 ObservingElements
Typo: ObservingElments (*)
MCD: will fix
MCD-20150206: Done
2.9.4 DataSource.Name: String
Is this SSAP's DataSource? If so, can we harmonise this type with what's in the SSA registry extension for that?
MCD: OK
MCD-20150206: Done
2.11.1 Target.Name: String
I'd like to see some prose in here like "If at all possible, this object name should resolve in the domain-specific resolution services, e.g., SIMBAD or NED".
MCD: the cube model description does that. I'll adopt it here.
MCD-20150206: Done - updated description.
2.11.4 Target.Class: String
I think we can't really write things like "an initial deployment of the VO would" in 2014. If we can't agree on a closed vocabulary here, let's at least put in some representative recommended terms. If we can't get any better, then let's at least put in the text for the equivalent field in obscore. (*)
MCD: In the cube work I have: "General classification of the target. This field supports the discovery of data pertaining to a common class, e.g. 'Star', 'Galaxy', 'AGN'. At the time of this writing, there is no IVOA recommended vocabulary for this field. The SIMBAD and NED databases use defined vocabularies for astronomical object classifications which may serve as the basis for such."
MCD-20150206: Done - updated description
3.1.2 SpectralSI: String (and following)
Since we now have VOUnits, is it really a good idea not to use it here? (*)
MCD: I don't see the conflict. These are an alternative/generalized unit representation of the VOUnit strings.
3.4 CoordSys
I'm concerned about the duplication of responsibilities between this and CoordSys. As I believe in general embedding is preferable to referencing in VOTables and I see little to gain by "normalising" this (so items with common systems can reference instead of repeat): Can't we just agree on sticking this information in CoordSys all the way through? It's as expressive, and it would remove optional elements and, best of all, another source for potentially conflicting information.
MCD: I'm afraid I don't understand. Which two CoordSys are you referring to?
I wanted to keep the serialization examples lean, since they really shouldn't be included in the model doc. A full serialization of the model would basically be a reference implementation, and external to the doc, either as a Note or in some location where we keep reference implementations of our standards.
3.6 CorrectionItem
For interoperability, I think this should include strict rules on what clients are supposed to do when they encounter CorrectionItems they don't understand (at least if they're marked as "not applied").
MCD-20150206: The requirements about what to do with elements they don't understand would vary depending on what type of client it is. A cutout service may just pass it along, where an analysis tool would need to handle it. So, I don't think that decision is appropriate for the model doc.
3.9.6 DataAxis.unit: String
I'd like to see a "MUST conform to VOUnits" here. (*)
MCD: I can add that (to each 'unit' element). Section 6.2 does specifically state that the model requires compliance with VOUnit-1.0, but it may be worth repeating.
MCD-20150206: Done
3.13 SpectralResolution
Is there actually a compelling reason to keep both ResolPower.refVal and Resolution, in particular since, as stated in the text, they can be fairly trivially transformed into each other? Having two spots for essentially the same thing is an implementation liability at the very least, and I'd argue for a 2.0 version "backwards compatibility" is not a terribly strong reason. And even "obscore compatibility" doesn't convince me. It would certainly be nice if our data models had more consistency, but, at least until VO-DML is ready, it seems we have to choose between intra-model consistency (one place, one form for an item) and inter-model consistency. I, for one, would go for the former any day.
MCD-20150206: Again, I agree that trimming the fat would be good, but it was outside the scope of this revision. I wouldn't want to remove things without a review of the consequences.
4.4 Coverage
Here, the document structure appears to have gone funny -- there are first three empty subsections, then three sections that appear to flesh out these subsections. (*)
MCD: The structure is consistent, but the content is weak. Coverage (4.4) consists of 3 elements: Location (4.4.1), Bounds (4.4.2), Support (4.4.3). In those sections there should be content describing their use in that context (as elements of Coverage), and I didn't have anything specific to say. They each then have sections (4.5, 4.6, 4.7 respectively) defining the elements themselves. The wonky part of the section structure is that the sections are not in alphabetical order as they are in the other sections; it seemed more confusing to do that.
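The remark on 3.13 that ResolPower.refVal and Resolution can be transformed into each other can be made concrete with a small sketch. It assumes the usual definition of resolving power, R = x / dx, with x the reference spectral coordinate and dx the resolution (FWHM) in the same unit; whether this matches the document's exact definitions should be checked against the text. The function names are hypothetical.

```python
def resolving_power(ref_coord, resolution):
    """Resolving power R = x / dx, for x and dx given in the same unit."""
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    return ref_coord / resolution


def resolution_from_power(ref_coord, power):
    """Inverse relation: dx = x / R."""
    if power <= 0:
        raise ValueError("resolving power must be positive")
    return ref_coord / power


# Illustrative numbers only: a reference wavelength of 5000 (any unit)
# and a resolution of 2.5 in the same unit.
power = resolving_power(5000.0, 2.5)
print(power)
print(resolution_from_power(5000.0, power))
```

Both directions need the reference coordinate, so a serialization carrying only one of the two values still has to state where on the spectral axis it applies.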
5.3.2 DopplerDefinition: Enum
"Comparisons to these values should not be case sensitive." -- does case-insensitivity help here a lot? In my implementation practice, I've always found case folding to be a noticeable burden and source of errors, while I usually fail to understand how it could be useful.
5.6.3 TimeFrame.Zero: Double
I'm not sure I understand what this is intended to do. That may be me, but I'd read this as allowing some global shift in all times in a document, and that I'd find at least in clear need of a strong justification.
6.1.3 UTypes
"These labels are used as synonyms for the CharacterisationAxis portion of the relevant UType,..." -- they certainly are no "synonyms", right? Maybe it should say "specializations" or something like that here? (*)
MCD: Will change the wording.
MCD-20150206: Done
6.4.2.6 Position
The "unit" item here again appears to insinuate multiple units might end up in one string. It would certainly be useful if this said how these would be separated (or otherwise distributed to the fields in question).
MCD: In this doc, the Position object has a singular unit field which contains the unit string applicable to all Cn attributes of the sub-types (Position1, Position2). Basically, it requires all contained values to be given in the same unit.
MCD-20150206: Done - enhanced the description
6.4.3.1 stdRefPosition
I'll not tire to point out that having this large list of potential reference positions makes it unlikely that any implementation will ever support even a large part of them, which at best may lead to interoperability problems. Can't we say people are supposed to support a small subset of these? There's always TOPOCENTER for Pluto-orbiting observatories. Of course, the topoi in there are to be described somewhere else, but that we'd really need anyway. I'd propose a similar reasoning for stdSpaceRefFrame.
MCD: I think the document should have the comprehensive list (well, actually, stc should). The applications would define which are required; for example, the SSAP protocol could specify that a service need only handle (blah. blah. blah) to be IVOA compliant.
Since we're making a major release anyway: Can't we just drop ABSOLUTE?
MCD: I have no strong opinion on this. Char-1.13 doesn't have it, so removing it would make this more consistent. Unless there is strong objection, I will remove it.
MCD-20150206: Done - no one objected
C.1.1 Basic Spectrum Instance
As said above a couple of times, I think this would be a good place to say how sequence- or array-like items are to be serialised.
MCD: yes it would. The serializations are 'minimal' and none of those items are required, but it would be useful.
MCD-20150206: Done - I added several items to the Spectrum VOTable example. "an example of the various datatypes ( string, double,
Also, there's this in the instance:
<GROUP name="Data">
<FIELDref ref="DataFluxValue"/>
<FIELDref ref="DataSpectralValue"/>
</GROUP>
With old-style utypes I'd argue that makes no real sense. The utypes on the FIELDs are enough. If you keep it in, you'd at least have to explain how this is intended to be used and whether that's mandatory or not. (*)
MCD: Are you referring to my serializing the Data group as containing the 2 FIELDref specs rather than let the UType on the field elements show that they are part of the Data element? If so, this is true for all of the GROUPs. The first paragraph in C.1 states, "We use the VOTable GROUPS construct to aid readability. It is not a requirement for users to make use of this construct for all elements of the model." One could repeat the utype in the FIELDref, but that wouldn't (I think) aid readability.
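To make the GROUP/FIELDref question concrete, here is a minimal sketch of how a client might resolve the FIELDrefs of a GROUP to the utypes carried by the referenced FIELDs. The GROUP is the one quoted above; the surrounding TABLE, the FIELD declarations and the utype strings are assumptions added for illustration and are not taken from the document's example. Real VOTables also carry an XML namespace, which this sketch leaves out.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: only the GROUP is quoted from the instance under
# discussion; the FIELDs and their utypes are illustrative assumptions.
VOTABLE_FRAGMENT = """
<TABLE>
  <GROUP name="Data">
    <FIELDref ref="DataFluxValue"/>
    <FIELDref ref="DataSpectralValue"/>
  </GROUP>
  <FIELD ID="DataFluxValue" name="flux" datatype="double"
         utype="spec:Data.FluxAxis.Value"/>
  <FIELD ID="DataSpectralValue" name="wave" datatype="double"
         utype="spec:Data.SpectralAxis.Value"/>
</TABLE>
"""


def utypes_by_group(table):
    """Map each GROUP name to the utypes of the FIELDs it references."""
    fields = {f.get("ID"): f for f in table.iter("FIELD")}
    result = {}
    for group in table.iter("GROUP"):
        refs = [r.get("ref") for r in group.findall("FIELDref")]
        result[group.get("name")] = [fields[ref].get("utype") for ref in refs]
    return result


table = ET.fromstring(VOTABLE_FRAGMENT)
print(utypes_by_group(table))
```

In this sketch the GROUP adds no information beyond what the utypes on the FIELDs already carry, which is the point raised in the comment; a client that ignores GROUPs entirely would reach the same mapping from the FIELDs alone.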
C.2 FITS Serialization
I believe as an implementor I would ask how I'd annotate existing spectra that have SPECSYS CMBDIPOL or SOURCE..
MCD: Yeah, I suppose one would. The question applies regardless of VOTable or FITS, these aren't in the STC reference position list, so would be "CUSTOM" frames. Since this model uses a 'simplified' STC model which allows only standard reference position values, I don't think this would be supported. This is the sort of issue that the Cube work should help resolve.
p 100 "Open Issues"
I guess having "open issues" in a REC would merit a brief comment on why they're left open and what the cost of that is. Then, there's no utypes specification to date, and frankly, the expectation utypes could magically solve the problem described in the first bullet point is part of the problem -- they can't, really. So, I'd propose to describe the problem without reference to utypes (where it would seem to me that at least in VOTable serialisation an ad-hoc convention would solve the problems). (*)
MCD: Both of these are serialization items, I'll rename it "Serialization Issues", and rework the first bullet to recommend using GROUP elements in VOTable to provide the structure.
MCD-20150206: Done
Summing up, I'd certainly appreciate collecting somewhat more implementation experience with this; however, after the points marked with (*) are addressed in one way or another, I'd not hold up the process and approve.
-- MarkusDemleitner - 2014-09-16
Semantics Working Group ( Norman Gray, Mireille Louys )
The Semantics WG had interactions in the past about the way UCDs were used in the previous version of this specification. Provided the updates in SpectralDM v2.0 do not touch the semantic aspects, the Semantics WG approves this document. A question still remains, as for other data models: how to define a metric that will help to evaluate how much of a data model has been covered in the reference implementations.
Education Interest Group ( Massimo Ramella, Sudhanshu Barway )
Time Domain Interest Group ( John Swinbank, Mike Fitzpatrick )
Data Curation & Preservation Interest Group ( Alberto Accomazzi, Françoise Genova )
Knowledge Discovery in Databases Interest Group ( George Djorgovski )
Theory Interest Group ( Franck Le Petit, Rick Wagner )
The Spectral Data Model v2.0 deals with Theoretical Spectra. It would have been useful if the Theory Interest Group had been consulted during the process. The major concern I see is that the same kind of data (here spectra) can be published and retrieved through different DM / Access Protocols. For example, we may find some services about Theoretical Spectra described through the Spectral Data Model and other ones described through the Simulation Data Model. I think this situation may make it problematic to discover data and to develop tools that discover data in an interoperable way. Somebody who wants to retrieve Theoretical Spectra will have to develop two ways to access them, one with SDM / SSA and one with SimDM / SimDAL. Historically, Theoretical Spectra have always been described by the Spectral DM v1, but that was by default, because no other DM covered Theoretical Data. Since 2012, the IVOA has had the Simulation DM, and so, for clarity, we can wonder whether it is time to say that Theoretical Data has to be described using the Simulation Data Model.
Standards and Processes Committee ( Françoise Genova )
<!--
|