|
My primary concern with this standard is regarding the specification of temporal (and to some extent spatial) coverage. With the RFC period for MOC 2.0 now imminent, we will soon have a more robust IVOA standard mechanism for specifying temporal coverage (TMOCs), and indeed for combined spatial/temporal coverage (STMOCs).
It seems clear to me that the goal should be to use MOCs for both temporal and spatial coverage in VODataService. The question is how to reach that goal. Having this interim VODataService standard that uses simple time intervals does provide a path forward that is simpler than taking the extra time to integrate MOC 2.0. However, since MOC 2.0 already has solid code support, implementing simple time ranges for VODataService could end up being an unnecessary diversion from a straighter path to the end goal. However, I'm worried that if there is any take up of VODataService 1.2 time ranges that it could delay implementations with TMOCs or STMOCs, and hinder interoperability once those implementations are created.
I really don't want to delay this standard any longer, and I'm generally in favor of working towards more frequent updates to improve standards instead of taking forever to create one perfect standard. So I'm wondering, does it make sense to work out some brief but fairly specific language on how we will work towards the goal of using TMOCs and STMOCs, particularly addressing how such changes for 1.3 could be both backward compatible and interoperable?
-- TomDonaldson - 2021-10-05
- Hm. While as you said this is a 1.3 discussion, let me already state here why I think the plain intervals are fine for spectrum and time for at least the next few years. First, nobody expects that discovery in the Registry has no false positives. In fact, the Registry is even a win if there's 99% false positives: 1 service matching, 99 services returning nothing with 2000 services not queried is a fine gain as long as a machine does the querying (which is what the VO is mainly about). Then, given that no database so far has support for TMOCs (and SMOCs only made it into upstream pgsphere this year, where there's no implementation of them in sqlserver, the second platform for existing RegTAP services, yet), going for TMOCs in VODataService would be a large implementation hurdle. On the other hand, for most data collections, you have perhaps a few observation campaigns (which would be the intervals), but resolving those into individual days almost certainly buys you nothing for resource discovery, at least not unless you correlate that with the spatial coverage. Which to me means that unless we go all the way for STEMOCs one day (when there's solid software support for them, preferably including indexed database searches) we'd be paying a high price in implementability for almost no gain in functionality. -- MarkusDemleitner - 2021-10-05
- You make a good point that the 1.2 features are a useful step forward for data discovery, likely pruning the search space significantly. Indeed any implementation could be a useful testbed for informing the evolution of these features and whether the limitations are important. Since the time and spectrum intervals are relatively simple, implementations may be a useful first step towards handling more complex implementations. I will be interested to watch the implementation evolve for sqlserver-based regtap. My intuition is that handling the spatial constraints within sqlserver will not necessarily be the best way to go, and that some extension of MOC server may be an effective pre or post filter to the regtap sql queries. We'll see. On the interoperability question if we want to support STMOC, I suppose this would end up being another "support both" scenario which is somewhat disappointing. Though maybe it's not such a big (or immediate) concern since given the inevitable lag time in populating this metadata for a substantial number of resources, it may take a while to really see how these features are being used and how to evolve. Apps approves. -- TomDonaldson - 2021-10-06
Overall we are happy with this update and it is good to see the split into spatial, spectral and temporal coverage and the use of MOCs
Issues/Concerns
- 2. In the example, change the `xmlns:vs` value to point toward the XSD of VODataService 1.2. Currently it points toward 1.1 which does not include some node elements used in the example, such as `spatial`, `temporal`, ...
- No, the v1.1 is regrettably correct. The minor version should never have made it into the namespace URIs, but for now we must keep them in there or we break everything and need a new major version (cf. the schema versioning EN and the explanation in sect. 2.1). -- MarkusDemleitner - 2021-09-14
- 3.1 From a data provider perspective, after reading 3.1 I'm somewhat confused as to how I should describe the CASDA services. As we have a table of fits images and catalogues stored as tables and accessible via TAP and SIA2/SODA I am guessing these should be vs:CatalogResource ?
- No, it would likely be CatalogService; I've added a back reference to 2.2.2 in the opening paragraph of the section. Does that help? -- MarkusDemleitner - 2021-09-14
- 3.2 https://www.ivoa.net/rdf/messenger/2020-08-26/messenger.html seems to be missing any gravitational wave messengers - should these be included now?
- I don't think so. Our vocabularies are designed to it's easy to add things as they're needed, and so far nobody has asked for Gravitation. And, for starters, I don't want to make the call between graviton (would be a particle in parallel to the others, but who knows it it exists) or gravitational-wave (is what we're actually detecting, but it doesn't quite fit with the others, and at least will need some convention for spectral limits). Things like these tend to be a lot less arbitrary when you have the domain experts on board. -- MarkusDemleitner - 2021-09-14
- 3.2 spectral - joules are required for spectral limits but there are no joules mentioned in the messenger descriptions- not required but it would be useful to provide a translation
- I'll put them in if there's more requests for that, but frankly I think the translation from whatever representation people prefer in the respective bands to particle energy is going to be performed "during serialisation". I do not expect humands in any spectral band to actually handle Joule ever (though it would be cool if we convered to aJ in the optical). The messenger descriptions are aimed at humans, not machines, and hence they use the conventional units per band. But as I said: poke me again and I'll put in the particle energies. -- MarkusDemleitner - 2021-09-14
- 3.2 stc:STCResourceProfile: "This element is imported from another schema" can you refer to the schema?
- Regrettably, not easily. You see, this text is produced through an XSLT (ivoatex/schemadoc.xslt, around line 70). To get to the schema location, I'd have to figure out the prefix mapping (and, for full generality, perhaps even the schema locations). I doubt I can do that from within xslt1 without a lot of apparatus. But I've added some prose above explitily mentioning the schema in Volute rev. 6030 -- MarkusDemleitner - 2021-09-14
- 3.2 vs:SpatialCoverage Attributes: is the "an external standard" in the comment still needed?
- 3.2 Where are vs:FloatInterval and vs:ServiceReference defined?
- Good point; they've been missing in the text so far. I've added them in volute rev. 6031 -- MarkusDemleitner - 2021-09-14
- The description of frame seems a bit out of place now - would it be better above this new block for the types? -- JamesDempsey - 2021-09-15
- I agree it would, and hence I've done the move in volute rev. 6036. -- MarkusDemleitner - 2021-09-20
- 3.3 vs:TableSet, vs:TableSchema, vs:Table, vs:ForeignKey (etc) Type Schema Definitions - recommend either all or none of the elements (e.g. `title`, `description`, …) specify the default of maxOccurs="1"
- Fair enough. In in Volute rev. 6032. I've left some minOccurs="1", as I think they're a useful cue that something is indeed mandatory. You could say that cue should then be verywhere. And I'd have to oblige. But I'll try first to get away with what we have now. -- MarkusDemleitner - 2021-09-14
- 3.3.3 What is “an XML type legally extended from the types associated with these elements”? - an example might help here
- Frankly, this is not my text, and I can't really say if the "legally" was intended to carry actual meaning. And very frankly: before conjuring up an example for something that nobody in the past 10 years bothered to do (all extensions, used trick (2)), I'd rather drop all of 3.3.3 (which is somewhat misplaced in the spec anyway) -- MarkusDemleitner - 2021-09-14
- 3.3.3 Is the namespace changing from http://www.ivoa.net/xml/VODataService/v1.1 ?
- 3.5 - Std defaults to true in inputParam but not in tableParam - is that intended?
- Interesting. Again, this is inherited material, and so I can't really be sure. The std="false" in tableParam certainly makes a lot of sense as most tables do not follow a standard. Having std true in inputParam by default is, ummm, daring. Now that you told me, I suppose I'll have to do something about this in DaCHS. Be that as it may, while I don't think anyone does anything with inputParam/@std, I'd be very cautious touching this -- there are more than 25000 VODataService records out there, and even just changing a default for a forgotten attribute may blow a few of them up. I'd need a stronger reason for risking that than just internal symmetry. -- MarkusDemleitner - 2021-09-14
- 3.5.2 vs:TableParam Extension Metadata Elements : “See the specification document for definitions of recognized keywords.” => which specification?
- Good catch; these definitions used to be in the schema in ways that did not port to TeX, and I forgot to put them in the text while porting the document. I've put them back into the schema in a somewhat terse way in Volute rev. 6032 -- MarkusDemleitner - 2021-09-14
- Appendix A - adding sections affected by each change would help review and use by implementers
- Did that for the edits since REC-1.1 in volute rev. 6033.
Edits:
- 3.1.2 typo - “instead of vs:DataResource when the resource’s capabilties” => “capabilities”
- 3.1.3 suggest a comma after coverage in "has metadata on coverage and the facilities"
- 3.1.3 remove “of” in "SHOULD declare of the metadata"
- 3.5 - extendedType - add “the” to “attempt to handle value”
-- JamesDempsey - 2021-09-10
A build with all the fixes in is at http://docs.g-vo.org/VODataService.pdf. Thanks! -- MarkusDemleitner - 2021-09-14
Thanks for the fixes and addressing the concerns above. I agree leaving inclusion of gravitational wave support until we have a domain expert involved makes sense. -- JamesDempsey - 2021-09-15
Being not a great user of such a model, my comments are limited to point out one typo:
- section 2.2.2 page 13: That a data collection ist published through
Approved
-- LaurentMichel - 2021-09-15
Fixed in volute rev. 5949. Thanks! -- MarkusDemleitner - 2021-09-20
GWS is not really involved in VODataService types but I agree on the standard and I approve it. The document is well written but there are still some typos, e.g. capabilties insted of capabilities, collection ist as mentioned by DM WG.
Approved
-- GiulianoTaffoni - 2021-10-05
Approved with the hope of adding STMOC in 1.3 given sufficient takeup. I am specifically not digging into the VOSI and TAPTypes bit and will take the lead from DAL there.
-- TheresaDower - 2021-08-20
This document refers to the messanger vocabulary (http://www.ivoa.net/rdf/messenger), which is still not approuved by the IVOA. Fix this dictionary before the next step toward the REC procedure.
- Which is to mean: review this vocabulary as part of this RFC, as accepting VODataService 1.2 will also make messenger an IVOA vocabulary. -- MarkusDemleitner - 2021-05-06
I have a few minor questions that might simply be typos, or me not having the schema details at my fingertips:
- Page 21, the "Type" patterns for elements temporal and spectral allow negative values. Is that intentional?
- No. There are just generic float patterns. I'm fairly neutral on whether to specialise them to "non-negative floating point", but I'd need some extra encouragement to do that since I don't think there's a large benefit to catching this kind of problem in the schema, where there's always the possibility one day someone has reason to have something negative here (fun fact: about 15% of the parallaxes in Gaia eDR3 are negative...) -- MarkusDemleitner - 2021-09-20
- Page 22, Element regionOfRegard, Type: This is allowed to be negative. Is that appropriate, given the definition?
- Probably not. But then just saying it's a float lets us use an XSD simple type, whereas there is no XSD simple type positiveFloat or something like this. Be that as it may: this is stuff inherited from VODataService 1.1, and I don't think any client uses it; there are also only eight resources declaring it (all mine, btw). So, if I were to touch this at all, I'd deprecate it as not useful at this point. -- MarkusDemleitner - 2021-09-20
- Page 35, arraysize, Type: The arraysize string, according to this pattern, can contain the number '0' and nothing else, in addition to being empty. The interpretation of '0' is not stated, as far as I can see. * Interesting point. Again, this is legacy from VODataService 1.1, and one may rightfully ask why the authors back then felt the need to constrain arraysize more than VOTable itself (which in the schema just says arraysize is xs:string). I'd again say that the few errors we might catch by being more restrictive with the patterns probably don't justify the extra complexity in the pattern – and the curses we'd get from people that actually have zero-length arrays for some reason or other. -- MarkusDemleitner - 2021-09-20
- Page 41, A.3, last bullet: Regarding @arraysize, this bullet says "Use empty arraysize for scalars now." That should be stated in the body of the document, not in an appendix. As far I could tell, it is not. * It's in the schema's annotation for the arraysize attribute, taken over into the generated documentation on (current) p. 44 (“Leave arraysize empty for scalar values”). -- MarkusDemleitner - 2021-09-20
- Page 42, first bullet on page: This item states "BaseParam's delim attribute no longer defaults to blank." From what I can see in the code supplied in this document, BaseParam does not have a @delim attribute. I did not analyse the XSD or compare it to any of the code samples in this document. Does that need to be done systematically?
- Oops, yes, that's an attribute of DataType, which then is a child of all kinds of xParams. Fixed that in the changelog entries. As to auto-validating samples: the ipac-resource.xml that's shown at the beginning I indeed XSD-validate occasionally, and the schema file get exercised as part of DaCHS. Going beyond that in this document is probably hard, as I suppose it would involve processing of natural language. Do you have any specific changes in mind? -- MarkusDemleitner - 2021-09-20
I also note a few typos that I don't think have been mentioned elsewhere:
- Page 14, paragraph that begins "With the advent of services...": The comma should be deleted from the "–," pair.
- Um, I don't think so – I don't have a proper reference to a style guide now, but I'm fairly certain the presence of a parenthetical construction doesn't change the need for a comma. Or am I confusing German and English here? -- MarkusDemleitner - 2021-09-20
- That punctuation looks weird to me too. However, I'd say it's OK to leave this stylistic stuff to author preference, since there's little danger of ambiguity, and there are clearly different norms of how to do these things, even between English and US English, so effort arguing about it or making changes is probably not well spent. -- MarkTaylor - 2021-09-27
- Page 14, paragraph that begins "The solution eventually adopted...": "for these problems are auxiliary" should be "for these problems is auxiliary" (the subject is "solution")
- Page 22, Element waveband, Comment: "...when it is now also coverage non-electromagnetic messengers." I suggest this should be "...when it now also covers non-electromagnetic messengers." or something similar.
- Page 31, paragraph that begins "Normally, a VOResource...": "with capability element" should be "with a capability element"; "to specific URI" should be "to a specific URI".
- Page 31, paragraph that beings "Listing required parameters...": Either the comma following "allowed" should be deleted or a comma should be added after "interface".
Irrespective of the answers to my questions and assuming minor editing as indicated here and elsewhere - Accepted. -- AnneRaugh - 2021-09-17
Fixed everything I've not commented to the contrary in volute rev. 6037 (also visibile on the pre-built PDF). Thanks a lot! -- MarkusDemleitner - 2021-09-20
Changes from this review in Volute rev. 5949 -- MarkusDemleitner - 2021-05-06
This mostly looks good, but a few comments:
- Sec 3.3: The
nrows attribute gives the approximate number of rows. I'm quite happy that that's a reasonable thing to supply, but I wonder if the name "nrows" might be confusing, and conceivably clash with a future requirement for recording the exact number of rows. I would suggest at least that the nrows Meaning entry should read "The approximate size of the table in rows" rather than "The size of the table in rows". And maybe rename the attribute nrows_approx or similar? Not sure.
- I've added the "approximate" to the meaning. On indicating the nature of nrows in the attribute name: I think that's overdoing things. You see, in a transactional system, "number of rows" is a tricky concept (which, by the way, is the reason why a "select count(*)" generally is slow). True, most of our tables are static, and then it won't matter. But given the complication in general I consider it highly unlikely that we'll have nrows-exact one of these days (and what would it be good for over nrows-approx?) -- MarkusDemleitner - 2021-05-06
- It's user-facing information, and users without a transactional theory background might think that "
nrows " means, well, number of rows. Why care? If I'm looking at a data release I'm very familiar with (e.g. Gaia EDR3) and find a declared number of rows different from the public value, I might suspect problems with the ingestion. On the other hand, if its 1470000000 I can probably guess it's a guess. I think it's OK as now with "approximate" in the documentation, but I wouldn't be surprised to get the odd confused user reaction in the future. -- MarkTaylor - 2021-05-07
- Sec 3.1.3: The Comment for element
tableset still says "Each schema name and each table name must be unique within this tableset." My understanding from the change log and (I think?) the CatalogResource type definition is that this is no longer exactly the case; table names must be unique only within their host schema.
- Right. Changed the Comment to "Each schema name must be unique within a tableset." -- MarkusDemleitner - 2021-05-06
- That's not quite the text I was referring to: see line 680 of the XSD (
CatalogResource definition) not line 126 ( DataCollection definition). -- MarkTaylor - 2021-05-07
- Ah, dang. That text clearly shouldn't have been in the attributes but rather in the TableSet type. I've fixed it at the other location, too. Thanks for not being lulled by my assurances. -- MarkusDemleitner - 2021-05-10
- Sec 3.2: "By default, the MOCs are to be interpreted in the ICRS. Future extensions to non-celestial frames (e.g., on planet surfaces) will use the frame attribute." This is reasonable, but MOC currently says things like "the SMOC must be based on the HEALPix tessellation of the sphere, expressed in the ICRS coordinate reference system", which apparently excludes non-celestial use of MOCs. I have lobbied the MOC authors to relax this language a bit.
- Also: what about STMOC? Does the approach to STC coverage need to be reviewed in the light of the introduction of space-time MOCs in MOC 2.0?
- Given that there is no database support for STMOCs in sight, this would delay this quite a bit, and given that time coverage has only be sparingly declared so far, it would seem that correlated space/time coverage would be of low priority. Taken together: If we want STMOC, let's do it in VODataService 1.3 – and until then, let's get time coverage into the Registry at all. -- MarkusDemleitner - 2021-05-06
- Sec 3.5.3: The
delim attribute is defined as '<xs:attribute name="delim" type="xs:string" default=" "/>'. Should that be 'fixed=" "' instead?
- No, I don't think so. For one, this might break records (I don't think it will, as delim, as far as I can make out, is one of those "let's add it just in case" features that then nobody uses and are just trouble). And then, outside of VOTable, delim might one day come in handy after all ("just in case"). -- MarkusDemleitner - 2021-05-06
- Sec 2.2.2: "ist published" -> "is published" * Entschuldigung -- MarkusDemleitner - 2021-05-06
-- MarkTaylor - 2021-03-12
Thanks for responses, I'm happy with it now. Recommend Accept. -- MarkTaylor - 2021-05-10
Late addition for some UCD-related things I've just noticed: |