SODA RFC

This document will act as the RFC centre for SODA: http://www.ivoa.net/documents/SODA/index.html

Review period: 12th October to 24th November 2016

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your WikiName so authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Discussion about any of the comments or responses should be conducted on the DAL WG mailing list, dal@ivoa.net . However, please be sure to enter your initial comments here for full consideration in any future revisions of this document.

Implementations

Links and description of existing interoperable implementations

++ SODA sync service

resourceIdentifier : ivo://cadc.nrc.ca/soda#sync

standardID : ivo://ivoa.net/std/SODA#sync-1.0

accessURL : http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/sync

++ SODA async

resourceIdentifier : ivo://cadc.nrc.ca/soda#async

standardID : ivo://ivoa.net/std/SODA#async-1.0

accessURL : http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/async

  • GAVO SODA half client and ... server
see : http://mail.ivoa.net/pipermail/dal/2016-March/007414.html

  • CASDA home page. Restricted access
https://casda.csiro.au/casda_data_access/

  • CASDA validator
https://github.com/csiro-rds/sodalint
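
As an illustration of how a client might call one of the {sync} resources listed above, here is a minimal sketch that builds a request URL with an ID and a CIRCLE parameter. The dataset identifier is hypothetical, and the helper name build_sync_url is an assumption made for this example; real ID values would come from a data discovery response.

```python
from urllib.parse import urlencode

# accessURL of the {sync} resource listed above
SYNC_URL = "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/sync"


def build_sync_url(base_url, dataset_id, circle):
    """Return a SODA {sync} request URL for one dataset and one CIRCLE cutout.

    circle is (longitude, latitude, radius), all in degrees.
    """
    params = [
        ("ID", dataset_id),
        ("CIRCLE", " ".join(str(v) for v in circle)),
    ]
    # urlencode percent-encodes reserved characters and writes spaces as '+'
    return base_url + "?" + urlencode(params)


# Hypothetical dataset identifier, for illustration only
url = build_sync_url(SYNC_URL, "ivo://example.org/data?obs-1", (12.0, 34.0, 0.5))
print(url)
```

The sketch only constructs the URL; it does not contact the service.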

Comments from the Community


  • Sample comment (by BrunoRino): ...
    • Response (by authorname): ...

Comments from Pierre Fernique


1) This document is clearly technical and requires knowledge of several other IVOA documents. Section 4 is a great help in understanding how they relate to each other. However, adding a dedicated diagram showing how SODA interacts with the other IVOA protocols directly connected to it (TAP, SIAv2, DATALINK), placed just after the usual global IVOA architecture diagram, would help a little more.
This diagram exists in SIAV2.0. We could repeat it or refer to it. The only drawback is that this diagram still uses the name "AccessData" instead of SODA. Maybe an erratum is necessary -- FrancoisBonnarel - 2016-11-30
2) Document factoring suggestions:
- The very small subsections describing use cases (page 6 and following) could be refactored into a list of items and sub-items.
- The repeated bibliographic references should be avoided (e.g. page 12).
Your two suggestions have been integrated. Thanks -- FrancoisBonnarel - 2016-11-30
3) Coherence:
- Some utypes use a domain prefix (obscore:) and others omit it (example: page 15, 3.2.7).
- The font for ObsCore labels is not always consistent (e.g. 4.3 "content-type", "access_format", "access_url").
The two suggestions have been integrated -- FrancoisBonnarel - 2016-11-30
4) Missing definition:
- The three-factor semantics should be described in a little more detail. It is difficult to see how it relates to the way the parameters are expressed (page 15).
- The "resourceIdentifier" parameter is not described in the document => "If the service is registered, the provider can include a resourceIdentifier parameter" (page 18)
The resourceIdentifier is needed when the service is known in the registry. The text has been completed -- FrancoisBonnarel - 2016-11-30
7) Questions/suggestions:
- It seems quite strange to express a MAX value for structured objects (CIRCLE, POLYGON, ...) by means of the VOTable VALUES element. What does MAX mean in such a case? I suppose the idea is to provide the limit of the query region. The text says that it should be taken as a suggested value. I am quite uncomfortable with this use of VOTable.
This part has been substantially modified. The reasons for using MAX for the enclosing area in the case of CIRCLE and POLYGON have been clarified, and the enclosing MAX feature has been removed for BAND and TIME (xtype=interval). Nothing more can be done until we have a better way to link PARAMETERS to data model structures. -- FrancoisBonnarel - 2016-11-30

- Maybe emphasise the fact that the HTTP error code must be properly set (4xx or 5xx)
- Page 22, it is said: "For example, if an {async} job included two CIRCLE and two BAND values, there must be four results" => is it really reasonable to leave open this possibility, which is difficult to understand and manage for both the client and the server? I see this sentence more as an illustration of the lack of boolean constraints on multiple parameters. I would suggest adopting an implicit AND rule rather than providing all the possible combinations
Hmm, multiple PARAMETERS are only optional (see the section above on "Error messages"). In case services really want to implement this, I don't think they want "AND" -- FrancoisBonnarel - 2016-11-30
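
The "four results" rule quoted above can be read as a cross product over the repeated parameters. The following sketch, under that assumption, enumerates the combinations for two CIRCLE and two BAND values; the function name and the parameter values are made up for illustration.

```python
from itertools import product


def expand_cutouts(circles, bands):
    """Return one parameter set per (CIRCLE, BAND) combination."""
    return [{"CIRCLE": c, "BAND": b} for c, b in product(circles, bands)]


# Illustrative values: CIRCLE as "lon lat radius" in degrees,
# BAND as a wavelength interval in metres
circles = ["10.0 20.0 0.1", "30.0 40.0 0.2"]
bands = ["5.0e-7 6.0e-7", "1.0e-6 2.0e-6"]

results = expand_cutouts(circles, bands)
for r in results:
    print(r)
```

With two values for each parameter the list has 2 x 2 = 4 entries, which is the behaviour the comment questions; an implicit AND rule would instead yield a single result.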


8 ) Typos:
p5 - "SODA servces" => "i" missing
p5 - 1.1.2 "descibed" => described
p13 - "of ther data" -> remove "r"
p13 - "SODA Filtering" => F should be in lowercase
p15 - 3.2.7 => simple quote => double quote (utype 'obscore...)
p16 "Wthin" => "Within"
p16 "servces" => "services"
page 17 "descrbes" => "describes"
page 18 - "obs_puclishder_did" => "obs_publisher_did"
page 18 - "obs_publishder_did" -> "obs_publisher_did"
page 19 "data-" => "data"
page 22 "nomative" => "normative"
page 22 - "re-organised" => "R" uppercase
page 22 - "patameters" => "parameters"

Typos fixed. thanks. -- FrancoisBonnarel - 2016-11-30

-- PierreFernique - 2016-10-20

Comments from Markus Demleitner

(1) p.8, "The considerations on naming the resource given in sect. 2.1 apply for it." -- "apply to it"? "apply here as well"?

"here as well" looks fine. Fix made -- FrancoisBonnarel - 2016-11-30

(2) p. 10/11 "The values for the ID parameter are generally discovered from data discovery or DataLink requests" -- since DataLink in general does not allow data discovery, I think the "or DataLink" should be deleted here. Even if a Datalink step is between the discovery and the SODA operation (as I believe it will generally be), the data id itself is a product of the discovery phase.

Fix has been made -- FrancoisBonnarel - 2016-11-30

(3) I am uncertain what the purpose of sect. 3.2.7 is, in particular at this point in the document structure (next to the special parameter definitions). If it absolutely must stay in, it should at least be better placed within the document structure, at least as a subsection of its own, or perhaps as an appendix, also adding an indication as to who is the addressee of this information. Or perhaps that's material for an implementation note in the first place?

This has been turned into a subsection. This statement is as important as the three-factor semantics for understanding what is going on with SODA -- FrancoisBonnarel - 2016-11-30

(4) What is supposed to happen if someone specifies both CIRCLE and POLYGON, and perhaps POS on top? (My take: it's an error; in my implementation, spatial axes can additionally be constrained by axis-specific keys (RA, PIXEL_1, etc), and there's no way any other semantics could ever be sanely implemented.)

(5) 3.2.3 still mentions the region xtype that I think was/will be dropped in DALI; if that's the case it should go here as well (also in the interest of not clobbering future developments in which that xtype might be used for something more principled).

This has been removed -- FrancoisBonnarel - 2016-11-30

(6) 3.2.4 talks about "energy interval(s)", which, while not factually incorrect, is misleading since what we are actually specifying is wavelength intervals, so I think we should say that.

Fixed. -- FrancoisBonnarel - 2016-11-30

(7) 3.2.4 also says "barycentric wavelength in meters"; I'm not sure that's not too constraining, but perhaps using topocenter rather than barycenter if that's what your data has wouldn't matter too much. But if we go into this trouble, I'd say we also have to say "This specification does not constrain whether vacuum wavelengths or air wavelengths are given". Choosing either one would alienate certain communities. Next time we do a VO, let's agree to use frequency or energy to constrain the spectral axis, and we won't have that problem.

Unchanged. Barycentric makes sense, but in many use cases it will make no difference. In other cases a conversion could be done -- FrancoisBonnarel - 2017-01-09

(8) In the opening paragraphs of sect. 4, there is "This mechanism is expected to be the primary means of finding and using a SODA service." Since it is not quite clear what this refers to and doesn't add normative value, I suggest this should be taken out.

Not done. "Primary means" is in contrast to discovery via the registry -- FrancoisBonnarel - 2016-11-30

(9) In the example in the opening material to sect. 4, there are no DESCRIPTION elements to the PARAMs, which I think sets a bad example. Can't we just add some? Or at least say that "\xmlel{DESCRIPTION} elements, while highly recommended in practice, have been left out in this example for brevity"?

This has been fixed -- FrancoisBonnarel - 2016-11-30

-- MarkusDemleitner - 2016-11-23

Comments from MarkTaylor

These comments are on version PR-SODA-1.0-20161201.

  • sec 3.2.3: The text discusses shape keywords e.g. "circle" and "polygon" (lower case) but they are CIRCLE and POLYGON in the table and examples. Are these strings supposed to be case-insensitive? DALI says in general parameter values are case-sensitive unless explicitly noted otherwise, but I don't see such a note here (though maybe I'm not looking hard enough).
Indeed, PARAMETER values have to be case-sensitive here (in contrast with PARAM names). The text has been fixed accordingly -- FrancoisBonnarel - 2017-02-07
  • sec 3.2.[345]: There may be issues relating to representation of infinities here - see my comment at DALI1Dot1RFC.
This has been changed -- FrancoisBonnarel - 2017-02-07
  • sec 4.3: As others have noted, I don't much like using array values of MAX/MIN elements to indicate something about bounding geometries, it looks like considerable abuse of the intention from VOTable; even syntactically it looks wrong, since I'd expect the MAX/MIN values to be scalars for comparison with array elements, not full arrays. However, nothing in the VOTable document or schema explicitly forbids it. In general the use of PARAMs with the required value attribute set to the empty string also seems extremely fishy to me, and in fact at present STILTS votlint will signal an error for something like <PARAM datatype="double" arraysize="2" value="" .../> on the grounds that the value does not have the right number of elements.
We answered the first point elsewhere. On the second point, Markus suggests: "We interpret the VOTable spec to say that PARAM/@value contains TABLEDATA-encoded literals. Furthermore, since VOTable 1.3, the empty string is legal as a NULL literal in TABLEDATA for all types. While it is true that the @value attribute is required by PARAM's XSD content model, its type is xs:string without any constraint on length. Hence, it does not seem to us that an empty string would be an invalid value. Have we missed something?" -- FrancoisBonnarel - 2017-02-07
  • sec 5.1: Requiring a 204 (No Content) response in some circumstances from the sync endpoint seems to be in contradiction with DALI (your choice of version) section 4: "All DAL service requests eventually result in one of three kinds of responses: successful HTTP status code (200) and a service- and resource-specific representation of the results, an HTTP status code and a standard error document (see below) or a service- and resource-specific error document, or a redirect HTTP status code (302 or 303) with a URL in the HTTP Location header." Am I misreading it?
DALI was fixed accordingly -- FrancoisBonnarel - 2017-02-07
  • sec 6: The different behaviour for sync and async services in the presence of the same parameters (multiple results) sounds a bit confusing to me, but maybe it can be made to work. However, where it discusses combinations of input parameters that don't generate a result, it sounds like the same thing for which a 204 is recommended in sec 5.1, but it says to return a text/plain document with an error message instead - seems a bit inconsistent. At least, it should recommend which error message to use in that case, it's not clear to me that any of the ones listed in table 4 is appropriate here.
Reference to DALI has been added -- FrancoisBonnarel - 2017-02-07
  • sec 3.2.3: Using infinity in the example POS=RANGE 0 360 89 +Inf seems a bit eccentric.
This has been changed -- FrancoisBonnarel - 2017-02-07
  • sec 2.1: "will in general be provided the access URLs.." - missing a word?
added a "with" -- FrancoisBonnarel - 2017-01-06

  • sec 3.2.3: "if the differ in protocol" -> "if they differ in protocol"
fixed -- FrancoisBonnarel - 2017-01-06
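
To make the status-code discussion in the sec 5.1 and sec 6 comments concrete, here is a small sketch of how a client might sort {sync} responses into the kinds named there: 200 with a result, 204 (No Content), a 302/303 redirect, or a 4xx/5xx error. The function name and the returned labels are assumptions for illustration, not terms from SODA or DALI.

```python
def classify_response(status):
    """Map an HTTP status code to a coarse response kind (labels are illustrative)."""
    if status == 200:
        return "result"        # successful response carrying the cutout
    if status == 204:
        return "no content"    # request understood, but nothing to return
    if status in (302, 303):
        return "redirect"      # follow the URL in the Location header
    if 400 <= status < 600:
        return "error"         # client or server error document
    return "unexpected"


for code in (200, 204, 303, 400, 500, 100):
    print(code, classify_response(code))
```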

-- MarkTaylor -

Comments from TCG members during the TCG Review Period: 2016-10-12 to 2016-11-24

WG chairs or vice chairs must read the Document, provide comments if any and formally indicate if they approve or do not approve of the Standard.

IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.

TCG Chair & Vice Chair ( Matthew Graham, Pat Dowler )

Applications Working Group ( Pierre Fernique, Tom Donaldson )

Approved -- PierreFernique - 2017-05-10

Data Access Layer Working Group ( François Bonnarel, Marco Molinaro )

Data Model Working Group ( Mark Cresitello-Dittmar, Laurent Michel )

I think there is no need to duplicate the reference each time a standard is mentioned.

This has been fixed -- FrancoisBonnarel - 2017-02-07

There are many references to SODA 1.1 spread throughout the text, including in normative sections. This gives the impression that the standard is not complete. It would be better to have a "prospect" section somewhere gathering all of these missing features (3, 3.2.3, 3.2.4, 4.3).

It is difficult to speak of these future features outside the context that explains their meaning. They are now set in footnotes. Hope this helps -- FrancoisBonnarel - 2017-02-07

- 1.1: the different ways to access a SODA service are enumerated in the text and again in subsections 1.1.1 and 1.1.2. 1.1 could be shortened and the sections 1.1.x enhanced.

1.1 is a summary of everything which is reused from IVOA standards -- FrancoisBonnarel - 2017-02-07

- 1.1.1 could be removed, as proposed by Markus, or at least the limits of the SODA registration should be clearly stated: no parameter description, no resource giving the dataset IDs consumable by that SODA service.

not done (see below) -- FrancoisBonnarel - 2017-02-07

- 1.1.2: I would prefer not to mention Datalink (.. service descriptor) in that section to avoid confusion. Talking about "service descriptors" instead of "Datalink service descriptors" would be enough since this point is clarified later

Done -- FrancoisBonnarel - 2017-02-07

- 1.2: There are two references to section 1.2 within section 1.2.

Fixed -- FrancoisBonnarel - 2017-01-09

- 1.2: There is a full page of use-cases not covered by the present standard: better to be moved to the "prospect" section

There is now a subsub section for future use cases -- FrancoisBonnarel - 2017-02-07

- 2: Not clear whether {sync} is compulsory: it seems to be at the start of the section but no longer below the table

One resource among {sync} or {async} is required. This has been clarified in the table -- FrancoisBonnarel - 2017-01-09

- 2.1: Something looks wrong with the syntax in "This to allow......user input"

Fixed -- FrancoisBonnarel - 2017-01-09

- 2.4/5: References to a working group (GWS) are not relevant. The text should refer to a standard or be more explicit

The specification is VOSI. The author and year of publication of this spec appear in parentheses. The trick is that in that case the GWS WG is the author -- FrancoisBonnarel - 2017-01-09

- 3 The introduction (between 3 and 3.1) is focused on multiple parameters, which is not the main topic. Better to have first a more general introduction (or no introduction at all) and a subsection about multiple parameters.

A new subsection has been created for parameter multiplicity -- FrancoisBonnarel - 2017-01-09

- 3 Specify somewhere that "multiple parameters" means multiple instances of one parameter (POS=,POS=...) but not multiple values of one parameter (POS=x,y,z,...)

Done -- FrancoisBonnarel - 2017-01-09
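
The distinction drawn above, several instances of one parameter rather than several values packed into one instance, can be seen in a query string. This sketch uses Python's standard urllib.parse; the ID and POS values are invented for illustration.

```python
from urllib.parse import parse_qs, urlencode

# Two instances of POS, each carrying one complete value
query = urlencode([
    ("ID", "obs-1"),                 # hypothetical identifier
    ("POS", "CIRCLE 10 20 0.1"),
    ("POS", "CIRCLE 30 40 0.2"),
])
print(query)

# A server-side parse recovers one list entry per instance
parsed = parse_qs(query)
print(parsed["POS"])
```

Each POS instance stays a self-contained value, so the service never has to guess where one region ends and the next begins.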

- 3.2 It would be better to start the filtering parameters list with POS (actual 3.2.3) since 3.2.1 and 3.2.2 mention it

Done -- FrancoisBonnarel - 2017-01-09

- 3.2.1 Useful to remind radius unit

It was done in POS, and since POS is now positioned before CIRCLE, as you suggested, I think it is obvious that the same unit applies to it -- FrancoisBonnarel - 2017-01-09

- 3 vs 3.2.6: according to 3, multiple parameters are optional, but according to 3.2.6 they must be supported for POL

The special case of POL has been made explicit -- FrancoisBonnarel - 2017-01-09

- 3.3 a table showing the correspondence between SODA params and OBSCORE fields would be more explicit

Done -- FrancoisBonnarel - 2017-01-09

- 3.4 end of 1st parag: What is the problem when 2 providers use the same parameter?

- 3.4 There is a reference to a Wiki page whose content is mostly copied into the text: better to remove the reference and revamp the text.

The reference has been removed -- FrancoisBonnarel - 2017-02-07

- 4.1 The purpose of the SODA registration is definitely not to run the service, because this is not possible without knowledge of the IDs. It is rather to discover VO resources implementing SODA. This distinction in terms of use cases should be made here.

In my opinion the text on registration should be kept. If the service uses IVOIDs for the ID parameter, it is supposed to potentially work for some IDs. The use cases are those where you discover your SODA service outside a standard DAL discovery service -- FrancoisBonnarel - 2017-01-09

- 4.3 1st sentence not clear (to me)

I talked with you about that one. If nobody else complains I will leave it as it is -- FrancoisBonnarel - 2017-01-09

Having said that, DM approves.

-- LaurentMichel - 2016-12-06

Grid & Web Services Working Group ( Brian Major, Giuliano Taffoni )

  • architecture diagram link is broken in the HTML version
  • section 1.2: There is a bullet that says "Anticipated future Use Cases". If this is supposed to be a title for the bullets following, that should be made clearer by indenting or some other formatting. If it is not intended to be a title I would suggest removing the bullet altogether.
Good point. Fixed -- FrancoisBonnarel - 2017-01-06

  • In table 1, there are no values in the 'required' column for {sync} and {async}. I see that it is explained in the paragraph below, but perhaps the 'required' column isn't of much value as it is only filled in for the resources defined elsewhere.
'required' has been filled in for sync and async instead -- FrancoisBonnarel - 2017-01-06

  • Section 2.1 sentence "This is to allow clients, given the access URL, can reliably find out the URL of the capabilities endpoint." reads awkwardly. Perhaps just changing "can" to "to" is enough.
Fixed -- FrancoisBonnarel - 2017-01-06

  • I know there has been much discussion in the mailing lists regarding the syntax of the filtering parameters in this document, and I haven't been paying close attention to the details of those conversations, but to me, the fact that there are two ways of filtering with circles and polygons (CIRCLE=, POS=CIRCLE) seems like something went wrong. Perhaps there is a history I am not aware of. The explanation of "type-safe serialized value and unit metadata" doesn't seem sufficient to me.
The "POS=CIRCLE..." syntax is there for consistency with SIA2. It unifies the spatial axis constraints and is well suited to the end user. The other one allows xtypes to be defined and is better suited to automatic processing. -- FrancoisBonnarel - 2017-01-06

  • Section 5.2 should include the HTTP response code associated with each error.
If the above can be addressed by changes or by explanation/clarification then I approve this document.

Reference to DALI has been made. It should be the same -- FrancoisBonnarel - 2017-02-07
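
To illustrate the two spatial syntaxes discussed above (POS=CIRCLE ... versus CIRCLE=...), the sketch below splits a POS value into the shape keyword and its numeric values, which is roughly the form the dedicated parameters carry directly. The function name is hypothetical, only the CIRCLE and POLYGON keywords are handled, and keywords are matched case-sensitively as noted in the response to Mark Taylor's comments.

```python
def pos_to_typed(pos_value):
    """Split a POS value such as 'CIRCLE 10 20 0.1' into (keyword, numbers)."""
    keyword, *coords = pos_value.split()
    # Keywords are compared case-sensitively, so 'circle' is rejected
    if keyword not in ("CIRCLE", "POLYGON"):
        raise ValueError(f"unsupported shape keyword: {keyword!r}")
    return keyword, [float(c) for c in coords]


name, values = pos_to_typed("CIRCLE 10 20 0.1")
# The dedicated-parameter form carries the same numbers under the name CIRCLE
print(f"POS={name} ...  is equivalent to  {name}={' '.join(str(v) for v in values)}")
```

The sketch does not check the number of values per shape; a real service would need that validation as well.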

-- BrianMajor - 2016-12-09

I approve this document.

-- BrianMajor - 2017-03-29

Registry Working Group ( Markus Demleitner, Theresa Dower )

Though this may be odd coming from Registry, and though I'm usually totally in favour of standards regulating their Registry aspects, in this particular case my feeling is that the standard has very little relationship to the Registry, and it would benefit from removing most of the references to it. This would mean:

(1) p.5, "First, a SODA service could be found in the IVOA Registry and used directly." -- since it is impossible to know a dataset identifier (or the parameters supported by a SODA service) in this scenario, and even if (e.g., from an examples document) you had that it's hard to imagine what useful purpose could be served by such a thing, I think this sentence should be taken out.

(2) p.5, "Since the discovery of SODA services ... make use of a registry extension" -- I'd strike that sentence, too. The consequence implied here is tenuous at best (parameters are actually declared within the interface, whereas capability metadata will typically cover things that aren't in the Datalink descriptor), and we don't have to apologise for not defining a registry extension anyway. This also lets us drop 1.1.1, together with the "see also" to 4.1, which in turns largely repeats 1.1.1 only adding that it "is not expected to be a common usage pattern". So, 4.1 could go, too.

Since the sections are not wrong as such, I'm not insisting, but the document certainly would be shorter and clearer without them.

Actually, I can see two use cases for having SODA services registered:

(a) Users might at some point search for "resources with associated SODA service"; the scenario here could be that a user has located a service but wants a SODA-enabled version of it. How likely that is I can't say.

(b) Infrastructure might look for SODA services to validate. Together with an examples page that might even be possible.

In both cases I think the story isn't terribly strong, so I'm not suggesting even mentioning either in the document.

In my opinion the text on registration should be kept. If the service uses IVOIDs for the ID parameter, it is supposed to potentially work for some IDs. The use cases are those where you discover your SODA service outside a standard DAL discovery service -- FrancoisBonnarel - 2017-02-07

Having said that, Registry approves.

Oh, and we've also put SODA.vor into the repo. That's a StandardsRegExt record for SODA. When you go to REC, please update the date and submit it to the RofR as described in WriteAStandardsRecord.

-- MarkusDemleitner - 2016-11-23

Semantics Working Group ( Mireille Louys, Alberto Accomazzi )

The document (version PR 2017-02-07) is well structured and describes all parameters in detail.
However, here are the few changes I would like to see made in the document before approving it.

  • Use UCD updated terms:
Obscore 1.1 introduced pos.outline;obs.field for s_region to describe the limits of an observed field, so this matches CIRCLE and POLYGON cases.
BAND can use the new stat.interval ucd defining a pair of values and combine in ucd='em.wl;stat.interval'
These changes are in the last WD UCD-list http://wiki.ivoa.net/internal/IVOA/IvoaSemantics/WD-UCDlist-1.3-20160719.pdf published in the working group and which will be published soon for RFC. Apologies for the delay in the UCD update process.

Changes have been made. Consistency with Obscore is a major driver. Thanks -- FrancoisBonnarel - 2017-03-12

  • p.6 "Flatten" a data cube into a 1D spectrum could be changed to 'Extract a representative 1D spectrum (average, psf convolved, ...) from a spectral data cube' or something similar.
Flatten has been replaced by reduced -- FrancoisBonnarel - 2017-03-12

  • Figure 1: update the architecture diagram to the valid version
Done -- FrancoisBonnarel - 2017-03-12

  • Section 2- Table 1 remove 'no' in the 'required' column on lines labeled sync and async because one among the two at least is required.
Done -- FrancoisBonnarel - 2017-03-12

  • Section 3.4 Filtering parameters ....
The limits in spatial, spectral and temporal domains are clearly defined.
If the requested limits exceed what can be retrieved for some candidates data sets, what is the current behavior of such a cut-out service ?
- send an error informing coverage mismatch ?
- send the compatible chunk from the dataset, but also a warning to inform that the response coverage is smaller than expected?
- return existing data and let the user discover that the data size is smaller than expected?

The user should know how to detect and handle partial query responses.

SODA works on a best match to request approach. Larger sizes than available will be reduced. This has been explained in section 3.3 -- FrancoisBonnarel - 2017-03-12
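
The "best match" behaviour described in this response can be pictured as intersecting the requested interval with what the dataset actually covers. The sketch below does this for a single axis, such as a BAND wavelength interval; the function name, the tuple representation, and the choice to return None when nothing overlaps are assumptions for illustration, and what a service returns in the no-overlap case is for the standard to define.

```python
def clip_interval(requested, available):
    """Reduce a requested (lo, hi) interval to the part the dataset covers.

    Returns None when the two intervals do not overlap at all.
    """
    lo = max(requested[0], available[0])
    hi = min(requested[1], available[1])
    if lo > hi:
        return None
    return (lo, hi)


# Request is wider than the dataset coverage, so it is reduced to the coverage
print(clip_interval((4e-7, 9e-7), (5e-7, 7e-7)))
# Request lies entirely outside the coverage
print(clip_interval((1e-6, 2e-6), (5e-7, 7e-7)))
```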

-- MireilleLouys - 2017-03-06

Education Interest Group ( Massimo Ramella, Sudhanshu Barway )

Time Domain Interest Group ( John Swinbank, Dave Morris )

Data Curation & Preservation Interest Group ( Françoise Genova )

Operations Interest Group ( Tom McGlynn, Mark Taylor )

Knowledge Discovery Interest Group ( Kaï Polsterer )

Theory Interest Group ( Carlos Rodrigo )

Standards and Processes Committee ( Françoise Genova)

TCG Vote

TCG Chair & Vice Chair ( Matthew Graham, Pat Dowler )

Applications Working Group ( Pierre Fernique, Tom Donaldson )

Data Access Layer Working Group ( François Bonnarel, Marco Molinaro )

Approved -- FrancoisBonnarel - 2017-04-12

Data Model Working Group ( Mark Cresitello-Dittmar, Laurent Michel )

Approved -- LaurentMichel - 2017-05-11

Grid & Web Services Working Group ( Brian Major, Giuliano Taffoni )

Approved -- BrianMajor - 2017-04-28

Registry Working Group ( Markus Demleitner, Theresa Dower )

Approved -- MarkusDemleitner - 2017-05-02

Semantics Working Group ( Mireille Louys, Alberto Accomazzi )

I approve the document. -- MireilleLouys - 2017-05-10

Education Interest Group ( Massimo Ramella, Sudhanshu Barway )

Data Curation & Preservation Interest Group ( Francoise Genova )

Knowledge Discovery in Databases Interest Group ( Kai Polsterer )

Operations Interest Group ( Tom McGlynn, Mark Taylor )

Theory Interest Group ( Carlos Rodrigo )

Time Domain Interest Group ( John Swinbank, Dave Morris )

Standards and Processes Committee ( Françoise Genova )



Topic revision: r25 - 2017-05-11 - LaurentMichel
 