
IVOA Web>IvoaDAL>Datalink1RFC (2015-06-12, FrancoisBonnarel)


Datalink v1.0 Proposed Recommendation: Request for Comments

Public discussion page for the IVOA Datalink Proposed Recommendation.

Latest version of the IVOA Datalink can be found at:

http://www.ivoa.net/documents/DataLink/20150414/index.html

Reference Interoperable Implementations

(Indicate here the links to at least two Reference Interoperable Implementations)

CADC Implementation

The CADC implementation of DataLink provides one or more downloads per input ID and provides links to prototype AccessData services for most science data. Example invocation:

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=caom:IRIS/f212h000/IRAS-25um
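As a minimal sketch of how a client might assemble such an invocation, the following builds the GET URL from a base URL and a list of identifiers. The function name `build_links_url` is hypothetical; the base URL and the identifier are the ones from the example above, and repeating the ID parameter for several identifiers is assumed from the discussion of multiple ID values further down this page.

```python
from urllib.parse import urlencode

# Base URL taken from the example invocation above.
LINKS_URL = "http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink"


def build_links_url(base_url, ids):
    """Return a GET URL with one ID parameter per identifier (hypothetical helper)."""
    # A list of (name, value) pairs lets the same parameter name repeat.
    query = urlencode([("ID", identifier) for identifier in ids])
    return f"{base_url}?{query}"


# urlencode percent-encodes ':' and '/' inside the identifier value.
print(build_links_url(LINKS_URL, ["caom:IRIS/f212h000/IRAS-25um"]))
```

The encoded form differs textually from the hand-written URL above (which leaves ':' and '/' unescaped), but both decode to the same ID value.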

The DataLink service descriptor resource is included in VOTables from DataLink, TAP, and SIA services whenever an appropriate identifier column is included in the output. In the output from example above, there is a service descriptor for the prototype AccessData services (currently custom, so there is no standardID).

Update: The CADC DataLink service has been updated to the post-RFC period PR: uses the core vocabulary in the semantics field and never returns an access_url and service_def in the same row.

For SIA and TAP requests, the service descriptor tells the caller how to invoke the associated DataLink service itself. Example invocation of SIAv2 query:

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/sia/v2query?MAXREC=1&POS=Circle+180+5+0.2

The service descriptor is the second resource in the VOTable.
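To illustrate how a client could pick the service descriptor out of such a response, here is a rough sketch using only the Python standard library. The sample document is a hypothetical, namespace-free fragment written for this example, not actual CADC output; the `utype="adhoc:service"` marker and the `inputParams` GROUP name are assumptions about how the descriptor is labelled, and `example.org` is a placeholder.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free fragment for illustration only; a real
# VOTable declares the VOTable XML namespace and carries more metadata.
SAMPLE = """
<VOTABLE>
  <RESOURCE type="results">
    <TABLE>
      <FIELD ID="pubdid" name="obs_publisher_did" datatype="char" arraysize="*"/>
    </TABLE>
  </RESOURCE>
  <RESOURCE type="meta" utype="adhoc:service">
    <PARAM name="accessURL" datatype="char" arraysize="*"
           value="http://example.org/datalink"/>
    <GROUP name="inputParams">
      <PARAM name="ID" datatype="char" arraysize="*" value="" ref="pubdid"/>
    </GROUP>
  </RESOURCE>
</VOTABLE>
"""


def read_service_descriptors(votable_xml):
    """Return one dict per descriptor resource found in the document."""
    root = ET.fromstring(votable_xml)
    descriptors = []
    for resource in root.iter("RESOURCE"):
        if resource.get("utype") != "adhoc:service":  # assumed marker
            continue
        # Direct PARAM children describe the service itself.
        params = {p.get("name"): p.get("value") for p in resource.findall("PARAM")}
        # PARAMs inside the input group map a parameter name to the
        # ID of the column that supplies its value (the ref attribute).
        inputs = {
            p.get("name"): p.get("ref")
            for p in resource.findall("GROUP[@name='inputParams']/PARAM")
        }
        descriptors.append({
            "accessURL": params.get("accessURL"),
            "standardID": params.get("standardID"),
            "inputs": inputs,
        })
    return descriptors


for descriptor in read_service_descriptors(SAMPLE):
    print(descriptor)
```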

DaCHS

GAVO's server package DaCHS contains an implementation of Datalink, both in auxiliary resources for SSAP and SIAP and standalone. The Datalink cores section of the reference manual discusses how operators define datalink behaviour (this includes the definition of data processing services and hence is longer than datalink proper would need).

At the Heidelberg data center, several services use these facilities, among them ivo://org.gavo.dc/califa/q2/s, ivo://org.gavo.dc/feros/q/ssa, ivo://org.gavo.dc/mlqso/q/s, and ivo://org.gavo.dc/theossa/q/ssa. There are also pointers to datalink documents in the obscore table (try something like select * from ivoa.obscore where access_format like '%datalink%' on ivo://org.gavo.dc/tap).
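A sketch of how the ObsCore query quoted above could be sent as a synchronous TAP request. `TAP_BASE` is a placeholder (the page only gives the registry identifier ivo://org.gavo.dc/tap, so the actual endpoint has to be resolved separately), and the `REQUEST`, `LANG` and `QUERY` parameter names are assumed from the TAP example URL quoted later in this discussion.

```python
from urllib.parse import urlencode

# Placeholder endpoint; resolve ivo://org.gavo.dc/tap to get the real one.
TAP_BASE = "http://example.org/tap"

# The query text quoted in the paragraph above.
ADQL = "select * from ivoa.obscore where access_format like '%datalink%'"


def build_sync_query_url(tap_base, adql):
    """Return a synchronous TAP query URL (hypothetical helper)."""
    params = {"REQUEST": "doQuery", "LANG": "ADQL", "QUERY": adql}
    # urlencode escapes the '%' wildcards and quotes inside the query.
    return f"{tap_base}/sync?" + urlencode(params)


print(build_sync_query_url(TAP_BASE, ADQL))
```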

SPLAT

Recent versions of SPLAT, a spectral analysis tool, use datalink to discover how to do server-side manipulations of the spectra it retrieves. You can try this over SSAP on, e.g., the TheoSSA, califa ssa, and Flash/Heros services. SPLAT can also get the access information to download a spectrum from a Datalink table, which can be tested on the ObsCore results from the CADC service. -- MargaridaCastroNeves - 2014-11-17

TapHandle

TapHandle, developed at SSC XMM in Strasbourg ( http://saada.unistra.fr/taphandle ), is now a DataLink-compatible client (2015, June 10th release).

Implementations Validators

(If any, indicate here the links to Implementations Validators)

A validator is being developed by VO Paris.

RFC Review Period: 2014-07-04 - 2014-09-01

Comments from the IVOA Community during RFC period: 2014-07-04 - 2014-09-01

NOTE: I will use bullets and italics for the official responses (from the editor, pending agreement of the authors so some might change in the next few days)

like this!! -- PatrickDowler - 2014-09-17

Comment (2014 July 14th) by JoseEnriqueRuiz

All along the doc: several [ref] to fix.

page 1. http://www.ivoa.net/documents/DataLink/20140228/index.html is duplicated

page 5. 2nd p. I would stress on the fact that the service descriptor resource describes how to query a service. It does not describe in detail, for example, what the service returns.

Page 7. 3rd p. At the end of the paragraph: "[The s]"

Page 7. 4th p. Is this actually the same use case of 1.2.1?

Page 8. 2nd p. At the end of the paragraph: "Providers should be able to describe [...]"

Page 11 Remove block describing param REQUEST, since it is no longer required.

Page 15. 2nd paragraph "This resource [is] typically describes.."

These typos are fixed in PR-DALI-1.0-20140925 -- PatrickDowler - 2014-09-17

Page 15. 3.2.4 service_def I would use the value in <PARAM name="accessURL"> to call the service instead of the one present in the access_url field provided by the DataLink VOTable response. Why keep two potentially different values of accessURL? Maybe I'm missing or misunderstanding something that's not clearly explained.

Author Response(2014 July 17th) by MarkusDemleitner

The problem is that there are two usages for the service descriptors:

(a) as part of a datalink response, where there is, as you say, access_url in the datalink table as for any other data link;

(b) as part of a DAL response (say, a SIAP table), where you say "go here for postprocessed (cutout, resampled...) data" -- that's the thing with the PARAM name="ID" ref="". In this case, no external access URL is available and hence the GROUP must contain it.

One could stipulate that service descriptors within datalink documents have no accessURL PARAM and the others do, but I'd say that's an implementation complication that's not really warranted.

Answer (2014 July 21st) by JoseEnriqueRuiz

Ok, but what to do in the potential case of having different values for the access_url field and the accessURL PARAM? This case is explicitly described in the docs, and as I understand it the proposed solution is to use the access_url value to call the service (or at least it is not clear enough for me). If this is true, I would prefer to use the value given in the accessURL PARAM instead, as would be the case for calling a service after a DAL response (case b).

Based on this issue and subsequent mailing list discussion, we will allow access_url or service_def in the links table but not both in PR-DALI-1.0-20140925. To use a service, the client must do it via the descriptor. -- PatrickDowler - 2014-09-17
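The "access_url or service_def, but not both" rule is easy to check mechanically. The sketch below assumes the links table has already been read into a list of dicts keyed by column name; the helper name `conflicting_rows` and the sample rows are hypothetical.

```python
def conflicting_rows(rows):
    """Return the indices of rows that set both access_url and service_def."""
    return [
        index
        for index, row in enumerate(rows)
        # Empty strings and None both count as "not set".
        if row.get("access_url") and row.get("service_def")
    ]


# Hypothetical rows: the third one violates the rule.
rows = [
    {"ID": "a", "access_url": "http://example.org/file.fits", "service_def": None},
    {"ID": "a", "access_url": None, "service_def": "cutout"},
    {"ID": "b", "access_url": "http://example.org/b", "service_def": "cutout"},
]
print(conflicting_rows(rows))
```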

Author response (2014 July 27th) by FrancoisBonnarel

Jose Enrique pointed out a difficulty with the "service_def" paragraph: the potential inconsistency (or redundancy) between the access_url in the {links} resource response table and the accessURL in the service descriptor. The accessURL in the service descriptor should be generic. In the response table the access_url is always attached in some way to the dataset, either implicitly or by fixing some id parameter. Maybe we should use an example based on an HTTP GET parameter instead of REST?

Page 16. http://www.ivoa.net/rdf/datalink does not exist. 404

Author Response(2014 July 17th) by MarkusDemleitner

In this case, a 404 is almost fine, as the URL really only defines a namespace, and in this role there's no requirement that it resolve to anything at all. In our case, though, we promise there's an RDF file there that would let people figure out semantic relationships between the various terms that are there (e.g., a "flatfield" is some kind of "file used in data reduction").

Things still work with the 404, but it'd suck if it were there at REC time. So, is anyone actively working on getting the vocabulary in? Can we discuss it a bit, too?

The vocabulary page has since been created (no longer 404). Source is available at https://code.google.com/p/volute/ under trunk/projects/dal/DataLink/datalink-terms/ but still needs some work. -- PatrickDowler - 2014-09-17

Group member response (2014 July 18th) by MarcoMolinaro

Is this the same point on which you already asked (or was it François? someone else?) for contributions in terms of which predicates this vocabulary should contain as a minimum set? It was some time ago, I don't remember properly. I think we need it if we want client applications to act smart upon links; leaving it totally free to the providers can make this field quite useless.

Can we build on the existing ivoa.net/rdf vocabularies? E.g. could the 'flatfield' example fit the obs.calib.flat@en UCD vocab concept?

(BTW: should the link of the vocabulary point to http://www.ivoa.net/rdf/Vocabularies/vocabularies-YYYYMMDD/datalink or maybe http://www.ivoa.net/rdf/datalink/YYYYMMDD/datalink or similar?)


Page 16. 3.2.7 content_type and content_length In the case the link is a pointer to an ad-hoc service, it may happen that content_type and content_length cannot be defined before calling with the specific input params chosen by the user. I'm thinking of a service that generates images on the fly, where depending on the input params the resulting image may be very different in size, and its format may be png, jpeg or fits. Which values should content_type and content_length have in these cases? Blank?

Author Response(2014 July 17th) by MarkusDemleitner

Yes to blank/null. I'd argue that is implied in the required=no in Table 1. I seem to remember there once was prose making this a bit more explicit in previous versions; I'm not sure how much I miss it now.

Page 16. 3.2.8 content_length I would use unit="Kbyte", much more practical and user-friendly.

Author Response(2014 July 17th) by MarkusDemleitner

This is a protocol, and hence users will not usually see the raw table, and hence the unit chosen doesn't really matter; it's up to the clients to format and display this information, if at all. Except with Kbyte we wouldn't leave the realm of 32-bit integers quite as quickly.

Which made me notice we don't define the type of content_length yet. I think we should at least make a recommendation. My first choice would be "long", which in VOTable is a 64 bit integer, and unit="byte" will do fine then [quick: how many 2014 hard drives can you fill before VOTable longs warp over with the number of bytes stored? Assuming one hard drive weighs 100g, express the mass of that storage cluster in solar masses].

If people are worried about interoperability of such longs and were to advocate int, I'd say unit should be kbyte (decimal prefix) with commercial rounding or so.

For float, it wouldn't really matter and I'd go for byte again.

So, which would it be?


Group member response (2014 July 18th) by MarcoMolinaro

I'd vote for long/byte. If we're to use kilobytes for the unit, however, we have to decide between: kbyte and Kibyte, at least to follow VOUnits (since it's a recommendation).

Clarified that the content_length column is datatype="long" and unit="byte". We had the same discussion about units in the TAPRegExt (chose byte as specified by VOUnits) and all the same arguments apply. -- PatrickDowler - 2014-09-17
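The range argument behind this choice can be worked out in a few lines. This is only an illustrative calculation, assuming VOTable "int" is a 32-bit signed integer and "long" a 64-bit signed integer, and taking kbyte as 1000 bytes (decimal prefix, as suggested above).

```python
INT_MAX = 2**31 - 1    # largest value of a 32-bit signed integer
LONG_MAX = 2**63 - 1   # largest value of a 64-bit signed integer


def largest_size_in_bytes(max_value, bytes_per_unit):
    """Largest content_length, in bytes, representable for a type/unit pair."""
    return max_value * bytes_per_unit


# int with unit="byte": just under 2 GiB, too small for many datasets.
print(largest_size_in_bytes(INT_MAX, 1))
# int with unit="kbyte": about 2.1 TB, at the cost of rounding.
print(largest_size_in_bytes(INT_MAX, 1000))
# long with unit="byte": about 9.2e18 bytes, with no rounding needed.
print(largest_size_in_bytes(LONG_MAX, 1))
```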

Page 18. Table 2: Error Messages I do not think a NotFoundError may be taken as an error, but as a zero results response (as it is the case for most DAL services) Moreover, the zero response result may allow the inspection of the number and nature of the rows of the VOTable, in the case this response is always the same for any ID.

Author Response(2014 July 17th) by MarkusDemleitner

With not-found situations the server may want to add some explanation ("This identifier is not from this site" versus "We seem to have lost this file"). We should at least provide it with a means to do this, hence the NotFoundError.

Whether it's a good idea to mandate at least one row per ID (up to the match limit) and have errors in every case may not be quite as clear-cut. I have to say I'm on the side of one row per ID, but I don't have terribly strong arguments for that. Well, of course there's the general rule that silent failures are bad. Except when they aren't and silent failures are what preserves what's left of the user's sanity. Hm. No easy answer.

Note from PatrickDowler: In the previous version these messages were changed from *Error to *Fault (following VOSpace style); I failed to note this change in the change log, but it was in the first PR. A typical implementation of this would be to declare exceptions with these names and in Java (at least) Error has a very specific meaning such that declaring a normal application exception with that name was very wrong.

Answer (2014 July 21st) by JoseEnriqueRuiz

Ok, though some could argue that DAL services in general do not provide this kind of message for "zero records found" responses in multiple-valued params queries. I guess they simply skip to the next value.

If I follow your arguments, I would say we could have different explanations for errors found when creating different links for the same ID. (i.e. some services not designed to work with a specific dataset)

The purpose of having at least one row in the links table for each input ID is to differentiate between those that failed to generate any links and those that may have been skipped because MAXREC was reached (can be multiple ID values in the request). Given that the links capability input is specific identifiers, it isn't a normal query/search result where nothing is a possible answer: in this case not finding an ID means an error or the service hit the limit. -- PatrickDowler - 2014-09-17
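A client-side sketch of the distinction described above: with at least one row per processed ID, an identifier that has no row at all was presumably cut off by the record limit, while an identifier whose rows all carry an error failed. The column names `ID` and `error_message`, the helper name, and the sample data are assumptions made for this illustration.

```python
def classify_ids(requested_ids, rows):
    """Map each requested ID to 'ok', 'failed' or 'no rows'.

    rows is a list of dicts keyed by column name (assumed columns:
    'ID' and, for error rows, 'error_message').
    """
    rows_by_id = {}
    for row in rows:
        rows_by_id.setdefault(row["ID"], []).append(row)

    report = {}
    for identifier in requested_ids:
        id_rows = rows_by_id.get(identifier)
        if id_rows is None:
            # Not mentioned at all: likely skipped once the limit was hit.
            report[identifier] = "no rows"
        elif all(row.get("error_message") for row in id_rows):
            report[identifier] = "failed"
        else:
            report[identifier] = "ok"
    return report


# Hypothetical response to a request for three identifiers.
rows = [
    {"ID": "a", "access_url": "http://example.org/a.fits"},
    {"ID": "b", "error_message": "NotFoundFault: unknown identifier"},
]
print(classify_ids(["a", "b", "c"], rows))
```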

Page 20. bottom of the page <PARAM name="resourceIdentifier" datatype="char" arraysize="*" Is resourceIdentifier really required/mandatory for a DataLink service?

Author Response(2014 July 17th) by MarkusDemleitner

No -- and you're right, that should be made clearer at this point.

The table in 4.1 clearly states that resourceIdentifier and standardID are both optional in a service descriptor. In 4.3 (page 20) that is an example (of a registered SIAv1 service)... the custom service example only has an accessURL -- PatrickDowler - 2014-09-17

Page 21. top of the page value="ivo://ivoa.net/std/DataLink#links" /> but value provided in Page 10. is ivo://ivoa.net/std/DataLink#links-1.0

Fixed. -- PatrickDowler - 2014-09-17

Page 21. -24. 4.3 Example: Service Descriptor for an SIA-1.0 Service 4.4 Example: Custom Access Data Service

Should we add use="required" to PARAM tags describing mandatory input params?

Author Response(2014 July 17th) by MarkusDemleitner

use="required" isn't available in VOTable. And I'd argue that's not a big loss anyway, as typically relations between parameters are more complex than that ("if you give RA_MAX, you cannot give any of PIX_*"). We know how to say these complex things in PDL, and I'd hope in a future version we can add VO-DML-based PDL annotation to the PARAMs that would be able to express this kind of thing.

Answer (2014 July 21st) by JoseEnriqueRuiz

Ok, fair enough


I would add one example of ref="columnID" (other than the obs_publish_id) to one or several PARAM tags describing an input param whose value is taken from the tabular data present in

I would stress on the fact that the Service Descriptor syntax allows also providing default values, which facilitates the use for a client.

Page 24. 3rd p. 9it is related to photometric or flux calibration).

Fixed. -- PatrickDowler - 2014-09-17

Finally, a major point. I think it would be very useful to give the possibility to add a describing in detail a tabular response of a Custom Access Data Service. Self-described web services in terms of I/O params opens the window to web services interoperability, going beyond data interoperability.

Author Response(2014 July 17th) by MarkusDemleitner

I'm not sure I find this convincing -- for one, most of the services described by datalink groups probably will put out data that's not obviously tabular in nature (i.e., images and such). For two, the output column metadata in tabular data should really, really be contained in the response (as in VOTable and to some degree FITS binary), which is where the clients should get it from.

For discovering services by output table structure ("which services return normalised fluxes?"), that's admittedly not good enough, but that's a Registry problem (which I still don't consider terribly relevant to *data*link).


Answer (2014 July 21st) by JoseEnriqueRuiz

For one, in my view, this is not a reason to forbid this optional use. I could say many DataLink services will not provide links to adhoc services, and this does not forbid the use of the adhoc services description syntax in the DataLink response when it is needed. For two, yes, VOTables should be accurately self-described, though I do not see why this should go against also describing them in services as their output.

I see DataLink as a very powerful way to discover generic adhoc services not present in the VO Registry. In this sense, I find DataLink somehow related to service discovery usecases. This is why we are talking about things like [use=required] (present in VOSI-capabilities but not in VOTable), VO-DML-based PDL annotation, and descriptions of service outputs.

In my opinion, the description of a service would benefit from a syntax that also allows the description of its outputs (going beyond the human-readable text in the description field of the DataLink response), and for tabular outputs the solution is quite straightforward and simple, so why forbid it?

We have in DALI the MAXREC=0 mechanism to provide a description of service outputs, where the service is not required to execute any specific request (just a means to provide a simple hard-coded description of the tabular outputs). I guess this mechanism has been adopted and approved because there are use cases behind it, and I guess they may also be valid for DataLink.


In the same spirit, I think we should agree on an optional mechanism to provide a detailed description of the number and nature of the links given by the datalink service (rows of the response VOTable), in the case this response is always the same for any ID.

Author Response(2014 July 17th) by MarkusDemleitner

This sounds interesting and the first requirement that might necessitate a registry extension for datalink. I don't think anyone is wild about having to define one, and the document has been careful not to introduce some dependency on it, but if we collect use cases that call for it, it's probably not prohibitively hard to do, either. What use cases do you have in mind that would be solved by such a description?


Answer (2014 July 21st) by JoseEnriqueRuiz

Well, I'm not going that far (registry extension). I think this could be solved just by adopting the mechanism. Ok, all use cases are based on those DataLink services that provide the same number and the same nature of rows for any ID/ObsPubDID. These DataLink services may be seen as services that always offer the same specific pack of links for every dataset.

For example, consider three different data providers as three SIA services. One person would like to know that for the first SIA service the complementary DataLink provides a set of links with progenitors and provenance metadata, for the second SIA service the proposed DataLink service has a very different nature, providing cutouts and one specific analysis service, while the third DAL service offers only related bibliography through a different DataLink service. The different natures of these DataLink services could be known in advance, before actually calling them.

The specific nature of complementary DataLink services should not be at all restricted or categorised; just think of any potentially accessible resource on the web that could be linked, even outside the VO world: related bibliography (ADS), SIMBAD or NED objects in the FoV, non-VO services like those coming from SDSS or SkyView, or even simple doc-like HTML pages.


Clarified the introduction of service descriptor to say that it does not currently support any description of the output but this may be added in a future minor revision. We should see what GWS (PDL and/or WADL?) and DM (vo-dml) can tell us, and people who need this can prototype something (I see no harm in someone adding another GROUP in their service descriptors if they need it for an application or demo). And we shouldn't be scared to immediately start working on 1.1 -- PatrickDowler - 2014-09-17
As an aside, the specification does not forbid providers from adding additional PARAM or GROUP elements to the service descriptor, and doing so should be harmless to all but the most poorly written clients. By not saying anything about output and other "missing features" we are free to experiment with adding extra metadata to service descriptors and then showing what it can do at a subsequent interop demo. For myself, I will be looking at how to convey different accessURL(s) that support different authentication schemes and at how to tell the client that both GET and POST are supported. I feel that I can add such things to my output without making it non-compliant. -- PatrickDowler - 2014-09-25

Comment (2014 July 18th) by PierreFernique

1) General comment: Despite the introduction, the difference between the two datalink methods has not been clear to me. Both are called "datalink" and it is difficult to understand the difference between two methods with the same name. I suspect that there were various author points of view and no definitive choice. Explaining why and when to use one method or the other would be helpful for future implementors.

Author Response(2014 July 21st) by MarkusDemleitner

I'm not quite sure what you mean by "two methods" -- standalone datalink vs. service descriptor in DAL? If so, I wouldn't call that two methods, but we should obviously do a better job laying out how things are supposed to work in both access scenarios. Do you have (possibly high-level) suggestions on how we could improve the text?


Answer (2014 July 21st) by PierreFernique

I just cite the first sentence in the introduction, "DataLink defines two distinct but related data-linking mechanisms" (well, mechanisms... methods...): "service descriptor resource" and "links". After this introduction, it is not clear to me where and when we have to choose the first "mechanism", or the other one, or both. Maybe a simple example could help the reader.


Response (2014 July 27th) by FrancoisBonnarel

I am convinced that describing the two "linking" mechanisms at exactly the same level does not clarify the whole thing. Clarifying this requires some changes in the introduction. Norman also pointed this out before the interop and I supported the idea of changing the introduction at that time. The introduction should start by invoking the {links} resource. A few examples of usage should be given. The DataLink name should be restricted to this resource. And service descriptor is ..... "Service descriptor".

The service descriptor should be introduced historically, as Pat did in his response on the list, starting from the need to declare the DataLink service in a DAL query response. A few other examples could be given for service descriptors to illustrate where they are useful.

2) Technical questions: Concerning the second method (with the PARAM definitions by GROUP): The possibilities opened by this method are very promising. At first sight, it seems simple and flexible, and very useful for building associated user forms on the fly.

a) However, I do not see how to describe REST links, or any URL for which the prefix depends on the parameter values. Maybe I'm wrong, but it seems that this method can only build URLs on this template: http://static_url_prefix?param1=val1&param2=val2... However, it is quite common that some servers provide their collections on the basis of this URL template: http://host/variable_path/datasetID (VOSpace links?). It would be great if such URLs could also be described by this datalink method.

Group member Response (2014 July 18th ) by MarcoMolinaro

I think this is a good point, and maybe it doesn't affect only Datalink, query interfaces from most of the protocols work in the HTTP-GET way. I don't see how this can be answered now, but maybe we can take it into account for future revisions at a higher level than the simple protocol.

Author Response(2014 July 21st) by MarkusDemleitner

Hm. I'm not wild about this -- much as I appreciate good-looking URLs, I think allowing this is not going to make a better standard. In particular, I'd claim that even if people actually ran such services already, they'd have to write wrapper code anyway in order to make it datalink compliant. That wrapper code would have to do the conversion from IVOID (which should be what's passed in through ID) to their local datasetID, and then going from HTTP parameter to URL part should be straightforward (i.e., of the order of two lines of code). I don't think a complication of the standard is warranted there.
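For illustration, the kind of wrapper logic described here might look like the sketch below. Everything specific in it is an assumption: the URL template, the `example.org` host, and in particular the convention that the local dataset ID is carried in the query part of the IVOID (after the '?'); a given data center may map identifiers quite differently.

```python
from urllib.parse import quote

# Hypothetical template for a local REST-style archive URL.
LOCAL_TEMPLATE = "http://example.org/data/{dataset_id}"


def local_id_from_ivoid(ivoid):
    """Extract the local dataset ID, assumed to follow the '?' in the IVOID."""
    _, separator, local_id = ivoid.partition("?")
    if not separator or not local_id:
        raise ValueError(f"no local dataset ID in {ivoid!r}")
    return local_id


def rest_url_for(ivoid):
    """Turn the ID parameter value into the path-based URL of the archive."""
    # Keep '/' so the dataset ID can span several path segments.
    return LOCAL_TEMPLATE.format(dataset_id=quote(local_id_from_ivoid(ivoid), safe="/"))


print(rest_url_for("ivo://example.org/archive?obs/123"))
```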


Answer (2014 July 21st) by PierreFernique

I have to say that I'm not a very keen supporter of the REST paradigm, but as the document seems to follow this recommendation (introduction page 5), and as VOSpace is RESTful, it is surprising that links to REST servers will not be supported "natively" by the protocol. And in any case, a wrapper is generally a "last resort" solution, rarely implemented directly on the server side, and very badly maintained in the long term (my experience).

More generally, as I partially said, the GROUP mechanism does not take into account any variable URL prefix (before the '?'), nor HTTP parameters without any value (flags), nor combinations of values in the same parameter. This will be a potential issue.

A few existing URLs possibly used in a datalink response for which the GROUP mechanism won't work (red fields)...

1) any VOSpace URLs 2) Dedicated "static" HTTP trees

http://www.cadc.hia.nrc.gc.ca/data/pub/HSTCA/u21x0102t_prev.jpg

Note: The above URL is to a custom CADC service for archive data delivery, which has nice pretty URLs. In our DataLink prototype you would see this URL as an access_url value in the links table and would not have to construct it. -- PatrickDowler - 2014-09-22

http://alasky.u-strasbg.fr/SDSS/DR9/color/Moc.fits

3) A lot of cone search servers:

http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/sia/CFHT/query?POS=83.633083,22.0145&SIZE=0.2333&FORMAT=image/fits&VERB=2

http://wfaudata.roe.ac.uk/ukidssdr9-siap/?POS=83.633083,22.0145&SIZE=0.233&FORMAT=image/fits

http://vo.imcce.fr/webservices/skybot/skybotconesearch_query.php?&-ra=83.633083&-dec=22.0145&-size=28.0,28.0&-mime=votable&-out=basic&-loc=500&-search=Asteroids+and+Planets&-filter=120+arcsec

4) Dedicated services:

http://alasky.u-strasbg.fr/footprints/cats/vizier/B/DENIS?product=MOC&nside=512

5) TAP services :

http://geadev.esac.esa.int/tap-dev/tap/run/tap/sync?REQUEST=doQuery&LANG=ADQL-2.0&QUERY=SELECT+TOP+1000+*+FROM+gums.mw+WHERE+1%3DCONTAINS%28POINT%28%27ICRS%27%2C+alpha%2C+delta%29%2C+CIRCLE%28%27ICRS%27%2C+80.89417%2C+-69.75611%2C+0.2333+%29%29

I think all of these (3-5) can be described by the service descriptor. Of course, the descriptor would not likely contain enough info to actually write an ADQL query without doing some tap_schema queries first, but it could convey allowed values of LANG and limits on MAXREC... -- PatrickDowler - 2014-09-25


Author response (2014 July 27th) by FrancoisBonnarel

Service descriptors and RESTful services: Pierre is right that the current version of the description doesn't allow describing all sorts of services, for example RESTful interfaces. Including metadata in the service descriptor to describe template roots or paths in RESTful URLs is probably out of the scope of the first version (but maybe not of the second one, as Pat said in his response). However, right now we should at least add a "URL type" PARAM to the descriptor, beside . I see at least three possible values for this param: HTTP GET, RESTful, Mixed. The current version of the draft only details the "HTTP GET" case with the "InputParams" GROUP.

Describing RESTful service invocation requires more than just template URLs with variables in the path. You also need to describe the semantics of the HTTP verbs, what they do, and what the input is, and that is not always a parameter or just the path and GET. In VOSpace, there are several capabilities and they take different xml documents as input. I don't see this being feasible without the client grok'ing the standardID at this point in time. As above, we should wait for 1.x (priority is to get data discovery -> access to work, and that means TAP|SIAv2 + links|descriptor + AccessData) and see what GWS can do for us. -- PatrickDowler - 2014-09-17

b) Also, I did not find anything concerning HTTP encoding requirements. Maybe a short paragraph could avoid some stupid future bugs (param=val must be correctly HTTP encoded..., the & character must be used as parameter separator).

Group member Response (2014 July 18th) by MarcoMolinaro

I'd limit it to recommending proper HTTP encoding, that should be sufficient for HTTP GET requests ('&' is not the only possible separator, ';' can also be used for nested GET...I don't see where we can use this latter case, but...)

Generic recommendations about HTTP-level stuff were removed from the spec(s) in an earlier WD because everyone is supposed to just know and do this right now (e.g. there is plenty of good advice and best practices available, all we can add is bad advice :-); I am reluctant to add this back in again. -- IVOA.PatrickDowler - 2014-09-17
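For readers unfamiliar with the encoding issue raised in this thread, the following sketch shows what "proper HTTP encoding" buys: values containing '&', '=' or '/' survive a round trip through a query string when a standard library routine does the escaping. The parameter names are taken from the example URLs on this page; the shortened ADQL text and the helper name `roundtrip` are only illustrative.

```python
from urllib.parse import parse_qs, urlencode


def roundtrip(params):
    """Encode params as a query string, then decode them again."""
    query = urlencode(params)
    # parse_qs returns a list per name; each name occurs once here.
    return {name: values[0] for name, values in parse_qs(query).items()}


params = {
    "FORMAT": "image/fits",
    # '=' and spaces inside a value would break a naively built URL.
    "QUERY": "SELECT TOP 10 * FROM gums.mw WHERE 1=CONTAINS(a, b)",
}
print(urlencode(params))
print(roundtrip(params) == params)
```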

c) One point concerning "blank or missing value". In my experience, some servers have no ALL option for a given parameter: the absence of the HTTP "param=xxx" parameter simply means "ALL" for this constraint. Removing the "&param=" parameter from the URL in case of a blank or missing value can be a solution for managing these cases.

Author Response(2014 July 21st) by MarkusDemleitner

What you're saying is that we should say, at some suitable point:

Parameters not used in a service invocation should not be passed at all, rather than with empty values.

I'd somewhat have expected that to be implied -- do you really think people would pass what in effect are empty strings? I'm not opposed to the prose, just a bit surprised that it might be necessary.

Answer (2014 July 21st) by PierreFernique

I totally agree that an HTTP parameter with an empty string is rare, but mainly because, in this case, the parameter is just fully removed ("&param=" is removed). I suggested being able to support this common case.

http://masthla.stsci.edu/hla/Footprints/aptfootprint/Footprints.svc/Footprints?POS=83.633083,22.0145&SIZE=0.4666666666666667,0.4666666666666667&INST=&LEVEL=Best
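The rule proposed above ("parameters not used in a service invocation should not be passed at all") can be sketched on the client side as follows. The helper name `build_query` and the `example.org` URL are hypothetical; the parameter names mirror the footprint URL above, where INST is sent with an empty value.

```python
from urllib.parse import urlencode


def build_query(base_url, params):
    """Build a GET URL, leaving out parameters that have no value."""
    # Drop None and empty strings instead of sending "name=".
    used = {name: value for name, value in params.items() if value not in (None, "")}
    if not used:
        return base_url
    return f"{base_url}?{urlencode(used)}"


print(build_query(
    "http://example.org/footprints",
    {"POS": "83.633083,22.0145", "INST": "", "LEVEL": "Best"},
))
```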


3) My wish. In IVOA we frequently use VOTable as a container (SIA, SSA, TAP, ObsTAP, and now Datalink), but without a magic code or any signature to recognize that this VOTable is a Datalink result, or a TAP result or whatever. Concretely, it is a nightmare for clients that support several of these protocols simultaneously. I recommend introducing in our VOTable protocols (at least the new protocols) a signature, which could be a simple INFO tag (ex: ).

Group member Response (2014 July 18th) by MarcoMolinaro

I agree on this wish. S*AP, TAP (e.g.) are interrogated and answer directly, but Datalink opens up the scenario. In principle a client will always be able to know in advance what type of service it is querying (Datalink provides standardID), but a specific signature can turn out to be useful.

Author Response(2014 July 21st) by MarkusDemleitner

Although I'm always a bit nervous when we put the same information in two places (in this case, the content-type header of the HTTP response and then later the HTTP payload), I think I like that a lot, mostly because I expect people might store datalink responses and re-use them later, when there's no HTTP header any more.

So, I'd say we should have a new subsection in Sect 3 (I'd say it should become 3.2, but if people are worried about renumbering subsections at this late stage, 3.5 would be ok with me, too):

3.2 Protocol declaration

To help clients dispatch between various internal recipients of VOTables even in the absence of HTTP header information, datalink responses serialised in VOTables MUST contain an INFO element with a name SERVICE_PROTOCOL and a content of "datalink" as an immediate child of the VOTable element; the strings are interpreted case-sensitively. Services SHOULD declare the version of this document they conform to in the value attribute of the INFO element.

VOTable responses from datalink 1.0 services would thus contain:

<INFO name="SERVICE_PROTOCOL" value="1.0">datalink</INFO>

(This is fashioned after SSAP) What does everyone think? Should something like this get into VOSI? The "dispatch according to content"-thing appears to be something quite frequent, and offering a general, non-heuristic method to do it sounds like good sense to me.
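For illustration only, here is a minimal sketch of how a client might dispatch on the proposed declaration. It assumes the INFO element looks exactly like the example above; the function name `service_protocol` and the sample document are hypothetical, and the proposal itself is still under discussion on this page.

```python
import xml.etree.ElementTree as ET


def service_protocol(votable_xml):
    """Return (protocol, version) from a SERVICE_PROTOCOL INFO element, or None.

    Only immediate children of the root VOTABLE element are inspected,
    as the proposed text requires.
    """
    root = ET.fromstring(votable_xml)
    for child in root:
        # Strip any XML namespace so namespaced VOTables are handled too.
        tag = child.tag.rsplit("}", 1)[-1]
        # Name comparison is case-sensitive, as in the proposal.
        if tag == "INFO" and child.get("name") == "SERVICE_PROTOCOL":
            return (child.text or "").strip(), child.get("value")
    return None


# Hypothetical, heavily abbreviated response document.
sample = """<VOTABLE version="1.3">
  <INFO name="SERVICE_PROTOCOL" value="1.0">datalink</INFO>
  <RESOURCE type="results"/>
</VOTABLE>"""

print(service_protocol(sample))  # ('datalink', '1.0')
```

A client supporting several protocols could switch on the first element of the returned tuple and fall back to its existing heuristics when the function returns None.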

While it is a great idea for every document to say "created by capability <standardID>", it belongs in DALI-1.1 or, alternatively, vo-dml would provide a way of saying "these records are instances of..." which is much the same kind of statement. Not adding this. -- PatrickDowler - 2014-09-17

* (Member Comment - DaveMorris, email 2014-09-26) I like the idea of declaring what type of service created the table, but it would be better to just use the URI for the standard.

    <INFO name="SERVICE_PROTOCOL">ivo://ivoa.net/std/DataLink#links-1.0</INFO>

That enables us to use the same mechanism to refer to new types of services we haven't invented yet.

    <INFO name="SERVICE_PROTOCOL">http://wiki.ivoa.net/twiki/bin/view/IVOA/NotInventedYet20140926</INFO>

Question (2014 July 21st) by JoseEnriqueRuiz: If "empty value" means "ALL values", one question arises here: how does one query for those datasets with param=empty/NULL/blank? I do not know if there are use cases for this :-/

It is true that HTTP query params do not provide any obvious way to interpret something as "all" vs NULL vs "" (zero-length string), but I don't think we can solve this here. This is probably a query problem to be solved by (e.g.) SIAv2 query capability and if we want to write that solution/spec once then it belongs in DALI (or the next higher authority: VOSI). This certainly is not a problem for the links capability. -- PatrickDowler - 2014-09-17

Answer (2014 July 21st) by PierreFernique

"no HTTP": That's the point. Also, the client can have an HTTP API which does not provide easy access to the HTTP header fields. Markus's INFO tag: sounds good.

-->

* Editor Response (2014 July 21st) by PatrickDowler

I think the idea of describing service output in the DataLink service descriptor is interesting and it is something I thought about. The current use cases revolve around two things:

1. The {links} response solves a variety of discover->download issues such as (i) multiple files per dataset, (ii) alternate representations like previews, (iii) related resources (other sources of metadata, services that can act on the data).

2. The service descriptor was originally conceived to solve the problem of getting from a discovered ID (eg in a TAP or SIAv2 query response) to the {links} resource itself without having to resolve the ID via registry lookup....

We quickly realised that with minimal additional metadata we could use the same mechanism to go from the discovered values to any service that took them as input (the 3 service params and the inputParams)...

We also realised that the {links} response, since it is a votable, could also use service descriptors to describe services (typically lower level access services). Of course, one can put such links directly in the data discovery response if the cardinality of their discovered records and services matches (eg if one identifier in the discovery response can be used to call a service, then you can tell the client about it). That's the whole thing about links: you can add them anywhere you have an identifier that can be used someplace else! But that was not new spec, just new usage.

But, this is all aimed (currently) at forward-linking and how to describe the call to the service. We have not tried to describe what the service will actually do nor the response it might create. For now (1.0) I don't think we need it for the use cases at hand. Further, since we probably do want to add it later, I feel strongly that we should not add any simplistic form that we might regret. As has been mentioned elsewhere, PDL, VO_DML, and several other new-ish things cover some common ground and we should take the time to consider them and prototype.

I think that means adding a description of the output in DataLink-1.1.

* September 10th 2014: a compilation of Discussion among authors on top of FrancoisBonnarel remarks (July 27th)

> 2 ) service descriptors and RESTFULL services : Pierre is right that
> the current version of the description doesn't allow to describe all
> sort of services, for example REStfull interfaces. Including in the

Answer by MarkusDemleitner

Well -- that's for the service descriptor, and this is obviously only relevant when a descriptor is embedded in DAL responses. In datalink responses you're free to pass whatever URLs you fancy.

For DAL responses, the things described are in all likelihood post-processing services (e.g., cutout, recalibration), and I doubt many services exist that do these kinds of things with a URL schema violating what the service descriptor can do.

On the other hand, you're explicitly free to describe a datalink service itself in your service descriptor -- if your data structure is sufficiently complex, that's what I'd say you should do; that's what datalink is for: decoupling the service interface from the actual representation.

Which is to say: I don't believe we should make things more "flexible" here. Flexibility is a liability for implementations, and a liability for interoperability. I think we should have a much stronger case before adding features, much less just announcing them.

Answer to MarkusDemleitner by LaurentMichel

I agree that it is not the right time to start a discussion about a general mechanism for building URL templates, but the draft cannot ignore the existence of RESTful encoding. The strong reason for that is that REST is used by at least 2 VO standards likely to be involved in datalink responses: VOSpace and UWS. This point could be addressed either by adding a URLType as FB proposed (see above, July 27th) or by making the standardID mandatory even for free services. I prefer this second solution since it avoids possible inconsistencies with URLType.

> 3 ) Jose Enrique pointed out a difficulty with the "service_def" paragraph.
> The potential inconsistency (or redundancy) between the access_url in
> the {links} resource response table and the accessURL in the service
> descriptor.
> The accessURL in the service descriptor should be generic.

Answer by MarkusDemleitner

I'm not sure I understand what you're saying here -- I believe we essentially have three options:

(1) force access_url (table)==accessURL (param)
(2) accessURL given, access_url NULL
(3) access_url given, accessURL not in the service descriptor within a datalink response.

Although it sucks to have the same information in two places, (1) from my implementation experience is the most straightforward, and I believe we should simply mandate that. (2) and (3) I could live with. What I'm firmly against is implying that there may be situations in which access_url!=accessURL -- that way lies madness.

> By the way, as a kind of shortcut, the "service descriptor resource"

Answer by MarkusDemleitner

Hm -- I don't like the "By the way" here, even in the introduction. I agree, though, that saying there are two distinct but related data-linking mechanisms probably is confusing.

>However right now in the descriptor we should at least add a "URL type" PARAMETER in the descriptor, beside . I see at least three possible
>values for this param: HTTP GET, RestFull, Mixed. Current version of the draft only details the "HTTP GET" case with the "InputParams" GROUP.

Answer by MarkusDemleitner

As I said, I'm against over-generalising the protocol, and in particular, building in things that appear to claim we support something that might come in a future standard -- or might, as so often in VO standards, not.

-> Discussion LaurentMichel / MarkusDemleitner

Laurent: I still don't think that supporting RESTful URLs can be considered an over-generalisation.

Markus: Hm -- it's URL templating, something we've never really tried in the VO as far as I'm aware, and something that has quite a few opportunities to mess up. After all, there are many, many URL schemes, and most I've seen are fairly funky...

Laurent: I insist a little bit just to keep open the possibility to quickly address VOSpace records with Datalink.

Finally, replacing http://server/service?ID=paramValue&action=download with something like http://server/service/$ID/download where $ID is replaced with paramValue does not look all that complex.

I've no ambition about templating any sort of URLs but just GET-HTTP and REST.
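To make the two URL styles under discussion concrete, here is a small sketch. The query-string form matches what the draft describes; the path-template form is purely hypothetical and borrows Python's `string.Template` placeholder syntax ($NAME) as an assumption, since no templating syntax is defined anywhere in the draft. The function names are made up for the example.

```python
from string import Template
from urllib.parse import quote, urlencode


def build_get_url(access_url, params):
    """HTTP-GET style: all parameters go into the query string."""
    return access_url + "?" + urlencode(params)


def build_path_url(template, params):
    """Hypothetical REST style: substitute $NAME placeholders in the path.

    Each value is percent-encoded with no safe characters, so a value
    containing '/' cannot introduce extra path segments.
    """
    encoded = {name: quote(str(value), safe="") for name, value in params.items()}
    # substitute() raises KeyError if the template names a missing parameter.
    return Template(template).substitute(encoded)


print(build_get_url("http://server/service",
                    {"ID": "paramValue", "action": "download"}))
# http://server/service?ID=paramValue&action=download

print(build_path_url("http://server/service/$ID/download",
                     {"ID": "paramValue"}))
# http://server/service/paramValue/download
```

Even this toy version has to take a position on quoting and on what happens when a placeholder has no value, which are the kinds of detail a real specification would need to pin down.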

Markus: Hm -- the devil usually lies in the details -- for instance, how is $ID-pathcomp to be interpreted? $ID_pathcomp? Should the value of the ID param still be passed as an HTTP parameter? What's to happen if a parameter referenced in the URL is not a string? Are there any special rules for quoting these things? And so forth.

Laurent: You are right, that is why the URL building mode must be specified somewhere else. This point could be tricky and cannot be sorted out for this document. That is why I suggest (with François) to reserve a field specifying how the URL must be constructed and to work in GETHTTP mode until a proper way to build REST URLs is specified.

Markus: Meaning: If we write this into a standard (and I admit there seems to be a use case), we should have implementations first to see what can go wrong.

Laurent: A datalink pointing onto a VOSpace, I can do that.

Laurent: The document must have a little room for this possibility even if the way to do the URL encoding is not achieved in the first version.

Markus: Could you live with the language on telling clients to ignore services with an empty accessURL?

Laurent: In a general way, I'm suspicious about the idea of triggering a client action from the absence of a parameter.

Markus: ...but I'd say in this case there's not much that can go wrong -- implementors just need to be aware that empty access URLs may turn up in the future. Given that they are in no risk of erroneously operating a service they have no access URL for, at least there won't be silent failures either way.

Laurent: right

Markus: That way, we can later use that as a sentinel that more complex URL building mechanics will be required, and nothing will break.

Laurent: I definitely prefer to say somewhere that the URL is HTTP-GET, REST or something else. From an implementer's point of view, the right place to state that is the standardID. It is already used to know how to build URLs for VO services and its scope can be extended to non-VO URLs. If we agree on that, we can postpone the definition of the new supported standardIDs and DATALINK will work as it is already defined by the 1.0 standard.

Markus: I wouldn't have a problem with that, either. But someone would have to write some prose urging client authors to check the standard id (case-insensitively, and possibly ignoring minor versions if they don't care).

Laurent: As far as I know, there is no standard ID referring to an external URL (e.g. ivo://ivoa.net/nostd/url). I've no idea about what is legal to do here. As I said, the first version of the protocol would just have to mention the role of the extended standardID and state that GETHTTP is assumed by default. Meanwhile I'll have a look at a possible formalism for non-standard standardIDs.

Back to initial MarkusDemleitner answer to FrancoisBonnarel for URLType proposal

If we believe there's an actual place for URL templating, then we should say (provided we go for access_url==accessURL above): "If accessURL is NULL or missing, clients must ignore the service definition. This is an extension mechanism that might, for instance, be used for more complex ways of URL generation in future standards."

If we really went for URL templating later, we could then say something along the lines of "for your crazy URL scheme, define accessURL-template and make accessURL null. Then do $FOO in your template yadda...".

But while we're talking: I have a really bad feeling about passing this on without more client prototypes, at least. We have something in SPLAT that we'll need to review against the current spec, in particular as regards telling datalink from data processing services (I seem to remember Margarida had an issue there, but she's on vacation right now). Who else has clients running? Non-trivial ones, even? If nobody has, how can we make them happen?

[Disclosure: Shortly before Madrid, I've started a bit of javascript that would provide a SAMP-enabled datalink+data processing client in the Browser; but when it didn't finish for Madrid for this reason and that, I let it slip again. If someone felt like this is a good idea, I'd gladly pull it out again].

Answer by PatrickDowler to the whole discussion

-- 1. parameterised URLs are not going to help you use VOSpace -- there is a lot more to RESTful service invocation than URL templates.

-- 2. the more important thing we are missing is any way to describe authentication requirements as auth pretty much always means different URLs (usually different scheme and/or path).

These two points are related as they both come up when one is trying to deal with RESTful web services: it is with restful services that the access_url in the table will usually contain something different (longer) than the accessURL in the service descriptor. One solution would be to have the {links} table contain either an access_url or a service_def but not both; in the latter case, the client would have to use the service_def to find a descriptor resource and construct the URL. This would get rid of a redundancy that leads to the conflict in #3 below, and that is one that comes up with REST services like VOSpace.
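The "either an access_url or a service_def but not both" rule can be sketched as a small client-side check. This is only an illustration: rows are modelled as plain dicts keyed by column name, and the function name and return labels are invented for the example.

```python
def classify_link(row):
    """Classify one {links} row under the 'access_url XOR service_def' rule.

    Returns ("direct", url) when the row can be fetched as is,
    ("service", ref) when the client must look up the referenced service
    descriptor and construct the URL itself, and ("none", None) when the
    row carries neither. Empty strings are treated like NULL.
    """
    access_url = row.get("access_url") or None
    service_def = row.get("service_def") or None
    if access_url and service_def:
        # The redundancy that leads to access_url != accessURL conflicts.
        raise ValueError("row has both access_url and service_def")
    if access_url:
        return "direct", access_url
    if service_def:
        return "service", service_def
    return "none", None


rows = [
    {"ID": "ivo://example.org/data?1", "access_url": "http://example.com/file.fits"},
    {"ID": "ivo://example.org/data?1", "service_def": "cutout"},
]
for row in rows:
    print(classify_link(row))
```

With this rule a client never has to decide which of two URLs to trust; it either follows the link directly or goes through the descriptor.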

As for REST services, this is quite a complex issue and I think we can't do much better than give the descriptor with standardID and accessURL (for that capability) and rely on the client knowing all that the standardID entails. Even then, it isn't so simple...

For vospace, one would have to convey the standardID and accessURL for the {nodes} resource and the standardID and accessURL for the {transfers} resource. These have different semantics and the client needs both in principle. So, in this case maybe the right way to use "datalink" is to put two rows in the {links} table, each with the same vos URI, and with service_def values to indicate the two capabilities being described. One could only describe the normal VOSpace params (eg for {nodes} that is limit, uri, detail, and view, iirc). There is no feasible way to convey the calling semantics.

Now, to actually describe a REST service, one needs something fancier, eg WADL or things like that as discussed in GWS. The problem with those, and with PDL, is that we cannot embed them inside VOTables. We could/should discuss (soon) whether VOTable needs to allow a kind of resource that can embed such descriptors, or such things need to be designed to be embeddable in VOTable in a formal way... but that is not something for DataLink-1.0 ... for now, we should give advice and improve the vospace example.

It would be nice to be able to describe multiple capabilities in a single service_def. One thing that is missing from everywhere except VOSI capabilities is a way to describe the authentication requirements of an interface. In VODataService, I recall that the cardinality is:

1 service ... N capability (standardID) ... M interface (accessURL)

whereas our service descriptor only supports:

1 service ... 1 capability (standardID) ... 1 accessURL

and our accessURL doesn't have any additional metadata; the biggest thing missing for practical use is info about authentication, but we'll have to live with that until 1.1

An addendum regarding URL templating: the point of templating is (generally) to let clients construct a URL which they know will go direct to a resource. For that you need things like WADL, which aren't necessarily pretty, and which introduce yet another document and yet another technology. However you don't necessarily need templating, if your single/starting {link} URL produces a document which points to the other ones. If a {link} document says "_here_ is the image, here is the background, ..." then you don't need templates -- the client just needs to 'follow its nose', in exactly the same way that a human does on an HTML page. This is the 'Linked Data' idea, and is easy. -- NormanGray, 2014-10-21

Comments from TCG member during the TCG Review Period: 2014-10-01 - 2014-10-29

WG chairs or vice chairs must read the Document, provide comments if any and formally indicate if they approve or not the Standard.

IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.

TCG Chair & Vice Chair ( Séverin Gaudet, Matthew Graham )

Requires a second reference implementation and any implementation validators to be described above. Otherwise approved.

-- SeverinGaudet - 2014-10-02

Applications Working Group ( Pierre Fernique, Tom Donaldson )

The document is reasonably clear and readable. I do have a few concerns that I'd like to discuss before approval, recognizing with apologies that some of these would have been more constructive earlier in the process.

Reference Implementations

Creating reference implementations helps ensure that a spec is unambiguous, complete, practical and interoperable. With those benefits in mind, I'd love to see some enhancements to the reference implementations and/or the descriptions of them with respect to this spec.

Are the CADC, DaCHS and SPLAT implementations official reference implementations? That wasn't entirely clear to me.

For the reference implementations, give clear examples of them demonstrating the Motivating Use Cases from the document.

The best way to demonstrate interoperability and practicality is with a client reference implementation that can consume more than one of the Datalink reference implementations (e.g., CADC and DaCHS) for as many of the Motivating Use Cases as practical. Is there such a client now? I don't think we require a test suite to go along with a reference implementation, so such a client would help achieve some of that testing to build confidence that the reference impl. has sufficient coverage.

Efficient Access to Certain Products

In Heidelberg we discussed mechanisms to provide a more direct and efficient access to certain products, with the primary example being preview images. Basically, given a data discovery response, a client should be able to quickly know URLs to preview images for any or all of the result rows. Ideally these previews could be offered in various sizes to suit a client's needs (e.g., thumbnail for showing many at a time, medium for showing several on a page, and large for a browser-friendly higher resolution display of a single preview). Scrolling through the results of a query in the MAST data discovery portal shows how a client might make use of this: http://mast.stsci.edu

I'm disappointed that this didn't end up being one of the motivating use cases. I realize it may be too late in the game to advocate for this again, but in reading the discussions above about templating URLs, etc., it seems like there may be room to work through some discussion on this.

What I'm actually hoping is that the use case I mentioned is actually supported by this document, and that I just can't figure out how. If someone could describe such an example, that would be wonderful.

-- TomDonaldson - 2014-11-05

Response: Efficient Access to Products

The link from a record in the response to a preview can be conveyed to clients using a service descriptor.

1. set an ID attribute on a field with an identifier, e.g.: <FIELD ID="IDVALUE" ... />

2. include a service descriptor for a custom "service" that returns the preview in the discovery response, e.g.:

<RESOURCE type="meta" utype="adhoc:service" ID="previews">
    <PARAM name="accessURL" datatype="char" arraysize="*"
           value="http://example.com/previews" />
    <GROUP name="inputParams">
        <PARAM name="IDENT" datatype="char" arraysize="*" value=""
               ref="IDVALUE" />
        <PARAM name="SIZE" datatype="char" arraysize="*" value="">
            <VALUES>
                <OPTION value="small" />
                <OPTION value="medium" />
                <OPTION value="large" />
            </VALUES>
        </PARAM>
    </GROUP>
</RESOURCE>

3. This tells the client they can create URLs of the form: http://example.com/previews?IDENT=<value from the IDVALUE column>&SIZE=<small|medium|large>

It does not tell them what this service "means". That's hard (maybe in a future version). You can add UCDs to the input params to help say what they mean... The values for SIZE are arbitrary; here I chose words, but one could have made that param datatype="integer", or something else. Once a few places actually do that and if we can agree on what the SIZE parameter should be, then that service could be a standard and thus get a standardID to describe it... For a custom service, you probably want to adorn that service descriptor with some additional descriptive text (INFO element?) as allowed by VOTable.
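As a rough sketch of what a client could do with such a descriptor, the following parses a descriptor like the one above and builds a preview URL for one discovered row. The query parameter names are taken from the PARAM names in the descriptor. The helper names (`parse_descriptor`, `build_url`), the row layout (a dict keyed by FIELD ID) and the identifier value are assumptions made for the example; a real client would more likely go through a VOTable library.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Same structure as the descriptor above, embedded so the sketch runs alone.
DESCRIPTOR = """<RESOURCE type="meta" utype="adhoc:service" ID="previews">
  <PARAM name="accessURL" datatype="char" arraysize="*"
         value="http://example.com/previews"/>
  <GROUP name="inputParams">
    <PARAM name="IDENT" datatype="char" arraysize="*" value="" ref="IDVALUE"/>
    <PARAM name="SIZE" datatype="char" arraysize="*" value="">
      <VALUES>
        <OPTION value="small"/><OPTION value="medium"/><OPTION value="large"/>
      </VALUES>
    </PARAM>
  </GROUP>
</RESOURCE>"""


def parse_descriptor(xml_text):
    """Return (accessURL, {param name: {"ref": ..., "options": [...]}})."""
    res = ET.fromstring(xml_text)
    access_url = res.find("PARAM[@name='accessURL']").get("value")
    inputs = {}
    for param in res.findall("GROUP[@name='inputParams']/PARAM"):
        inputs[param.get("name")] = {
            "ref": param.get("ref"),
            "options": [o.get("value") for o in param.findall("VALUES/OPTION")],
        }
    return access_url, inputs


def build_url(access_url, inputs, row, **choices):
    """Build a GET URL for one discovery row (dict keyed by FIELD ID)."""
    query = {}
    for name, spec in inputs.items():
        if spec["ref"] is not None:
            query[name] = row[spec["ref"]]  # value from the referenced column
        elif name in choices:
            if spec["options"] and choices[name] not in spec["options"]:
                raise ValueError(f"{choices[name]!r} is not allowed for {name}")
            query[name] = choices[name]
    return access_url + "?" + urlencode(query)


access_url, inputs = parse_descriptor(DESCRIPTOR)
print(build_url(access_url, inputs, {"IDVALUE": "obs-42"}, SIZE="small"))
# http://example.com/previews?IDENT=obs-42&SIZE=small
```

Nothing in this sketch knows that the service returns previews; it only knows how to call it, which is the limitation described above.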

The less efficient way to do this is to use the links response and tag the link's semantics value as #preview. It is more explicit, it allows for static URLs or service descriptors, it allows for a text description clients could display, but it is an extra call. You can send multiple ID values to the links resource in one call, so it isn't 1 extra call per discovered dataset, but it is work to implement and more things have to happen while (I assume) trying to display results. -- PatrickDowler - 2014-11-28

Response from Tom D:

Approved. I look forward to exploring the reference implementations. -- TomDonaldson - 2015-06-05

Response from FrancoisBonnarel - 2015-06-12

TapHandle (2015-06-10) is now a Datalink compatible client. Query the cadc or gavo Obscore tables in the two TAP servers and discover what is behind the "DataLink" cells.

Data Access Layer Working Group ( François Bonnarel, Marco Molinaro )

Data Model Working Group ( Jesus Salgado, Omar Laurino )

Nice standard. Just a couple of very small comments.

It would be nice to have attached a full example like the one provided by Pat at: http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=caom:IRIS/f212h000/IRAS-25um inside the examples area. It is really clear how it works when you see it.
A preliminary IVOA service discovery was already described in SSAP (theoretical services) (and later extended by S3) using a similar mechanism based on a FORMAT=METADATA call. It is not clear to me if the DAL roadmap expects to replace this SSAP approach with DataLink. The main difference is that the service description in SSAP/S3 is in the target service and not in the calling/DataLink one. The former approach could be better as you can ensure that the service description is in line with the service itself (if you put this information in the datalink service it could not catch a change in the referenced service). Maybe a detached description like this could be used in datalink too. In any case, and for historical reasons, I would add a line to include references to these previous efforts: SSAP and S3 (http://www.ivoa.net/documents/latest/S3TheoreticalData.html)

Apart from these two points, we approve the document

-- JesusSalgado - 2015-01-28

Response:

I have clarified the section on the ID parameter so that if the caller does not provide ID values the result is an empty result table; also added an example of including a self-describing service descriptor (with reference to S3) which combines with the no-ID call to give an easy way to call a service and, in this case, get a VOTable with a meta resource describing the service itself and an (empty) results resource that shows what the output would look like (inc. custom columns).

-- PatrickDowler - 2015-02-23

Grid & Web Services Working Group ( André Schaaff, Brian Major )

Approved -- AndreSchaaff - 2015-01-12

Registry Working Group ( Markus Demleitner, Pierre Le Sidaner )

* Sect. 1: "The current version provides no way to describe the output of a service, but this may be added in a future (minor) revision of this specification." -- this is potentially of major importance in the use of the standard. Is there no way to have some mechanism already in v1.0? Even some information on MIME type and type (image, spectrum, catalog) would help, so that the client is able to plot the output or send it by SAMP.

Response:

Yes, this is really quite hard and even simple things like telling the MIME type are not straightforward. What people can do is simply add additional PARAM elements to their service descriptors in an effort to prototype this kind of thing. It won't mess up a standard client and we can see what is useful and actually works before the next version. Right now it is just a matter of scope and solving the more pressing use cases first. -- PatrickDowler - 2014-11-28

* I find the table in the opening of section 2 somewhat confusing.
Wouldn't it be more understandable to have an enumeration as in:

The following resources must be provided by a datalink service
(and can be discovered at the capability endpoint):

* <base-url>/<name> -- one or more endpoints with operator-defined
names returning datalink documents. <name> must not contain
slashes
* <base-url>/availability -- VOSI availability endpoint
* <base-url>/capabilities -- VOSI capabilities endpoint

Defining that <name> must not have slashes would also provide a means
to find out the VOSI URLs from a links URL, which otherwise seems to
be impossible using the mechanisms defined in the current document.

Response:

We do not want to restrict the resource names (paths) that implementers might use for their links resource. The examples show that there can be different protocols, different paths, etc and these may be constrained by the application servers, policies, etc... so we want to avoid placing any limits that are not necessary (like resource name is a single path element).

As for finding the VOSI resources from a known links URL, this is a general problem affecting other services as well (with the exception of TAP, which mandates specific resource names). We did not try to address this, and my personal opinion is that this kind of navigation is probably never going to work out in the general case. However, one usually gets that links URL from a service descriptor, which can also contain the resourceIdentifier (if the service is registered) so one could find VOSI endpoints that way. I think that is the right way to think about finding "sibling" resources: go up to the whole resource record first. So really, one needs to start at the top (a registry record). A RegTAP query could enable this. -- PatrickDowler - 2014-11-28
Update on this issue: We have decided that we can impose a restriction that {links} resources must be siblings of the VOSI resources. This will allow a fairly naive client to parse any {links} URL and create a VOSI-capabilities (or -availability) URL in order to discover other endpoints. A quick prototype showed that a service with different authentication methods can be implemented to comply with this restriction. In the case of authentication with x509 certificate (IVOA SSO) the service provider probably has to make the same VOSI resources available by both http and https, but that tends to be the default when you implement https anyway so it shouldn't be an issue.
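The "fairly naive client" mentioned in the update could look roughly like the sketch below: replace the last path element of the {links} URL with the VOSI resource name and drop the query string. The function name and the example URL are made up for illustration.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def vosi_sibling(links_url, name="capabilities"):
    """Derive a sibling VOSI URL from a {links} URL.

    Assumes the {links} resource and the VOSI resources share the same
    parent path, as the sibling restriction requires.
    """
    parts = urlsplit(links_url)
    parent = posixpath.dirname(parts.path.rstrip("/"))
    path = posixpath.join(parent, name)
    # Query string and fragment belong to the links call, so drop them.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


url = "http://example.com/datalink/links?ID=ivo://example.org/data"
print(vosi_sibling(url))
# http://example.com/datalink/capabilities
print(vosi_sibling(url, "availability"))
# http://example.com/datalink/availability
```

The scheme is carried over unchanged, so an https {links} URL yields an https VOSI URL.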

* Should GET and POST both be implemented for datalink? If not, that information should be inside the VOSI capabilities.

Response:

As a DALI-sync compliant resource, the links resource accepts both GET and POST requests (mandatory). The last sentence of 2.1 is supposed to make this clear. For resources in service descriptors, if there is no standardID to say more, the client has to assume that only GET can be used. This is something we will likely try to address in future. -- PatrickDowler - 2014-11-28
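A sketch of the two request styles against a links resource, using only the Python standard library. The requests are constructed but not sent; the endpoint URL and identifiers are placeholders, and the ID parameter is repeated to pass several identifiers in one call, as mentioned elsewhere on this page.

```python
from urllib.parse import urlencode
from urllib.request import Request


def links_requests(links_url, ids):
    """Return equivalent (GET, POST) requests for the given identifiers."""
    # A list of pairs lets the ID parameter repeat once per identifier.
    query = urlencode([("ID", ident) for ident in ids])
    get_req = Request(links_url + "?" + query, method="GET")
    post_req = Request(
        links_url,
        data=query.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
    return get_req, post_req


get_req, post_req = links_requests(
    "http://example.com/datalink/links",
    ["ivo://example.org/data?a", "ivo://example.org/data?b"],
)
print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url, post_req.data)
```

POST is the more comfortable choice when many identifiers are sent at once, since the parameters travel in the request body rather than in the URL.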

* For the links resource (2.1), what about a RESTful approach:
http://example.com/datalink/links -- anonymous access
http://example.com/datalink/links/auth -- basic authentication
http://example.com/datalink/links/auth/x509 -- that redirects to https x509 authentication
http://example.com/datalink/links/auth/shiboleth -- that redirects to https Shibboleth identity federation authentication

Response:

Addressed above (unrestricted resource names). -- PatrickDowler - 2014-11-28

* ID 2.1.1
refer to [1] => http://www.ivoa.net/DALI/ gives a 404
should be http://www.ivoa.net/documents/DALI

if the form should be
http://foo.bar/datalink?ID=ivo://example.org/data?
put an example in the text

Response:

Reference will be fixed. Such an example of GET implies that this is the way to do it, when POST is also supported (see above). I can add it if it really helps but the service interface seems too simple to need it. -- PatrickDowler - 2014-11-28

* The content of 2.2 is already covered in the opening paragraph of
section 2. Is it actually necessary to repeat it here?

Response:

Yes, that section doesn't have any substance. It is there so that the table of contents lists all the resources relevant to the spec. -- PatrickDowler - 2014-11-28

* 2.3 capability
In the example in section 2.3. vod: is being used as the prefix for
http://www.ivoa.net/xml/VODataService/v1.1. The spec shouldn't do that,
as the recommended canonical prefix for VODataService 1 is vs:. So, it
should be xmlns:vs= "http://www.ivoa.net/xml/VODataService/v1.1" and
further down xsi:type="vs:ParamHTTP"

Response:

Will fix namespace prefix in examples. -- PatrickDowler - 2014-11-28

* 3.2 List of Links: better use "Link response"
content_length why don't you use access_estsize from obscore ?
content_type why don't you use access_format ?

Response:

We had this discussion awhile back (maybe Heidelberg Interop) and kept this terminology because the meanings are different (these are actual not estimated/approximate size and actual MIME type where access_format is not necessarily restricted to MIME types). -- PatrickDowler - 2014-11-28

* general question :
Do the authors foresee a necessity to discover in the Registry whether
an Obscore or DAL service has a datalink service? Should we maybe
recommend (or require?) that such services should have the datalink
capability as part of their registry record?

Response:

It is unclear if links resources will be registered capabilities. They can be registered but they don't have to be to be useful (using the service descriptor). Some of the authors (PatrickDowler for example) think that properly formulated publisher dataset identifiers could be resolved to find the appropriate DataLink service... but that requires some more work. -- PatrickDowler - 2014-11-28

Having said all that, we trust the authors will duly consider our
concerns. Based on this trust, we approve the document.

Markus & Pierre

Thanks. -- PatrickDowler - 2014-11-28

Semantics Working Group ( Norman Gray, Mireille Louys )

The document looks good -- the right length, and clear.

I've just looked at the DataLink PR document with a particular REST-shaped question in mind. I found that I couldn't get a satisfactory answer from the document.

I wanted to ask: would it be possible for a client to ask for the links response in a format other than VOTable, via an Accept header, and would it be permissible for a service to provide it in a different format? In my mind, obviously, is allowing a service to provide a Linked Data style response, meaning that the response is in one or other RDF syntax. (DataLink is a poster-child Linked Data application -- note, I'm not suggesting that it's a priority to do this, but I would hope that the spec would make it permissible for an intern to implement it one afternoon in future)

1. A REST-style GET of this URL would imply that the client could make its GET request with an Accept header. If that's 'application/x-votable+xml', that's fine, but it should be at least permissible to give a different Accept header (such as text/turtle, for example). If the service can't supply that, it's supposed to reply '406 Not Acceptable'. I can see that it would be permissible to request a links resource with ?RESPONSEFORMAT=text/turtle, and permissible for a service to reply with such content (so the answer to my original question is 'partly yes'). Is it permitted, however, for a service to respect the Accept header? (this would probably be a more normal pattern in a Linked Data context). My reading of Sect. 3.3 "Unless the incoming request included a RESPONSEFORMAT parameter requesting a different format, the content-type header of the response MUST be application/x-votable+xml" is that the answer is no. The discussion above includes suggestions that implementers should be aware of, and respect, well-known HTTP semantics, which I take to mean allowing the full range of HTTP interactions, including Accept-based retrievals (pace this discussion, it might be worth a remark in the document to remind server- and client-side implementors that there's this slightly different style possible). The cross-reference to DALI (specifically its Sect 4.2) implies I think that the 406 error would be permitted (as being a normal HTTP error code).

Response:

The spirit of the specification is that an implementer can extend the behaviour so their service does more, and honouring normal HTTP usage falls into this category. The Accept header in particular provides an excellent example of why having two ways to do something (HTTP Accept header vs RESPONSEFORMAT parameter) is almost always bad, because to support both in the spec we need further language on what to do if there is a conflict. The spirit of the spec is aimed at working for a normal client following it, so the quoted rule in 3.3 applies if a normal client makes a request (eg. no RESPONSEFORMAT and no other way of asking for something else -- like Accept). Service providers should feel free to do anything that doesn't break a normal spec-following client. -- PatrickDowler - 2014-12-02
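One possible server-side reading of this, as a sketch only: the precedence of RESPONSEFORMAT over Accept, the use of 406 for an unsupported RESPONSEFORMAT, and the decision to ignore q-values are all assumptions of this example and are not stated in the specification. The function name and the set of supported formats are hypothetical.

```python
VOTABLE = "application/x-votable+xml"


def choose_format(supported, responseformat=None, accept=None):
    """Return (HTTP status, content type) for a links request.

    Assumption: an explicit RESPONSEFORMAT parameter wins over the Accept
    header. With neither present, the default VOTable rule applies, so a
    normal spec-following client is never affected.
    """
    if responseformat:
        wanted = [responseformat]
    elif accept:
        # Keep only the media types; parameters such as ";q=0.8" are ignored.
        wanted = [item.split(";")[0].strip() for item in accept.split(",")]
    else:
        return 200, VOTABLE
    for media_type in wanted:
        if media_type == "*/*":
            return 200, VOTABLE  # typical generic HTTP client
        if media_type in supported:
            return 200, media_type
    return 406, None  # Not Acceptable


supported = {VOTABLE, "text/turtle"}
print(choose_format(supported))                        # (200, 'application/x-votable+xml')
print(choose_format(supported, accept="text/turtle"))  # (200, 'text/turtle')
print(choose_format(supported, accept="foo/bar"))      # (406, None)
```

Picking a fixed precedence is exactly the kind of extra language the response says the spec would need if it supported both mechanisms.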

2. Specifically, if I supply Accept:foo/bar in my GET request to http://example.org/foo/links, should I get a 406 response, rather than a 200 VOTable? (I think the answer is 'yes'; and as noted above this would be permissible).

3. I will ritually remark that the x-* media subtype is deprecated, and that the process for registering new subtypes (such as application/votable) is intended to be streamlined compared to what it was before.

Response:

The VOTable media type is defined in the VOTable specification; we are just using it here. -- PatrickDowler - 2014-12-02

4. Clarity: It might be worth a cross-reference from Sect. 3.1 to the discussion in the second paragraph of Sect. 3.3. They overlap in what they're saying, but the latter contains a much stronger warning against a dumb string comparison than the former.

Response:

Thanks. Will clarify or cross-reference. -- PatrickDowler - 2014-12-02

[ these are, in referee terms, 'minor corrections', so although I'm not saying that this is a blocker, it might be worth some edits if there's an opportunity. ]