Difference: DataLink11RFC (1 vs. 35)

Revision 35 2023-11-16 - MarkTaylor

 

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at:

Detailed discussion towards 1.1 can also be found on this IVOA TWiki page (last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

For example, a couple of links responses for various ObsCore/SIA/SSA tables on the GAVO server:

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

New CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see link_authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.
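The idea of re-selecting "the same sort of link" for a new dataset can be sketched as follows. This is only an illustration, not TOPCAT's actual algorithm: the function name, the weights and the example column values are all assumptions; only the column names come from DataLink.

```python
# Hypothetical weights: the more specific a column is, the more it counts.
WEIGHTS = {
    "local_semantics": 8,
    "content_qualifier": 4,
    "semantics": 2,
    "content_type": 1,
}


def pick_similar_link(rows, previous):
    """Return the row most similar to the previously selected link, or None."""

    def score(row):
        # Sum the weights of all non-empty columns that match the earlier choice.
        return sum(
            weight
            for column, weight in WEIGHTS.items()
            if previous.get(column) and row.get(column) == previous.get(column)
        )

    best = max(rows, key=score, default=None)
    return best if best is not None and score(best) > 0 else None


# Made-up links rows for one dataset: semantics and content_type agree,
# only local_semantics tells the two cubes apart.
rows = [
    {"access_url": "https://example.org/ds2/continuum.fits", "semantics": "#this",
     "content_type": "application/fits", "content_qualifier": "#cube",
     "local_semantics": "#continuum"},
    {"access_url": "https://example.org/ds2/line.fits", "semantics": "#this",
     "content_type": "application/fits", "content_qualifier": "#cube",
     "local_semantics": "#line"},
]
# The link the user picked for an earlier dataset of the same service.
previous = {"semantics": "#this", "content_type": "application/fits",
            "content_qualifier": "#cube", "local_semantics": "#line"}

chosen = pick_similar_link(rows, previous)
```

With these made-up rows, the line cube wins because it is the only row whose local_semantics matches the earlier selection.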

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links when they are not authorized and the request would be rejected later.
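A client-side decision of this kind might look like the sketch below. It is not the AdvancedSearch code: the function name and the hint strings are hypothetical, and it assumes link_auth carries one of the strings "false", "optional" or "true" while link_authorized is a boolean that may be absent.

```python
def download_option(link_auth, link_authorized):
    """Return (show_download, hint) for one row of a links response."""
    if link_authorized is None:
        # Optional column not populated: offer the link and let the server decide.
        return True, "authorization unknown"
    if link_authorized:
        return True, ""
    if link_auth in ("optional", "true"):
        # The caller is not authorized now, but authenticating might help.
        return False, "log in to access"
    return False, "not accessible"


# Public data where authentication is possible but not needed (IRIS case above).
public = download_option("optional", True)
# Proprietary data requested anonymously (CFHT case above).
proprietary = download_option("optional", False)
```

The two calls mirror the CADC examples in the server section: the first row would be shown with a download option, the second would be hidden with a prompt to authenticate.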

Implementations Validators

The following validators are available for DataLink

  • datalinklint, which is part of STILTS. STILTS v3.4-9 contains DataLink 1.1 validation features, but later versions (at the time of writing, a post-3.4-9 pre-release) are recommended, as they are slightly updated for PR-DataLink-1.1-20231108.
 



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When a DataLink links response is hooked to table rows outside the context of ObsTAP/SIA2, how do we generate/recognize the DataLink URL?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it's perfectly correct per the VOTable standard. The FIELD itself should not be described by a datalink UCD because it's probably generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD, and where the URL is ad hoc and should be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is, for example, adapted to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID from its list of IDs to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]
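The failure mode described in (3) can be reproduced with a small sketch. This is not pyvo code: resolve_all and the two fake services are hypothetical, and the batch limit of two rows stands in for an overflow.

```python
def resolve_all(ids, query, max_rounds=10):
    """Query IDs in batches until every ID has at least one row."""
    pending = list(ids)
    results = {}
    rounds = 0
    while pending:
        rounds += 1
        if rounds > max_rounds:
            raise RuntimeError(f"no rows ever returned for: {pending}")
        for id_, row in query(pending):
            results.setdefault(id_, []).append(row)
        # An ID only leaves the queue once at least one row came back for it.
        pending = [i for i in pending if i not in results]
    return results


def _row(id_):
    return {"access_url": "https://example.org/" + id_}


def service_with_error_rows(batch):
    # Answers at most two IDs per call; a failing ID still gets an error row.
    return [(i, {"error_message": "NotFoundFault"} if i == "bad" else _row(i))
            for i in batch[:2]]


def service_dropping_failures(batch):
    # Same overflow, but failing IDs are silently left out of the response.
    return [(i, _row(i)) for i in batch[:2] if i != "bad"]


complete = resolve_all(["a", "bad", "b"], service_with_error_rows)
```

With the first service the loop ends after two rounds. With the second, "bad" never leaves the queue, so without the max_rounds guard (an addition for this sketch) the client would keep asking indefinitely.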

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"dereferenceable" was used in the sense that it can be fully accessed by HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. -- FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14
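To make the distinction between query strings and fragments concrete, here is a small sketch using Python's standard urllib.parse. The URL and the helper name are made up for illustration: the query string is part of what is requested from the server, while the fragment is left to the client to interpret according to the media type.

```python
from urllib.parse import urldefrag, urlsplit


def split_access_url(access_url):
    """Return (request_url, fragment) for an access_url value."""
    request_url, fragment = urldefrag(access_url)
    return request_url, fragment


# Hypothetical access_url with both a query string and a fragment.
access_url = "https://example.org/data/obs42.fits?cutout=1#3"

request_url, fragment = split_access_url(access_url)
query = urlsplit(access_url).query
```

Here request_url keeps the query string (cutout=1), and fragment holds "3", which a client could, for instance, read as a FITS extension if the content type is FITS.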

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The autodescription motivation may be explained earlier in the section. For "adhoc:this" I remember Pat advocating for this. If we motivate earlier then we can restrict to a pure example here. FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.

TCG Chair & Vice Chair

With positive review by the TCG with a comments & feedback period successfully completed, the TCG chair/Vice Chair approve as well.

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
DataLink is used, and its usage will increase, for external web services such as simulated data and output formats that are not IVOA standards (HAPI time series, OGC formats, ...).

  • Maybe update the DataLink page with examples of implementations.
  • Refer to the DataLink page in the document.
  • Encourage working/interest groups to contribute examples as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore 's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHs ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point much. IMHO both forms would be acceptable. Simple terms are enough to associate the results, but a local vocabulary URI + tag allows the term to be linked to definitions and relationships, so it's richer. Examples will be given in the text. -- FrancoisBonnarel - 2023-08-18
  2. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Followup on revised document:

I see the items above have been addressed satisfactorily, I see no additional issues with the revised document.

-- MarkCresitelloDittmar - 2023-11-11

Grid & Web Services Working Group

Possible backward compatibility drawbacks in VOSpace (VOSpace implementation can use a DataLink to reference data location):

  • The new VOTable columns content_qualifier, local_semantics, link_auth and link_authorized (pp. 15, 16) could break backward compatibility.
  • On p. 17 it is stated: "From version 1.1 of this standard the {links} response must include this INFO ..."
  • On p. 24, in "Example service descriptor for VOSpace 2.0", the attributes "datatype" and "arraysize" are added to <PARAM>.

    -- SaraBertocco - 2023-11-10

Registry Working Group

No particular remark pertaining to Registry standards.

Semantics Working Group

No issue for Semantics at this point.

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done for consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (type="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this; see the next PR. -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15
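As an illustration of the two points above about the standardID INFO, the following sketch shows how a validator or client might look the INFO up in the results RESOURCE and read a minor version from it. The sample VOTable, the "#links-1.1" value (the form proposed above, not necessarily the final one) and the function names are assumptions; the namespace wildcard in findall needs Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

# Minimal made-up links response; only the parts needed for the lookup.
SAMPLE = """<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" version="1.4">
  <RESOURCE type="results">
    <INFO name="standardID" value="ivo://ivoa.net/std/DataLink#links-1.1"/>
    <TABLE/>
  </RESOURCE>
</VOTABLE>"""


def links_standard_id(votable_xml):
    """Return the standardID INFO value of the results RESOURCE, or None."""
    root = ET.fromstring(votable_xml)
    for resource in root.findall("{*}RESOURCE"):
        if resource.get("type") != "results":
            continue
        # Only direct INFO children of the results RESOURCE are considered.
        for info in resource.findall("{*}INFO"):
            if info.get("name") == "standardID":
                return info.get("value")
    return None


def declared_version(standard_id):
    """Return the version suffix after the last '-' of the fragment, e.g. '1.1'."""
    fragment = standard_id.partition("#")[2]
    return fragment.rpartition("-")[2]


standard_id = links_standard_id(SAMPLE)
version = declared_version(standard_id)
```

If the INFO were allowed elsewhere in the VOTable, the lookup would have to search more widely, which is why the location rule matters to validators.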

Radio Astronomy Interest Group

Solar System Interest Group

Nothing to add.

-- Anne Raugh - 2023-11-10

Theory Interest Group

Time Domain Interest Group

Just one suggestion, perhaps it would be a good idea to explain the notation with braces (e.g. {link}) in the "Conformance-related definitions" section (page 3).

PierreFernique - 2024-11-08

I have added this text to that section (in PR, will merge before REC):

"This document uses curly braces (e.g. \{name\}) to refer to a named concept
such as a web service endpoint where the text requires a logical name but
the actual name in a service implementing the standard is not restricted."

-- PatrickDowler 2023-11-11

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

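| Group | Yes | No | Abstain | Comments |
| TCG | * | | | |
| Apps | * | | | |
| DAL | * | | | |
| DM | * | | | |
| GWS | * | | | |
| Registry | * | | | |
| Semantics | * | | | |
| DCP | * | | | |
| Edu | | | | |
| KDIG | * | | | |
| Ops | * | | | |
| Radio | | | | |
| SSIG | * | | | |
| Theory | | | | |
| TD | * | | | |
| StdProc | | | | |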




META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 34 2023-11-12 - SaraBertocco

 

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at:

Detailed discussion towards 1.1 can also be found on this IVOA TWiki page (last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

For example, a couple of links responses for various ObsCore/SIA/SSA tables on the GAVO server:

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

New CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see link_authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links when they are not authorized and the request would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When a DataLink links response is hooked to table rows outside the context of ObsTAP/SIA2, how do we generate/recognize the DataLink URL?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it's perfectly correct per the VOTable standard. The FIELD itself should not be described by a datalink UCD because it's probably generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD, and where the URL is ad hoc and should be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is, for example, adapted to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID from its list of IDs to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"dereferenceable" was used in the sense that it can be fully accessed by HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. -- FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The autodescription motivation may be explained earlier in the section. For "adhoc:this" I remember Pat advocating for this. If we motivate earlier then we can restrict to a pure example here. FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

With positive review by the TCG with a comments & feedback period successfully completed, the TCG chair/Vice Chair approve as well.

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
DataLink is in use, and its usage will increase for external web services such as simulated data, and for output formats that are not IVOA formats (HAPI time series, OGC formats ...).

Maybe change the DataLink page to include examples of implementations,
refer to the DataLink page in the document,
and encourage working/interest groups to add examples as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore 's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHs ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point much. IMHO both forms would be acceptable. Simple terms are enough to associate the results, but a local vocabulary URI + tag allows the term to be linked to definitions and relationships, so it is richer. Examples will be given in the text. FrancoisBonnarel - 2023-08-18
  1. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Followup on revised document:

I see the items above have been addressed satisfactorily, I see no additional issues with the revised document.

-- MarkCresitelloDittmar - 2023-11-11

Grid & Web Services Working Group

Possible backward compatibility drawbacks in VOSpace (VOSpace implementation can use a DataLink to reference data location):

  • new columns of VOTable content_qualifier, local_semantics, link_auth and link_authorized (pgg. 15, 16) could break backward compatibility.
  • pg. 17 it is stated "From version 1.1 of this standard the {links} response must include this INFO ...."
  • pg. 24 in "Example service descriptor for VOSpace 2.0", attributes "datatype" and "arraysize" are added to <PARAM>

    -- SaraBertocco - 2023-11-10

Registry Working Group

No particular remark pertaining to Registry standards.

Semantics Working Group

No issue for Semantics at this point.

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done in consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (name="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this. see next PR -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15
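
To illustrate the INFO placement question raised above, a validator-style check might look like the following minimal sketch. The sample VOTable, the helper names and the use of a links-1.1 value are assumptions made for illustration only; they are not taken from the standard text or from any existing validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed {links} response used only for this sketch.
SAMPLE = """<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" version="1.3">
  <RESOURCE type="results">
    <INFO name="standardID" value="ivo://ivoa.net/std/DataLink#links-1.1"/>
    <TABLE/>
  </RESOURCE>
</VOTABLE>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def find_standard_ids(votable_xml):
    """Return (value, inside_results_resource) for each standardID INFO."""
    root = ET.fromstring(votable_xml)
    found = []

    def walk(elem, in_results):
        for child in elem:
            name = local_name(child.tag)
            if name == "INFO" and child.get("name") == "standardID":
                found.append((child.get("value"), in_results))
            # Everything below a RESOURCE with type="results" counts as "inside".
            is_results = name == "RESOURCE" and child.get("type") == "results"
            walk(child, in_results or is_results)

    walk(root, False)
    return found


print(find_standard_ids(SAMPLE))
```

A check like this reports both the declared version and whether the INFO sits inside the results RESOURCE, which is the distinction the comment asks the document to make explicit.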

Radio Astronomy Interest Group

Solar System Interest Group

Nothing to add.

-- Anne Raugh - 2023-11-10

Theory Interest Group

Time Domain Interest Group

Just one suggestion, perhaps it would be a good idea to explain the notation with braces (e.g. {link}) in the "Conformance-related definitions" section (page 3).

PierreFernique - 2024-11-08

I have added this text to that section (in PR, will merge before REC):

"This document uses curly braces (e.g. \{name\}) to refer to a named concept
such as a web service endpoint where the text requires a logical name but
the actual name in a service implementing the standard is not restricted."

-- PatrickDowler 2023-11-11

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG *      
Apps *      
DAL *      
DM *      
Changed:
<
<
GWS        
>
>
GWS *      
 
Registry *      
Semantics *      
DCP *      
Edu        
KDIG *      
Ops *      
Radio        
SSIG *      
Theory        
TD *      
<nop>StdProc        



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 332023-11-12 - JanetEvans

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As examples, here are a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays additional features in service descriptors and makes use of additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool behavior is adapted to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links for which they are not authorized and whose requests would be rejected later.
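
A client-side decision of this kind might be sketched as follows. The row dictionaries, the function name download_action and the policy itself are illustrative assumptions, not the CADC implementation; only the column names link_auth and link_authorized and the value "optional" come from the examples above.

```python
def download_action(row):
    """Decide what a portal could offer for one links row (assumed policy)."""
    if row.get("link_authorized") is False:
        # The service says the current caller may not use the link.
        # If authentication is possible, offer a login instead of a download.
        if row.get("link_auth") in ("optional", "true"):
            return "login"
        return "hide"
    # Authorized, or the service gave no information: offer the download.
    return "download"


public_row = {"link_auth": "optional", "link_authorized": True}
proprietary_row = {"link_auth": "optional", "link_authorized": False}
print(download_action(public_row), download_action(proprietary_row))
```

With this policy the anonymous caller in the CFHT example would be offered a login rather than a download that is bound to fail.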

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

Deleted:
<
<
 
   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When datalink links response is hooked to table rows outside the context of ObsTAP /SIA2 how do we generate/recognize the DataLink URL ?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it is perfectly correct VOTable. The FIELD itself should not be described by a datalink UCD because it is generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD, and where the URL is ad hoc and should be the content of a FIELD. Using the same utypes as the ones used in ObsCore responses seems reasonable. This is, for example, adapted to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there \rfcshould\ be at least one row for that ID value and an error\_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID of its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]
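
The failure mode described in (3) can be sketched with a toy client loop. This is not pyvo's code; the function names, the row dictionaries and the two fake services are assumptions made purely to show why a missing row for a failing ID keeps that ID pending.

```python
def resolve_all(ids, query, max_rounds=10):
    """Query until every ID has at least one row, or give up after max_rounds."""
    pending = list(ids)
    rows = []
    for _ in range(max_rounds):
        if not pending:
            break
        result = query(pending)
        rows.extend(result)
        answered = {row["ID"] for row in result}
        # An ID leaves the list only when at least one row came back for it.
        pending = [i for i in pending if i not in answered]
    return rows, pending


def compliant_service(batch):
    """Fake service: handles two IDs per call (overflow), one row per ID."""
    out = []
    for i in batch[:2]:
        if i.startswith("bad"):
            out.append({"ID": i, "access_url": None,
                        "error_message": "NotFoundFault: " + i})
        else:
            out.append({"ID": i, "access_url": "https://example.org/data/" + i,
                        "error_message": None})
    return out


def silent_service(batch):
    """Fake service that drops failing IDs instead of returning an error row."""
    return [r for r in compliant_service(batch) if r["error_message"] is None]


print(resolve_all(["a", "bad1", "b"], compliant_service)[1])  # nothing left pending
print(resolve_all(["a", "bad1", "b"], silent_service)[1])     # "bad1" is never answered
```

Against the silent service the loop only terminates because of the artificial max_rounds cap, which is the behaviour the proposed MUST is meant to rule out.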

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"dereferanceable" was used in the sense that it can be fully accessed by http. Which is not the case in URN in general or URL with fragments. For the latter the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14
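
As a minimal sketch of the client handling that a fragment implies, the part of an access URL that is actually sent over HTTP can be separated from the fragment that the client must interpret itself. The example URLs and the function name are hypothetical; urldefrag is part of the Python standard library.

```python
from urllib.parse import urldefrag


def split_access_url(access_url):
    """Return (url_to_fetch, fragment); fragment is None when absent."""
    url, fragment = urldefrag(access_url)
    return url, fragment or None


# The query string stays part of the retrievable URL; only the fragment is split off.
print(split_access_url("https://example.org/cube.fits?band=K#SCI"))
print(split_access_url("https://example.org/preview.png"))
```

How the fragment is then interpreted (an extension in a FITS file, an id-ed element in an XML document) depends on the media type of what was retrieved.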

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content\_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14
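
The use case in (6) could look like this on the client side. This is only a sketch: the row dictionaries, the qualifier values "line-cube" and "continuum-cube" and the function name are made up for illustration and are not taken from any vocabulary or implementation.

```python
def pick_link(rows, preferred_qualifier):
    """Pick the '#this' row matching the user's remembered content_qualifier."""
    candidates = [r for r in rows if r["semantics"] == "#this"]
    for row in candidates:
        if row.get("content_qualifier") == preferred_qualifier:
            return row
    # Fall back to the first candidate when the preference is not available.
    return candidates[0] if candidates else None


# Both rows agree on semantics and content_type; only content_qualifier differs.
rows = [
    {"semantics": "#this", "content_type": "application/fits",
     "content_qualifier": "continuum-cube", "access_url": "https://example.org/c1"},
    {"semantics": "#this", "content_type": "application/fits",
     "content_qualifier": "line-cube", "access_url": "https://example.org/l1"},
]
print(pick_link(rows, "line-cube")["access_url"])
```

A client that remembers the user's last choice of qualifier can call a function like this for each new dataset and keep showing the same kind of product.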

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The autodescription motivation may be explained earlier in the section. For "adhoc:this" I remember Pat advocating for this. If we motivate earlier then we can restrict to a pure example here. FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Added:
>
>
With positive review by the TCG with a comments & feedback period successfully completed, the TCG chair/Vice Chair approve as well.
 

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
DataLink is in use, and its usage will increase for external web services such as simulated data, and for output formats that are not IVOA formats (HAPI time series, OGC formats ...).

Maybe change the DataLink page to include examples of implementations,
refer to the DataLink page in the document,
and encourage working/interest groups to add examples as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore 's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHs ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point much. IMHO both forms would be acceptable. Simple terms are enough to associate the results, but a local vocabulary URI + tag allows the term to be linked to definitions and relationships, so it is richer. Examples will be given in the text. FrancoisBonnarel - 2023-08-18
  1. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Followup on revised document:

I see the items above have been addressed satisfactorily, I see no additional issues with the revised document.

-- MarkCresitelloDittmar - 2023-11-11

Grid & Web Services Working Group

Possible backward compatibility drawbacks in VOSpace (VOSpace implementation can use a DataLink to reference data location):

  • new columns of VOTable content_qualifier, local_semantics, link_auth and link_authorized (pgg. 15, 16) could break backward compatibility.
  • pg. 17 it is stated "From version 1.1 of this standard the {links} response must include this INFO ...."
  • pg. 24 in "Example service descriptor for VOSpace 2.0", attributes "datatype" and "arraysize" are added to <PARAM>

    -- SaraBertocco - 2023-11-10

Registry Working Group

No particular remark pertaining to Registry standards.

Semantics Working Group

No issue for Semantics at this point.

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done in consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (name="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this. see next PR -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15

Radio Astronomy Interest Group

Solar System Interest Group

Nothing to add.

-- Anne Raugh - 2023-11-10

Theory Interest Group

Time Domain Interest Group

Just one suggestion, perhaps it would be a good idea to explain the notation with braces (e.g. {link}) in the "Conformance-related definitions" section (page 3).

PierreFernique - 2024-11-08

I have added this text to that section (in PR, will merge before REC):

"This document uses curly braces (e.g. \{name\}) to refer to a named concept
such as a web service endpoint where the text requires a logical name but
the actual name in a service implementing the standard is not restricted."

-- PatrickDowler 2023-11-11

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
Changed:
<
<
TCG        
>
>
TCG *      
 
Apps *      
DAL *      
DM *      
GWS        
Registry *      
Semantics *      
DCP *      
Edu        
KDIG *      
Ops *      
Radio        
SSIG *      
Theory        
TD *      
<nop>StdProc        



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 322023-11-11 - PatrickDowler

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As examples, here are a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool behavior is adapted to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it makes use of link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links when they are not authorized and the request would be rejected later.
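The use of link_auth and link_authorized described above for a portal could be sketched roughly as follows. This is only an illustration, not CADC code: the function name download_offer and the three outcomes ("show", "login", "hide") are hypothetical, and the sketch assumes that link_auth takes the values false, optional or true and that link_authorized may be absent.

```python
def download_offer(link_auth, link_authorized):
    """Decide how a client might present one row of a links response.

    link_auth:       "false", "optional" or "true" (assumed values), or None.
    link_authorized: True, False, or None when the service gave no value.
    """
    # Authorized, or no information at all: offer the link and let the
    # request itself succeed or fail.
    if link_authorized is not False:
        return "show"
    # Not authorized right now, but authenticating might help.
    if link_auth in ("optional", "true"):
        return "login"
    # Not authorized and no way to authenticate: do not offer the link.
    return "hide"


# Public IRIS-like case: authentication possible, data public.
print(download_offer("optional", True))
# Proprietary CFHT-like case seen by an anonymous caller.
print(download_offer("optional", False))
```

Whether to hide a link or offer a login prompt is a client design choice; the standard only supplies the two column values.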

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:
 
   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When datalink links response is hooked to table rows outside the context of ObsTAP /SIA2 how do we generate/recognize the DataLink URL ?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it is correct according to the VOTable standard. The FIELD itself should not be described by a datalink ucd because it's probably generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD, and where the URL is ad hoc and should be the content of a FIELD. Using the same utypes as the ones used in ObsCore responses seems reasonable. This is, for example, suited to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID from its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]
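The client behaviour behind comment (3) can be illustrated with a small sketch. This is not pyvo's actual code: resolve_all, strict_service and lax_service are hypothetical names, rows are modelled as plain dicts, and the max_rounds guard is added only so that the example terminates.

```python
def resolve_all(ids, query_batch, max_rounds=5):
    """Query until every ID has at least one row in the collected results."""
    pending = list(ids)
    rows = []
    for _ in range(max_rounds):
        if not pending:
            return rows
        batch_rows = query_batch(pending)
        rows.extend(batch_rows)
        # An ID is only dropped from the to-do list once a row mentions it.
        answered = {row["ID"] for row in batch_rows}
        pending = [i for i in pending if i not in answered]
    if pending:
        raise RuntimeError("no rows ever returned for: %s" % ", ".join(pending))
    return rows


def strict_service(ids):
    # Returns one row per ID, using error_message for IDs it cannot resolve.
    return [{"ID": i, "error_message": "NotFoundFault" if i.startswith("bad") else None}
            for i in ids]


def lax_service(ids):
    # Silently omits IDs it cannot resolve.
    return [{"ID": i, "error_message": None} for i in ids if not i.startswith("bad")]


print(len(resolve_all(["a", "bad1"], strict_service)))
```

With strict_service the loop finishes after one round; with lax_service the failing ID stays pending in every round, which is the situation the proposed MUST is meant to rule out.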

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"dereferenceable" was used in the sense that it can be fully accessed by HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. -- FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14
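The point about fragments in (4) can be shown with the Python standard library. This is a sketch under assumptions: the example URL is invented, and split_access_url is a hypothetical helper name. The fragment is separated on the client side; how it is interpreted afterwards depends on the media type of the retrieved resource.

```python
from urllib.parse import urldefrag


def split_access_url(access_url):
    """Split an access_url into the part to request and the fragment.

    Returns (url_without_fragment, fragment or None). The query string,
    if any, stays part of the URL to request.
    """
    url, fragment = urldefrag(access_url)
    return url, (fragment or None)


# Hypothetical access_url with both a query string and a fragment.
print(split_access_url("https://example.org/data/cube.fits?cutout=1#2"))
```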

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The autodescription motivation may be explained earlier in the section. For "adhoc:this" I remember Pat advocating for this. If we motivate earlier then we can restrict to a pure example here. FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.

TCG Chair & Vice Chair

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
Datalink is used, and its usage will increase, for external web services such as simulated data and output formats that are not in IVOA (Hapi Timeseries, OGC format ...).

Suggestions: update the datalink page with examples of implementations, refer to the datalink page in the document, and encourage working/interest groups to add examples as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore 's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHs ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point much. IMHO both forms would be acceptable. Simple terms are enough to associate the results, but a local vocabulary URI + tag allows linking the term to definitions and relationships, so it's richer. Examples will be given in the text. -- FrancoisBonnarel - 2023-08-18
  1. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Followup on revised document:

I see the items above have been addressed satisfactorily, I see no additional issues with the revised document.

-- MarkCresitelloDittmar - 2023-11-11

Grid & Web Services Working Group

Possible backward compatibility drawbacks in VOSpace (VOSpace implementation can use a DataLink to reference data location):

  • new VOTable columns content_qualifier, local_semantics, link_auth and link_authorized (pp. 15, 16) could break backward compatibility.
  • p. 17 states "From version 1.1 of this standard the {links} response must include this INFO ...."
  • p. 24, in "Example service descriptor for VOSpace 2.0", attributes "datatype" and "arraysize" are added to <PARAM>

    -- SaraBertocco - 2023-11-10

Registry Working Group

No particular remark pertaining to Registry standards.

Semantics Working Group

No issue for Semantics at this point.

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done in consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (name="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this. see next PR -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.
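The first two points above (a minor version in the standardID, and the INFO element sitting in the results RESOURCE) suggest how a validator might pick the version to check against. The following is a sketch under assumptions, not code from any existing validator: the sample VOTable is invented, it uses the "#links-1.1" key that is only proposed in this comment, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented sample; the links-1.1 key is the proposal discussed above.
SAMPLE = """<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" version="1.3">
  <RESOURCE type="results">
    <INFO name="standardID" value="ivo://ivoa.net/std/DataLink#links-1.1"/>
  </RESOURCE>
</VOTABLE>"""

PREFIX = "ivo://ivoa.net/std/DataLink#links-"


def local_name(tag):
    """Strip any XML namespace from an element tag."""
    return tag.rsplit("}", 1)[-1]


def links_standard_id(votable_xml):
    """Return the standardID INFO value found in a results RESOURCE, or None."""
    root = ET.fromstring(votable_xml)
    for res in root.iter():
        if local_name(res.tag) == "RESOURCE" and res.get("type") == "results":
            for child in res:
                if local_name(child.tag) == "INFO" and child.get("name") == "standardID":
                    return child.get("value")
    return None


def declared_version(standard_id):
    """Return the version suffix such as '1.1', or None if not a links ID."""
    if standard_id is None or not standard_id.startswith(PREFIX):
        return None
    return standard_id[len(PREFIX):]


print(declared_version(links_standard_id(SAMPLE)))
```

This sketch only looks at direct children of the results RESOURCE; if the standard ends up allowing the INFO elsewhere in the VOTable, the search would need to be widened.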

-- MarkTaylor - 2023-05-15

Radio Astronomy Interest Group

Solar System Interest Group

Nothing to add.

-- Anne Raugh - 2023-11-10

Theory Interest Group

Time Domain Interest Group

Just one suggestion, perhaps it would be a good idea to explain the notation with braces (e.g. {link}) in the "Conformance-related definitions" section (page 3).

PierreFernique - 2024-11-08

I have added this text to that section (in PR, will merge before REC):

"This document uses curly braces (e.g. {name}) to refer to a named concept
such as a web service endpoint where the text requires a logical name but
the actual names in a service implementing the standard are not restricted."

-- PatrickDowler 2023-11-11

 

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps *      
DAL *      
DM *      
GWS        
Registry *      
Semantics *      
DCP *      
Edu        
KDIG *      
Ops *      
Radio        
SSIG *      
Theory        
TD *      
<nop>StdProc        



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 31 2023-11-11 - MarkCresitelloDittmar

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, plus additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, there are a couple of links responses for various obscore/sia/ssa tables on the GAVO server.

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

New CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see link_authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool behavior is adapted to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it makes use of link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links when they are not authorized and the request would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:
 
   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When datalink links response is hooked to table rows outside the context of ObsTAP /SIA2 how do we generate/recognize the DataLink URL ?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it is correct according to the VOTable standard. The FIELD itself should not be described by a datalink ucd because it's probably generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD, and where the URL is ad hoc and should be the content of a FIELD. Using the same utypes as the ones used in ObsCore responses seems reasonable. This is, for example, suited to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID from its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"dereferenceable" was used in the sense that it can be fully accessed by HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. -- FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The autodescription motivation may be explained earlier in the section. For "adhoc:this" I remember Pat advocating for this. If we motivate earlier then we can restrict to a pure example here. FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.

TCG Chair & Vice Chair

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
Datalink is used, and its usage will increase, for external web services such as simulated data and output formats that are not in IVOA (Hapi Timeseries, OGC format ...).

Suggestions: update the datalink page with examples of implementations, refer to the datalink page in the document, and encourage working/interest groups to add examples as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point much. IMHO both forms would be acceptable. Simple terms are enough to associate the results, but a local vocabulary URI + tag allows linking the term to definitions and relationships, so it is richer. Examples will be given in the text. FrancoisBonnarel - 2023-08-18
  1. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Added:
>
>
Followup on revised document:

I see the items above have been addressed satisfactorily, I see no additional issues with the revised document.

-- MarkCresitelloDittmar - 2023-11-11

 

Grid & Web Services Working Group

Possible backward compatibility drawbacks in VOSpace (VOSpace implementation can use a DataLink to reference data location):

  • new VOTable columns content_qualifier, local_semantics, link_auth and link_authorized (pp. 15, 16) could break backward compatibility.
  • pg. 17: it is stated "From version 1.1 of this standard the {links} response must include this INFO ...."
  • pg. 24: in "Example service descriptor for VOSpace 2.0", attributes "datatype" and "arraysize" are added to <PARAM>

    -- SaraBertocco - 2023-11-10

Registry Working Group

No particular remark pertaining to Registry standards.

Semantics Working Group

No issue for Semantics at this point.

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done for consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (name="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this. See next PR. -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15
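
The question about where the standardID INFO may sit can be made concrete with a small check. The following is only a sketch, not part of the standard or of any validator: it assumes the INFO carries name="standardID" and that the only accepted location is a direct child of a RESOURCE with type="results", which is one possible reading of the discussion above. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix of the form '{uri}NAME'."""
    return tag.rsplit("}", 1)[-1]


def results_standard_ids(votable_xml):
    """Return values of INFO name="standardID" found as direct children
    of any RESOURCE with type="results" (assumed location)."""
    root = ET.fromstring(votable_xml)
    found = []
    for element in root.iter():
        if local_name(element.tag) != "RESOURCE":
            continue
        if element.get("type") != "results":
            continue
        for child in element:
            if local_name(child.tag) == "INFO" and child.get("name") == "standardID":
                found.append(child.get("value"))
    return found
```

A validator built this way would report an INFO placed elsewhere in the VOTable as missing, which is exactly the behaviour that needs a clear statement in the text.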

Radio Astronomy Interest Group

Solar System Interest Group

Nothing to add.

-- Anne Raugh - 2023-11-10

Theory Interest Group

Time Domain Interest Group

Just one suggestion, perhaps it would be a good idea to explain the notation with braces (e.g. {link}) in the "Conformance-related definitions" section (page 3).

PierreFernique - 2024-11-08

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps *      
DAL *      
Changed:
<
<
DM        
>
>
DM *      
 
GWS        
Registry *      
Semantics *      
DCP *      
Edu        
KDIG *      
Ops *      
Radio        
SSIG *      
Theory        
TD *      
<nop>StdProc        



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 302023-11-10 - AnneRaugh

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

By way of example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays additional features in service descriptors and makes use of additional links table FIELDs such as content_qualifier, authorization and local_semantics. The tool behavior is adapted to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it makes use of link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links when they are not authorized and the request would be rejected later.
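
As an illustration of how a client might act on link_auth and link_authorized, here is a minimal sketch. It is not the CADC portal code: the function name, the returned labels, and the handling of a missing link_authorized value are assumptions, and only the link_auth value "optional" appears in the examples above ("true" is assumed to be another possible value).

```python
def download_action(row):
    """Decide what to offer for one links-table row (a dict).

    Returns "download", "login" or "hide" (hypothetical labels).
    """
    authorized = row.get("link_authorized")
    # No information: offer the link and let the server decide.
    if authorized is None or authorized:
        return "download"
    # Not authorized now, but authenticating might change that.
    if row.get("link_auth") in ("optional", "true"):
        return "login"
    return "hide"


# The two CADC cases described above, as seen by an anonymous caller.
iris_row = {"link_auth": "optional", "link_authorized": True}
cfht_row = {"link_auth": "optional", "link_authorized": False}
print(download_action(iris_row), download_action(cfht_row))
```

Treating a missing link_authorized as "offer the link" keeps the sketch usable against services that do not populate the optional columns.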

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.
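
To make "disregard the minor version" concrete, a client-side comparison could look like the following sketch. It assumes that the key has the form links-MAJOR or links-MAJOR.MINOR and that IVOIDs are compared case-insensitively; the helper names are hypothetical.

```python
import re

# Accepts "links-1", "links-1.0", "links-1.1", ... (assumed key forms).
_LINKS_ID = re.compile(
    r"^ivo://ivoa\.net/std/datalink#links-(\d+)(?:\.(\d+))?$",
    re.IGNORECASE,
)


def links_version(standard_id):
    """Return (major, minor) for a DataLink links standardID, else None.

    minor is None when the ID carries only a major version.
    """
    match = _LINKS_ID.match(standard_id.strip())
    if match is None:
        return None
    major = int(match.group(1))
    minor = int(match.group(2)) if match.group(2) is not None else None
    return major, minor


def is_datalink_v1(standard_id):
    """True for any 1.* links endpoint; the minor version is ignored."""
    version = links_version(standard_id)
    return version is not None and version[0] == 1
```

A validator could still read the minor version from links_version, while normal client operation would only call is_datalink_v1.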

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

Deleted:
<
<
 
   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When a DataLink links response is hooked to table rows outside the context of ObsTAP/SIA2, how do we generate/recognize the DataLink URL?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it is valid according to the VOTable standard. The FIELD itself should not be described by a datalink UCD because it is probably, in general, an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD, but is ad hoc and should itself be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is, for example, suited to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there \rfcshould\ be at least one row for that ID value and an error\_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID of its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]
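
The client behaviour described in (3) can be sketched as follows. This is not pyvo's actual code; it is a simplified loop under the assumption that each response row is a dict with an "ID" key and that `query` is a caller-supplied function taking a list of IDs. The guards against endless repetition are additions for the sketch.

```python
def unanswered_ids(requested, rows):
    """Return the requested IDs that have no row in the response."""
    answered = {row["ID"] for row in rows}
    return [ident for ident in requested if ident not in answered]


def resolve_all(ids, query, max_rounds=10):
    """Query until every ID has at least one row (link or error row).

    Returns (all_rows, ids_never_answered).
    """
    pending = list(ids)
    collected = []
    rounds = 0
    while pending and rounds < max_rounds:
        rows = query(pending)
        collected.extend(rows)
        still_pending = unanswered_ids(pending, rows)
        if len(still_pending) == len(pending):
            # No progress: the service returned no row for any pending ID.
            break
        pending = still_pending
        rounds += 1
    return collected, pending
```

If a service omits the row for a failing ID, that ID stays in the pending list on every round, which is why at least one row per ID (possibly carrying an error_message) lets such a loop terminate cleanly.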

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest replacing the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"dereferenceable" was used in the sense that it can be fully accessed by HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14
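
A short sketch of what "the client is supposed to interpret the fragment" can mean in practice: the part before the fragment is what gets fetched over HTTP, and the fragment is interpreted locally according to the media type. The mapping from content types to interpretations below is illustrative only, and the function names and example URL are hypothetical.

```python
from urllib.parse import urldefrag


def split_access_url(access_url):
    """Return (url_to_fetch, fragment_or_None); query strings are kept."""
    url, fragment = urldefrag(access_url)
    return url, (fragment or None)


def describe_target(access_url, content_type):
    """Say how a client might read the fragment for a given media type."""
    url, fragment = split_access_url(access_url)
    if fragment is None:
        return url, "whole resource"
    if content_type == "application/fits":
        hint = "FITS extension " + repr(fragment)
    elif content_type.endswith("xml"):
        hint = "XML element with id " + repr(fragment)
    else:
        hint = "fragment " + repr(fragment) + " (meaning depends on media type)"
    return url, hint


print(describe_target("http://example.org/get?id=1#2", "application/fits"))
```

Note that the query string stays part of the URL that is fetched; only the fragment is removed before the request.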

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content\_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The autodescription motivation may be explained earlier in the section. For "adhoc:this" I remember Pat advocating for this. If we motivate earlier then we can restrict to a pure example here. FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
DataLink is used, and its usage will increase, for external web services such as simulated data and for output formats that are not defined in the IVOA (HAPI time series, OGC formats ...)

Maybe update the DataLink page with examples of implementation,
refer to the DataLink page in the document,
and encourage working/interest groups to contribute examples as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point much. IMHO both forms would be acceptable. Simple terms are enough to associate the results, but a local vocabulary URI + tag allows linking the term to definitions and relationships, so it is richer. Examples will be given in the text. FrancoisBonnarel - 2023-08-18
  1. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Possible backward compatibility drawbacks in VOSpace (VOSpace implementation can use a DataLink to reference data location):

  • new VOTable columns content_qualifier, local_semantics, link_auth and link_authorized (pp. 15, 16) could break backward compatibility.
  • pg. 17: it is stated "From version 1.1 of this standard the {links} response must include this INFO ...."
  • pg. 24: in "Example service descriptor for VOSpace 2.0", attributes "datatype" and "arraysize" are added to <PARAM>

    -- SaraBertocco - 2023-11-10

Registry Working Group

No particular remark pertaining to Registry standards.

Semantics Working Group

No issue for Semantics at this point.

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done for consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (name="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this. See next PR. -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15

Radio Astronomy Interest Group

Solar System Interest Group

Added:
>
>
Nothing to add.

-- Anne Raugh - 2023-11-10

 

Theory Interest Group

Time Domain Interest Group

Just one suggestion, perhaps it would be a good idea to explain the notation with braces (e.g. {link}) in the "Conformance-related definitions" section (page 3).

PierreFernique - 2024-11-08

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps *      
DAL *      
DM        
GWS        
Registry *      
Semantics *      
DCP *      
Edu        
KDIG *      
Ops *      
Radio        
Changed:
<
<
SSIG        
>
>
SSIG *      
 
Theory        
TD *      
<nop>StdProc        



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 292023-11-10 - SaraBertocco

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

By way of example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays additional features in service descriptors and makes use of additional links table FIELDs such as content_qualifier, authorization and local_semantics. The tool behavior is adapted to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it makes use of link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links when they are not authorized and the request would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

Added:
>
>
 
   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When datalink links response is hooked to table rows outside the context of ObsTAP /SIA2 how do we generate/recognize the DataLink URL ?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it is perfectly correct under the VOTable standard. The FIELD itself should not be described by a datalink UCD because it is generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD, and where the URL is ad hoc and should be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is, for example, adapted to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there \rfcshould\ be at least one row for that ID value and an error\_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID of its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]
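To make the concern in (3) concrete, here is a minimal sketch of a multi-ID client loop of the kind described. It is not pyvo's actual code: `fetch_all`, the `query` callable, the row dictionaries and the `max_rounds` guard are all hypothetical. The loop drops an ID from its pending list only once at least one row has come back for it, so a service that silently omits a failing ID would be re-queried indefinitely without the guard.

```python
def fetch_all(ids, query, max_rounds=10):
    """Query a multi-ID links service until every ID has been answered.

    `query` is a hypothetical callable that takes a list of IDs and returns
    a list of row dicts, possibly truncated by an overflow.
    """
    pending = list(ids)
    results = []
    rounds = 0
    while pending:
        if rounds == max_rounds:
            raise RuntimeError(f"no rows returned for: {pending}")
        rounds += 1
        rows = query(pending)
        results.extend(rows)
        # An ID counts as answered only if at least one row mentions it.
        answered = {row["ID"] for row in rows}
        pending = [i for i in pending if i not in answered]
    return results


def compliant(ids):
    """Hypothetical service: at most two rows per call, error rows for failures."""
    return [
        {"ID": i, "error_message": "NotFoundFault" if i == "bad" else None}
        for i in ids[:2]
    ]


def silent(ids):
    """Hypothetical service that returns nothing at all for a failing ID."""
    return [{"ID": i, "error_message": None} for i in ids if i != "bad"]


print(len(fetch_all(["a", "bad", "c"], compliant)))
try:
    fetch_all(["a", "bad", "c"], silent)
except RuntimeError as exc:
    print("gave up:", exc)
```

With the compliant service the loop terminates after two calls despite the overflow; with the silent one it only stops because of the artificial round limit.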

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"dereferenceable" was used in the sense that the URL can be fully accessed by HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter, the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. -- FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14
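As an illustration of the distinction discussed in (4), the following sketch separates a fragment from an access URL using Python's standard library. The URL is made up for the example; the point is that the query string remains part of what is requested over HTTP, while the fragment is left for the client to interpret according to the media type.

```python
from urllib.parse import urldefrag

# Hypothetical access_url with both a query string and a fragment.
access_url = "https://example.org/data/cube.fits?format=full#3"

# urldefrag splits off the fragment; the rest is what is requested over HTTP.
url, fragment = urldefrag(access_url)

print(url)
print(fragment)
```

What the fragment "3" means (for instance a FITS extension) is not decided by this code; that depends on the media type of the retrieved resource.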

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content\_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The autodescription motivation may be explained earlier in the section. For "adhoc:this" I remember Pat advocating for this. If we motivate earlier then we can restrict to a pure example here. FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
DataLink is used, and its usage will increase, for external web services such as simulated data and output formats that are not IVOA standards (HAPI time series, OGC formats, ...).

Suggestions: update the DataLink page with examples of implementations; refer to the DataLink page in the document; encourage working/interest groups to contribute examples as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point much. IMHO both forms would be acceptable: simple terms are enough to associate the results, but a local vocabulary URI + tag allows the term to be linked to definitions and relationships, so it is richer. Examples will be given in the text. -- FrancoisBonnarel - 2023-08-18
  2. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Possible backward compatibility drawbacks in VOSpace (VOSpace implementation can use a DataLink to reference data location):

  • the new VOTable columns content_qualifier, local_semantics, link_auth and link_authorized (pp. 15, 16) could break backward compatibility.
  • p. 17: it is stated "From version 1.1 of this standard the {links} response must include this INFO ..."
Changed:
<
<
  • pg.24 in "Example service descriptor for VOSpace 2.0, attributes "datatype" and "arraysize" are added to <PARAM>
>
>
  • pg.24 in "Example service descriptor for VOSpace 2.0, attributes "datatype" and "arraysize" are added to <PARAM>

    -- SaraBertocco - 2023-11-10

 

Registry Working Group

No particular remark pertaining to Registry standards.

Semantics Working Group

No issue for Semantics at this point.

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done in consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (name="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this. see next PR -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15
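The first two points above (a minor version in the standardID, and the INFO placed in the results RESOURCE) can be sketched from a validator's point of view. This is only an illustration, not code from datalinklint or any other validator: the sample document, the `links_version` helper and the "#links-1.1" value (the form proposed in the comment, not a confirmed registry key) are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed {links} response.
SAMPLE = """<VOTABLE version="1.4">
  <RESOURCE type="results">
    <INFO name="standardID" value="ivo://ivoa.net/std/DataLink#links-1.1"/>
    <TABLE/>
  </RESOURCE>
</VOTABLE>"""

PREFIX = "ivo://ivoa.net/std/datalink#links-"


def local_name(tag):
    """Strip an XML namespace, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]


def links_version(votable_text):
    """Return the version suffix of the standardID INFO, or None if absent.

    Only INFO elements that are direct children of a RESOURCE with
    type="results" are considered.
    """
    root = ET.fromstring(votable_text)
    for resource in root.iter():
        if local_name(resource.tag) != "RESOURCE" or resource.get("type") != "results":
            continue
        for child in resource:
            if local_name(child.tag) == "INFO" and child.get("name") == "standardID":
                value = child.get("value", "")
                # Compare without regard to case before reading the suffix.
                if value.lower().startswith(PREFIX):
                    return value[len(PREFIX):]
    return None


print(links_version(SAMPLE))
```

If the standardID stayed at "#links-1.0" for every 1.* service, the function would return the same value for 1.0 and 1.1 responses, and a validator could not tell them apart this way.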

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Just one suggestion, perhaps it would be a good idea to explain the notation with braces (e.g. {link}) in the "Conformance-related definitions" section (page 3).

PierreFernique - 2024-11-08

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

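| Group | Yes | No | Abstain | Comments |
| TCG | | | | |
| Apps | * | | | |
| DAL | * | | | |
| DM | | | | |
| GWS | | | | |
| Registry | * | | | |
| Semantics | * | | | |
| DCP | * | | | |
| Edu | | | | |
| KDIG | * | | | |
| Ops | * | | | |
| Radio | | | | |
| SSIG | | | | |
| Theory | | | | |
| TD | * | | | |
| StdProc | | | | |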




META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 28 2023-11-10 - BaptisteCecconi

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see link_authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal calls the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links that they are not authorized to use and that would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When datalink links response is hooked to table rows outside the context of ObsTAP /SIA2 how do we generate/recognize the DataLink URL ?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it is perfectly correct under the VOTable standard. The FIELD itself should not be described by a datalink UCD because it is generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD, and where the URL is ad hoc and should be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is, for example, adapted to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there \rfcshould\ be at least one row for that ID value and an error\_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID of its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"dereferenceable" was used in the sense that the URL can be fully accessed by HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter, the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. -- FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content\_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The autodescription motivation may be explained earlier in the section. For "adhoc:this" I remember Pat advocating for this. If we motivate earlier then we can restrict to a pure example here. FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
DataLink is used, and its usage will increase, for external web services such as simulated data and output formats that are not IVOA standards (HAPI time series, OGC formats, ...).

Suggestions: update the DataLink page with examples of implementations; refer to the DataLink page in the document; encourage working/interest groups to contribute examples as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point much. IMHO both forms would be acceptable: simple terms are enough to associate the results, but a local vocabulary URI + tag allows the term to be linked to definitions and relationships, so it is richer. Examples will be given in the text. -- FrancoisBonnarel - 2023-08-18
  2. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Possible backward compatibility drawbacks in VOSpace (VOSpace implementation can use a DataLink to reference data location):

  • the new VOTable columns content_qualifier, local_semantics, link_auth and link_authorized (pp. 15, 16) could break backward compatibility.
  • p. 17: it is stated "From version 1.1 of this standard the {links} response must include this INFO ..."
  • p. 24: in "Example service descriptor for VOSpace 2.0", attributes "datatype" and "arraysize" are added to <PARAM>

Registry Working Group

No particular remark pertaining to Registry standards.

Semantics Working Group

Added:
>
>
No issue for Semantics at this point.
 

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done in consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (name="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this. see next PR -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.
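
To make the location question on Section 3.3.1 concrete, here is a minimal Python sketch that lists every standardID INFO found as a direct child of a RESOURCE, together with that RESOURCE's type attribute. The sample document and the helper names are invented for illustration; the standardID value is the one quoted from Section 2.2 above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed links response used only for illustration.
SAMPLE = """<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" version="1.3">
  <RESOURCE type="results">
    <INFO name="standardID" value="ivo://ivoa.net/std/DataLink#links-1.0"/>
    <TABLE/>
  </RESOURCE>
  <RESOURCE type="meta" utype="adhoc:service"/>
</VOTABLE>"""


def local_name(tag):
    """Strip an XML namespace of the form {uri} from a tag name."""
    return tag.rsplit("}", 1)[-1]


def standard_id_infos(votable_xml):
    """Return (resource type, standardID value) for each standardID INFO
    that is a direct child of a RESOURCE element."""
    root = ET.fromstring(votable_xml)
    found = []
    for element in root.iter():
        if local_name(element.tag) != "RESOURCE":
            continue
        for child in element:
            if local_name(child.tag) == "INFO" and child.get("name") == "standardID":
                found.append((element.get("type"), child.get("value")))
    return found


print(standard_id_infos(SAMPLE))
```

A validator built along these lines could report an INFO placed outside the results RESOURCE, once the document says whether that is allowed.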

-- MarkTaylor - 2023-05-15

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Just one suggestion, perhaps it would be a good idea to explain the notation with braces (e.g. {link}) in the "Conformance-related definitions" section (page 3).

PierreFernique - 2024-11-08

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps *      
DAL *      
DM        
GWS        
Registry *      
Changed:
<
<
Semantics        
>
>
Semantics *      
 
DCP *      
Edu        
KDIG *      
Ops *      
Radio        
SSIG        
Theory        
TD *      
StdProc        




META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 27 - 2023-11-10 - RenaudSavalle

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at:

Detailed discussion towards 1.1 can also be found on this IVOA TWiki page (last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO has implemented the following changes: content_qualifier and local_semantics, plus additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, the GAVO server provides a couple of links responses for various obscore/sia/ssa tables.

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, the authorization fields and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal calls the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links that they are not authorized to use and whose requests would be rejected later.
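
The way a client might act on link_auth and link_authorized can be sketched in a few lines of Python. This is an illustration only, not the AdvancedSearch code: the function name and the returned action labels are hypothetical, and the sketch assumes link_auth takes one of the string values "false", "optional" or "true" while link_authorized is a boolean.

```python
def download_action(link_auth, link_authorized):
    """Decide what a client UI could offer for one links row.

    link_auth: assumed to be "false", "optional" or "true".
    link_authorized: True if the current caller may use the link.
    Returns a hypothetical action label: "download", "login" or "hide".
    """
    if link_authorized:
        # The caller is allowed to use the link right now.
        return "download"
    if link_auth in ("optional", "true"):
        # Authenticating might grant access, so offer a login instead.
        return "login"
    # Not authorized and no way to authenticate: do not offer the link.
    return "hide"


# Public IRIS image: link_auth=optional, link_authorized=true
print(download_action("optional", True))
# Proprietary CFHT data, anonymous caller: link_auth=optional, link_authorized=false
print(download_action("optional", False))
```

The two calls mirror the IRIS and CFHT examples given in the server-side section.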

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When a datalink links response is hooked to table rows outside the context of ObsTAP/SIA2, how do we generate/recognize the DataLink URL?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it's quite correct VOTable usage. The FIELD itself should not be described by a datalink ucd because it's probably generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD, but is ad hoc and should itself be the content of a FIELD. Using the same utypes as the ones used in ObsCore responses seems reasonable. This is for example suited to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID of its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]
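
The problem in (3) can be illustrated with a small Python sketch of a batching client. This is not pyvo's actual code: the function names, the row layout and the two fake services are hypothetical, and the "no progress" guard is added here only so that the example terminates.

```python
def resolve_ids(ids, query_links, batch_size=2):
    """Query a links service in batches until every ID has been answered.

    query_links is any callable taking a list of IDs and returning a list
    of row dicts with at least an "ID" key. An ID is dropped from the
    pending list only once at least one row comes back for it.
    """
    pending = list(ids)
    rows = []
    while pending:
        batch = pending[:batch_size]
        result = query_links(batch)
        answered = {row["ID"] for row in result}
        if not answered & set(batch):
            # Without this guard the loop would re-request the batch forever.
            raise RuntimeError("no rows returned for: %s" % ", ".join(batch))
        rows.extend(result)
        pending = [i for i in pending if i not in answered]
    return rows


def conforming_service(batch):
    """Fake service: one row per ID, with an error_message for failures."""
    return [{"ID": i, "error_message": "NotFoundFault" if i == "bad" else None}
            for i in batch]


def silent_service(batch):
    """Fake service: silently drops the failing ID."""
    return [{"ID": i, "error_message": None} for i in batch if i != "bad"]


print(len(resolve_ids(["a", "bad", "c"], conforming_service)))
```

With conforming_service the loop finishes after two requests; with silent_service the ID "bad" never leaves the pending list, which is the behaviour an unconditional MUST would rule out.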

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest replacing the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"dereferenceable" was used in the sense that it can be fully accessed by HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. -- FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14
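
For readers unsure what "the client interprets the fragment" means in practice, here is a minimal Python sketch using the standard library. The URL is made up, and reading the fragment as a FITS extension name is only an assumption for this example, since the interpretation depends on the media type.

```python
from urllib.parse import urldefrag


def split_access_url(access_url):
    """Split an access_url into the part sent to the server and the
    fragment that the client must interpret itself (None if absent)."""
    url, fragment = urldefrag(access_url)
    return url, fragment or None


# Hypothetical access_url: the query string stays part of the request,
# only the fragment is handled on the client side.
print(split_access_url("https://example.org/data/cube.fits?band=K#ext2"))
print(split_access_url("https://example.org/data/cube.fits"))
```

Note that the query string is kept as part of the request URL; only the fragment is left for the client to interpret.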

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14
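
The continuum/line cube use case in (6) could look like this on the client side. This is a sketch under stated assumptions: the function name, the row dictionaries and the qualifier values "continuum" and "line" are invented for illustration and are not taken from any vocabulary.

```python
def pick_link(rows, semantics, content_type, preferred_qualifier):
    """Pick the row matching semantics and content_type, preferring the
    one whose content_qualifier equals the user's configured choice."""
    candidates = [row for row in rows
                  if row["semantics"] == semantics
                  and row["content_type"] == content_type]
    for row in candidates:
        if row.get("content_qualifier") == preferred_qualifier:
            return row
    # Fall back to the first candidate if no qualifier matches.
    return candidates[0] if candidates else None


# Hypothetical links rows for one dataset: same semantics and content_type,
# distinguished only by content_qualifier.
ROWS = [
    {"access_url": "https://example.org/cont.fits", "semantics": "#this",
     "content_type": "application/fits", "content_qualifier": "continuum"},
    {"access_url": "https://example.org/line.fits", "semantics": "#this",
     "content_type": "application/fits", "content_qualifier": "line"},
]

print(pick_link(ROWS, "#this", "application/fits", "line")["access_url"])
```

Applying the same preferred qualifier to each new dataset keeps the user on the line cube even though semantics and content_type cannot tell the two rows apart.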

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The autodescription motivation may be explained earlier in the section. For "adhoc:this" I remember Pat advocating for this. If we motivate earlier then we can restrict to a pure example here. FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
DataLink is used, and its usage will increase, for external web services such as simulated data and for output formats that are not IVOA formats (HAPI time series, OGC formats ...).

Maybe update the DataLink page with examples of implementations,
refer to the DataLink page in the document,
and encourage working/interest groups to add examples as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point much. IMHO both forms would be acceptable. Simple terms are enough to associate the results, but local vocab URI + tag allows linking the term to definitions and relationships, so it's richer. Examples will be given in the text. -- FrancoisBonnarel - 2023-08-18
  2. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Possible backward compatibility drawbacks in VOSpace (VOSpace implementation can use a DataLink to reference data location):

  • the new VOTable columns content_qualifier, local_semantics, link_auth and link_authorized (pp. 15, 16) could break backward compatibility.
  • on p. 17 it is stated: "From version 1.1 of this standard the {links} response must include this INFO ...".
  • on p. 24, in "Example service descriptor for VOSpace 2.0", the attributes "datatype" and "arraysize" are added to <PARAM>.

Registry Working Group

Added:
>
>
No particular remark pertaining to Registry standards.
 

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done for consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (type="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this. See next PR. -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Just one suggestion, perhaps it would be a good idea to explain the notation with braces (e.g. {link}) in the "Conformance-related definitions" section (page 3).

PierreFernique - 2024-11-08

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps *      
DAL *      
DM        
GWS        
Changed:
<
<
Registry        
>
>
Registry *      
 
Semantics        
DCP *      
Edu        
KDIG *      
Ops *      
Radio        
SSIG        
Theory        
TD *      
StdProc        




META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 26 - 2023-11-10 - SaraBertocco

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at:

Detailed discussion towards 1.1 can also be found on this IVOA TWiki page (last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO has implemented the following changes: content_qualifier and local_semantics, plus additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, the GAVO server provides a couple of links responses for various obscore/sia/ssa tables.

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, the authorization fields and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal calls the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links that they are not authorized to use and whose requests would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When a datalink links response is hooked to table rows outside the context of ObsTAP/SIA2, how do we generate/recognize the DataLink URL?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it's quite correct VOTable usage. The FIELD itself should not be described by a datalink ucd because it's probably generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD, but is ad hoc and should itself be the content of a FIELD. Using the same utypes as the ones used in ObsCore responses seems reasonable. This is for example suited to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID of its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest replacing the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"dereferenceable" was used in the sense that it can be fully accessed by HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. -- FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The motivation for self-description could be explained earlier in the section. As for "adhoc:this", I remember Pat advocating for it. If we motivate it earlier, then we can restrict this section to a pure example. FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
DataLink is already in use, and its usage will increase for external web services such as simulated data and for output formats that are not defined by the IVOA (HAPI time series, OGC formats, ...).

Maybe change the DataLink page to include examples of implementations, refer to the DataLink page in the document, and encourage working and interest groups to contribute examples as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? Or the URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point very much. IMHO both forms would be acceptable: simple terms are enough to associate the results, but a local vocabulary URI + tag allows linking the term to definitions and relationships, so it is richer. Examples will be given in the text. FrancoisBonnarel - 2023-08-18
  1. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Added:
>
>
Possible backward compatibility drawbacks in VOSpace (a VOSpace implementation can use a DataLink to reference a data location):
  • the new VOTable columns content_qualifier, local_semantics, link_auth and link_authorized (pp. 15, 16) could break backward compatibility.
  • on p. 17 it is stated: "From version 1.1 of this standard the {links} response must include this INFO ...."
  • on p. 24, in "Example service descriptor for VOSpace 2.0", the attributes "datatype" and "arraysize" are added to <PARAM>.
 

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done for consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (type="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this; see the next PR. -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15
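As an illustration of the INFO placement question raised above, the following Python sketch shows how a validator might distinguish a standardID INFO that is a direct child of the results RESOURCE from one found elsewhere in the VOTable. The sample document is a stripped-down assumption, the INFO is assumed to carry name="standardID", the value shown is simply the "#links-1.1" form proposed in the comment, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Minimal, made-up {links} response skeleton (no FIELDs or data).
VOTABLE = """<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" version="1.3">
  <RESOURCE type="results">
    <INFO name="standardID" value="ivo://ivoa.net/std/DataLink#links-1.1"/>
    <TABLE/>
  </RESOURCE>
</VOTABLE>"""


def local(tag):
    """Strip the XML namespace from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def is_standard_id_info(element):
    return local(element.tag) == "INFO" and element.get("name") == "standardID"


def find_standard_ids(votable_xml):
    """Return (in_results, elsewhere) lists of standardID values."""
    root = ET.fromstring(votable_xml)
    in_results, seen = [], set()
    for res in root.iter():
        if local(res.tag) == "RESOURCE" and res.get("type") == "results":
            for child in res:  # direct children only
                if is_standard_id_info(child):
                    in_results.append(child.get("value"))
                    seen.add(id(child))
    elsewhere = [
        el.get("value")
        for el in root.iter()
        if is_standard_id_info(el) and id(el) not in seen
    ]
    return in_results, elsewhere


print(find_standard_ids(VOTABLE))
# (['ivo://ivoa.net/std/DataLink#links-1.1'], [])
```

Whether a non-empty second list should count as an error is exactly the point the document would need to settle.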

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Just one suggestion, perhaps it would be a good idea to explain the notation with braces (e.g. {link}) in the "Conformance-related definitions" section (page 3).

PierreFernique - 2024-11-08

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps *      
DAL *      
DM        
GWS        
Registry        
Semantics        
DCP *      
Edu        
KDIG *      
Ops *      
Radio        
SSIG        
Theory        
TD *      
<nop>StdProc        



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 25 - 2023-11-09 - TamaraCivera

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

By way of example, there are a couple of links responses for various ObsCore/SIA/SSA tables on the GAVO server.

All these are combined with various semantics or content_type values.

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see link_authorized=true. It's hard to demonstrate that for a general audience.
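The two CADC cases above can be summarized in a small Python sketch of how a client might turn link_auth and link_authorized into a user-facing decision. This is only an illustration: the function name and the returned strings are hypothetical, and the handling of link_auth values other than "optional" (the only value shown above) is an assumption.

```python
def describe_link(link_auth, link_authorized):
    """Map the two optional columns of a links row to a client-side hint."""
    if link_authorized:
        # The current caller (anonymous or not) may use the link.
        return "offer download"
    if link_auth in ("optional", "mandatory"):  # "mandatory" is assumed here
        # Authenticating might help, but this caller is not authorized now.
        return "suggest authenticating"
    return "hide download"


# IRIS image: public data, authentication possible but not needed.
print(describe_link("optional", True))   # offer download
# Proprietary CFHT data requested anonymously.
print(describe_link("optional", False))  # suggest authenticating
```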

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, the authorization fields and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links for which they are not authorized and whose requests would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules for how to compare them in which situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.
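As a sketch of what "disregard the minor version for normal operations" could mean for a client, the following Python compares two standardIDs by base ivoid, key and major version only. The fragment layout "key-major[.minor]" is inferred from the examples in this discussion, the case-insensitive comparison is an assumption, and all function names are hypothetical.

```python
import re

# Fragment such as "links-1.0", "links-1.1" or "links-1".
_FRAGMENT = re.compile(r"^(?P<key>.+)-(?P<major>\d+)(?:\.(?P<minor>\d+))?$")


def parse_standard_id(standard_id):
    """Split a standardID into (base ivoid, key, major version)."""
    base, _, fragment = standard_id.partition("#")
    match = _FRAGMENT.match(fragment)
    if match is None:
        raise ValueError(f"unexpected standardID fragment: {fragment!r}")
    return base.lower(), match["key"].lower(), int(match["major"])


def same_major_version(standard_id_a, standard_id_b):
    """True if both IDs name the same standard, key and major version."""
    return parse_standard_id(standard_id_a) == parse_standard_id(standard_id_b)


print(same_major_version("ivo://ivoa.net/std/DataLink#links-1.0",
                         "ivo://ivoa.net/std/DataLink#links-1.1"))  # True
print(same_major_version("ivo://ivoa.net/std/DataLink#links-1",
                         "ivo://ivoa.net/std/DataLink#links-2.0"))  # False
```

With a rule like this, a validator could still read the minor version from the fragment, while ordinary clients would accept any 1.x endpoint.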

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

Deleted:
<
<
 
   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When a DataLink links response is hooked to table rows outside the context of ObsTAP/SIA2, how do we generate or recognize the DataLink URL?

Of course we can use the service descriptor with the single ID parameter if the DataLink URL can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it is perfectly valid VOTable. The FIELD itself should not be described by a datalink UCD because it is probably, in general, an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD but is ad hoc and should itself be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is, for example, suited to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there \rfcshould\ be at least one row for that ID value and an error\_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID from its list of IDs to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]
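The client behaviour described in (3) can be sketched in Python as follows. This is not pyvo's actual code, only an illustration of the pattern: IDs are queried in batches and an ID leaves the pending list only once at least one row has come back for it. The function names, the row layout and the two fake services are all hypothetical.

```python
def resolve_all(ids, query_links, batch_size=2, max_rounds=10):
    """Query IDs in batches; return (rows, ids that never got a row)."""
    pending = list(ids)
    rows = []
    rounds = 0
    # max_rounds is a safety net; without it a silent service loops forever.
    while pending and rounds < max_rounds:
        rounds += 1
        result = query_links(pending[:batch_size])
        rows.extend(result)
        answered = {row["ID"] for row in result}
        pending = [i for i in pending if i not in answered]
    return rows, pending


def compliant_service(batch):
    """Returns one row per ID, using error_message for the failing one."""
    return [{"ID": i, "error_message": "NotFoundFault" if i == "bad" else None}
            for i in batch]


def silent_service(batch):
    """Drops the failing ID without returning any row for it."""
    return [{"ID": i, "error_message": None} for i in batch if i != "bad"]


print(resolve_all(["a", "bad", "b"], compliant_service)[1])  # []
print(resolve_all(["a", "bad", "b"], silent_service)[1])     # ['bad']
```

With the silent service the failing ID stays pending in every round, which is the repeated-request problem that the proposed MUST is meant to rule out.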

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"Dereferenceable" was used in the sense that the URL can be fully accessed over HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter, the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content\_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14
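The continuum/line cube use case in (6) could look roughly like this on the client side. The Python sketch below picks, among rows sharing the same semantics, the one whose content_qualifier matches a user preference. The row values (including the qualifier terms "continuum-cube" and "line-cube") and the function name are made up for illustration and are not taken from any vocabulary or implementation.

```python
def pick_link(rows, semantics, preferred_qualifier):
    """Return the row matching the preferred content_qualifier, if any."""
    candidates = [row for row in rows if row["semantics"] == semantics]
    for row in candidates:
        if row.get("content_qualifier") == preferred_qualifier:
            return row
    # Fall back to the first candidate when no qualifier matches.
    return candidates[0] if candidates else None


# Two links that agree in semantics and content_type.
rows = [
    {"access_url": "https://example.org/cont.fits", "semantics": "#this",
     "content_type": "application/fits", "content_qualifier": "continuum-cube"},
    {"access_url": "https://example.org/line.fits", "semantics": "#this",
     "content_type": "application/fits", "content_qualifier": "line-cube"},
]

print(pick_link(rows, "#this", "line-cube")["access_url"])
# https://example.org/line.fits
```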

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The motivation for self-description could be explained earlier in the section. As for "adhoc:this", I remember Pat advocating for it. If we motivate it earlier, then we can restrict this section to a pure example. FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
DataLink is already in use, and its usage will increase for external web services such as simulated data and for output formats that are not defined by the IVOA (HAPI time series, OGC formats, ...).

Maybe change the DataLink page to include examples of implementations, refer to the DataLink page in the document, and encourage working and interest groups to contribute examples as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? Or the URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point very much. IMHO both forms would be acceptable: simple terms are enough to associate the results, but a local vocabulary URI + tag allows linking the term to definitions and relationships, so it is richer. Examples will be given in the text. FrancoisBonnarel - 2023-08-18
  1. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done for consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (type="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this; see the next PR. -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Just one suggestion, perhaps it would be a good idea to explain the notation with braces (e.g. {link}) in the "Conformance-related definitions" section (page 3).

PierreFernique - 2024-11-08

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps *      
DAL *      
DM        
GWS        
Registry        
Semantics        
DCP *      
Edu        
KDIG *      
Changed:
<
<
Ops        
>
>
Ops *      
 
Radio        
SSIG        
Theory        
TD *      
<nop>StdProc        



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 24 - 2023-11-08 - PierreFernique

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

By way of example, there are a couple of links responses for various ObsCore/SIA/SSA tables on the GAVO server.

All these are combined with various semantics or content_type values.

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see link_authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, the authorization fields and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links when they are not authorized and the request would be rejected later.
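The decision described for the AdvancedSearch portal can be sketched as a small function over the two new columns. This is only an illustration, not the CADC code: the function name `offer_download` is hypothetical, and the fallback used when link_authorized is empty is an assumption made for this sketch.

```python
from typing import Optional


def offer_download(link_auth: Optional[str], link_authorized: Optional[bool]) -> bool:
    """Decide whether a client should display a download option for one links row.

    link_auth is the string from the links table (the examples above show "optional");
    link_authorized is the boolean column, or None when the service left it empty.
    """
    if link_authorized is not None:
        # The service has already told us whether this caller may use the link.
        return link_authorized
    # Assumption for this sketch: with no link_authorized value, hide the link
    # only when the row says authentication is strictly required.
    return link_auth != "true"


# The two CADC examples above, as seen by an anonymous caller.
iris_public = offer_download("optional", True)        # public IRIS image
cfht_proprietary = offer_download("optional", False)  # proprietary CFHT data
```

With these inputs the public IRIS row is offered and the proprietary CFHT row is hidden, matching the behavior described above.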

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

Added:
>
>
 
   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When a DataLink links response is hooked to table rows outside the context of ObsTAP/SIA2, how do we generate/recognize the DataLink URL?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it is perfectly correct with respect to the VOTable standard. The FIELD itself should not be described by a datalink UCD because it is probably generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD and where the URL is ad hoc and should be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is for example adapted to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID of its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]
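The client behaviour described in (3) can be illustrated with a minimal sketch. This is not the pyvo source: `resolve_all`, `compliant_service` and `silent_service` are hypothetical names, the services are in-memory stand-ins, and `max_rounds` exists only so that the sketch terminates.

```python
def resolve_all(ids, fetch, max_rounds=10):
    """Query a multi-ID links service until every ID has been answered.

    An ID is dropped from the pending list only when at least one row
    comes back for it, which is the behaviour described in the comment.
    """
    pending = list(ids)
    rows = []
    rounds = 0
    while pending and rounds < max_rounds:
        batch = fetch(pending)
        rows.extend(batch)
        answered = {row["ID"] for row in batch}
        pending = [i for i in pending if i not in answered]
        rounds += 1
    return rows, pending, rounds


def compliant_service(ids):
    """Stand-in service: returns an error_message row for every failing ID."""
    return [
        {"ID": i, "access_url": None, "error_message": "NotFoundFault: " + i}
        if i.startswith("bad")
        else {"ID": i, "access_url": "https://example.org/data/" + i, "error_message": None}
        for i in ids
    ]


def silent_service(ids):
    """Stand-in service: silently returns no row at all for failing IDs."""
    return [row for row in compliant_service(ids) if row["error_message"] is None]


ok_rows, ok_pending, ok_rounds = resolve_all(["a", "bad-1"], compliant_service)
bad_rows, bad_pending, bad_rounds = resolve_all(["a", "bad-1"], silent_service)
```

Against the compliant stand-in the loop finishes in one round; against the silent one the failing ID stays pending and is requested again in every round until the artificial limit is hit.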

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"dereferenceable" was used in the sense that the URL can be fully accessed by HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. -- FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14
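To make the fragment discussion concrete, here is a minimal sketch of how a client might separate the part of an access_url that is sent over HTTP from the fragment it has to interpret itself. The helper name `split_access_url` and the example URL are made up for illustration; `urldefrag` is the Python standard library function.

```python
from urllib.parse import urldefrag


def split_access_url(access_url):
    """Return (url_to_fetch, fragment); fragment is None when absent.

    The query string stays part of the URL that is fetched; only the
    fragment is held back for client-side interpretation, which depends
    on the media type of the retrieved document.
    """
    url, fragment = urldefrag(access_url)
    return url, (fragment or None)


# Hypothetical link: a FITS file behind a query string, with a fragment
# that a client could interpret as an extension inside the file.
fetch_url, fragment = split_access_url("https://example.org/data/obs.fits?run=3#2")
```

Note that the query string survives the split, so handling fragments this way places no restriction on query strings in access URLs.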

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14
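The use case in (6) can be sketched as a selection rule on the client side. This is an illustration only: `pick_similar_link` is a hypothetical helper, the order in which columns are tried is an assumption, and the content_qualifier values below are made up rather than taken from any vocabulary.

```python
def pick_similar_link(rows, previous):
    """Pick the row of a new dataset that best matches a previously chosen link.

    rows and previous are dicts keyed by links-table column names.
    local_semantics is tried first, then content_qualifier, and finally
    the (semantics, content_type) pair as a coarse fallback.
    """
    for key in ("local_semantics", "content_qualifier"):
        wanted = previous.get(key)
        if wanted is None:
            continue
        matches = [row for row in rows if row.get(key) == wanted]
        if matches:
            return matches[0]
    coarse = (previous.get("semantics"), previous.get("content_type"))
    for row in rows:
        if (row.get("semantics"), row.get("content_type")) == coarse:
            return row
    return None


# Two links that agree on semantics and content_type and differ only in
# content_qualifier (values are placeholders for this sketch).
new_dataset = [
    {"access_url": "https://example.org/d2/cont.fits", "semantics": "#this",
     "content_type": "application/fits", "content_qualifier": "continuum-cube"},
    {"access_url": "https://example.org/d2/line.fits", "semantics": "#this",
     "content_type": "application/fits", "content_qualifier": "line-cube"},
]
last_choice = {"semantics": "#this", "content_type": "application/fits",
               "content_qualifier": "line-cube"}
chosen = pick_similar_link(new_dataset, last_choice)
```

Without the qualifier the two rows would be indistinguishable to the client, and the fallback would simply return the first one.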

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The self-description motivation may be explained earlier in the section. For "adhoc:this", I remember Pat advocating for it. If we motivate it earlier, then we can restrict this section to a pure example. -- FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
DataLink is used, and its usage will increase, for external web services such as simulated data and for output formats that are not defined in IVOA (HAPI time series, OGC formats ...).

Maybe update the DataLink page with examples of implementations,
refer to the DataLink page in the document, and
encourage working/interest groups to add examples as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point much. IMHO both forms would be acceptable: simple terms are enough to associate the results, but a local vocabulary URI + tag allows linking the term to definitions and relationships, so it is richer. Examples will be given in the text. -- FrancoisBonnarel - 2023-08-18
  2. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done in consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (name="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this. see next PR -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15
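As an illustration of the validator point above, the following sketch looks for the standardID INFO as a child of the results RESOURCE and reads a version from it. It assumes the "...#links-1.1" form proposed in the comment, which is not what the reviewed draft defines; the function names and the embedded sample document are made up for this sketch.

```python
import re
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace of the form {uri} from an element tag."""
    return tag.rsplit("}", 1)[-1]


def results_standard_ids(votable_xml):
    """Collect standardID INFO values that are direct children of a
    RESOURCE with type="results", the placement shown in the example
    of Section 3.3.1."""
    root = ET.fromstring(votable_xml)
    found = []
    for element in root.iter():
        if local_name(element.tag) == "RESOURCE" and element.get("type") == "results":
            for child in element:
                if local_name(child.tag) == "INFO" and child.get("name") == "standardID":
                    found.append(child.get("value"))
    return found


def links_version(standard_id):
    """Return (major, minor) from a links standardID, or None if it does not match."""
    match = re.fullmatch(r"ivo://ivoa\.net/std/DataLink#links-(\d+)\.(\d+)",
                         standard_id, flags=re.IGNORECASE)
    return (int(match.group(1)), int(match.group(2))) if match else None


# Made-up minimal links response using the proposed 1.1 identifier.
SAMPLE = """<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" version="1.3">
  <RESOURCE type="results">
    <INFO name="standardID" value="ivo://ivoa.net/std/DataLink#links-1.1"/>
  </RESOURCE>
</VOTABLE>"""

ids = results_standard_ids(SAMPLE)
versions = [links_version(value) for value in ids]
```

If the INFO were allowed elsewhere in the VOTable, the inner loop would have to search the whole document instead, which is one reason the location needs to be stated explicitly.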

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Deleted:
<
<
 

Time Domain Interest Group

Added:
>
>
Just one suggestion, perhaps it would be a good idea to explain the notation with braces (e.g. {link}) in the "Conformance-related definitions" section (page 3).

PierreFernique - 2024-11-08

 

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

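| Group | Yes | No | Abstain | Comments |
| TCG | | | | |
| Apps | * | | | |
| DAL | * | | | |
| DM | | | | |
| GWS | | | | |
| Registry | | | | |
| Semantics | | | | |
| DCP | * | | | |
| Edu | | | | |
| KDIG | * | | | |
| Ops | | | | |
| Radio | | | | |
| SSIG | | | | |
| Theory | | | | |
| TD | * | | | |
| <nop>StdProc | | | | |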



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 232023-11-08 - RafaelMartinezGalarza

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes = content_qualifier and local_semantics, service descriptor additional features such as DESCRIPTION? name, content_type, etc....

As a matter of example, a couple of links response for various obscore/sia/ssa tables in GAVO server

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

New CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; an authorized user making the same call will see link_authorized=true. This is hard to demonstrate for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool behavior is adapted to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links when they are not authorized and the request would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

Deleted:
<
<
 
   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When a DataLink links response is hooked to table rows outside the context of ObsTAP/SIA2, how do we generate/recognize the DataLink URL?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it is perfectly correct with respect to the VOTable standard. The FIELD itself should not be described by a datalink UCD because it is probably generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD and where the URL is ad hoc and should be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is for example adapted to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID of its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"dereferenceable" was used in the sense that the URL can be fully accessed by HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. -- FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The self-description motivation may be explained earlier in the section. For "adhoc:this", I remember Pat advocating for it. If we motivate it earlier, then we can restrict this section to a pure example. -- FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
DataLink is used, and its usage will increase, for external web services such as simulated data and for output formats that are not defined in IVOA (HAPI time series, OGC formats ...).

Maybe update the DataLink page with examples of implementations,
refer to the DataLink page in the document, and
encourage working/interest groups to add examples as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point much. IMHO both forms would be acceptable: simple terms are enough to associate the results, but a local vocabulary URI + tag allows linking the term to definitions and relationships, so it is richer. Examples will be given in the text. -- FrancoisBonnarel - 2023-08-18
  2. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done for consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (name="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this; see the next PR. -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps *      
DAL *      
DM        
GWS        
Registry        
Semantics        
DCP *      
Edu        
KDIG *      
Ops        
Radio        
SSIG        
Theory        
Changed:
< TD
> TD *
 
<nop>StdProc        



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 22 - 2023-11-08 - RaffaeleDAbrusco

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at:

Detailed discussion towards 1.1 can also be found on this IVOA TWiki page (last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, plus additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, a couple of links responses are available for various ObsCore/SIA/SSA tables on the GAVO server.

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see link_authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in Maven Central with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in the ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links that they are not authorized to use and whose requests would be rejected later.
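
A check of the kind the AdvancedSearch portal makes on link_authorized could look roughly like the following sketch. It assumes the link rows have already been parsed into dictionaries keyed by column name; the function name show_download and the sample rows are hypothetical, and showing the link when link_authorized is absent is an assumption made here, not CADC's documented behavior.

```python
from typing import Mapping, Optional


def show_download(row: Mapping[str, Optional[object]]) -> bool:
    """Return True if the download option should be offered for this link row."""
    authorized = row.get("link_authorized")
    if authorized is None:
        # Assumption: the column is optional, so a missing value tells us nothing
        # and the link is shown; the server makes the final decision anyway.
        return True
    return bool(authorized)


# Hypothetical rows: public data, proprietary data seen anonymously, and a
# response from a service that does not provide the optional columns.
rows = [
    {"ID": "ivo://example.org/public?1", "link_auth": "optional", "link_authorized": True},
    {"ID": "ivo://example.org/proprietary?2", "link_auth": "optional", "link_authorized": False},
    {"ID": "ivo://example.org/legacy?3"},
]
visible = [r["ID"] for r in rows if show_download(r)]
```

Hiding the option is only a convenience for the user; the access decision still rests with the server when the link is followed.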

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.
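
One way a client could disregard the minor version when matching standard IDs is sketched below. This is only an illustration of the idea: the helper names are hypothetical, and it assumes the fragment always has the form key-major or key-major.minor, which the document does not guarantee.

```python
def split_standard_id(standard_id: str):
    """Split e.g. 'ivo://ivoa.net/std/DataLink#links-1.1' into
    (base, key, major, minor); minor is None when absent (as in 'links-1')."""
    base, _, fragment = standard_id.partition("#")
    key, _, version = fragment.rpartition("-")
    major, _, minor = version.partition(".")
    return base, key, int(major), int(minor) if minor else None


def same_capability(a: str, b: str) -> bool:
    """Compare two standard IDs, ignoring the minor version."""
    return split_standard_id(a)[:3] == split_standard_id(b)[:3]


v10 = "ivo://ivoa.net/std/DataLink#links-1.0"
v11 = "ivo://ivoa.net/std/DataLink#links-1.1"
matches = same_capability(v10, v11)
```

A validator, by contrast, could read the fourth element of the tuple to decide which minor version to validate against.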

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When a datalink links response is hooked to table rows outside the context of ObsTAP/SIA2, how do we generate or recognize the DataLink URL?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it is a fairly correct use of the VOTable standard. The FIELD itself should not be described by a datalink UCD because it is probably generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD, i.e. where the URL is ad hoc and should be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is, for example, suited to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID from its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]
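
The batching behaviour described in (3) can be sketched as follows. This is not pyvo's actual code, only a minimal model of the logic as described here; resolve_all, the two fake services and the max_rounds guard are hypothetical names introduced for illustration.

```python
def resolve_all(ids, query, batch_size=2, max_rounds=10):
    """Query `ids` in batches until every ID has produced at least one row.

    `query` takes a list of IDs and returns a list of row dicts with an "ID" key.
    Returns (rows, unresolved); `unresolved` is non-empty only if the service
    never returned a row for some ID before `max_rounds` ran out.
    """
    pending = list(ids)
    rows = []
    for _ in range(max_rounds):
        if not pending:
            break
        result = query(pending[:batch_size])
        rows.extend(result)
        seen = {row["ID"] for row in result}
        # An ID only leaves the queue once a row (link or error) came back for it.
        pending = [i for i in pending if i not in seen]
    return rows, pending


def compliant(batch):
    # Hypothetical service: one row per ID, with an error row for "bad".
    return [{"ID": i, "error_message": "NotFoundFault" if i == "bad" else None}
            for i in batch]


def silent(batch):
    # Hypothetical service that drops failing IDs without an error row.
    return [{"ID": i, "error_message": None} for i in batch if i != "bad"]


ok_rows, ok_pending = resolve_all(["a", "bad", "b"], compliant)
_, stuck = resolve_all(["a", "bad", "b"], silent)
```

With the compliant service the queue empties after two rounds; with the silent one the failing ID stays pending until the round limit stops the loop, which is the endless re-requesting that the proposed MUST is meant to rule out.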

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest replacing the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"dereferenceable" was used in the sense that the URL can be fully accessed by HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. -- FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14
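
As a small illustration of the client-side handling discussed in (4), the sketch below separates the part of an access_url that is sent over HTTP from the fragment the client has to interpret itself. The URL and the helper name split_access_url are hypothetical; urldefrag is the Python standard-library call.

```python
from urllib.parse import urldefrag


def split_access_url(access_url: str):
    """Return (url_to_fetch, fragment) for an access_url that may carry a fragment.

    The fragment is never sent to the server; how to interpret it depends on
    the media type of the retrieved document.
    """
    url, fragment = urldefrag(access_url)
    return url, fragment or None


# Hypothetical access_url with both a query string and a fragment.
fetch_url, fragment = split_access_url("https://example.org/data/cube.fits?cutout=1#SCI")
plain_url, no_fragment = split_access_url("https://example.org/data/preview.png")
```

Note that the query string stays part of the URL to fetch; only the fragment is left for the client to handle.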

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14
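
The use case in (6) could be sketched on the client side as follows. The rows, the qualifier values "continuum" and "line", and the function pick_link are all hypothetical; real content_qualifier values would come from the vocabulary the standard refers to.

```python
def pick_link(rows, semantics, content_type, preferred_qualifier):
    """Pick the link row matching semantics and content_type, preferring the
    row whose content_qualifier equals the user's configured choice."""
    candidates = [r for r in rows
                  if r["semantics"] == semantics and r["content_type"] == content_type]
    for row in candidates:
        if row.get("content_qualifier") == preferred_qualifier:
            return row
    # Fall back to the first candidate when no qualifier matches.
    return candidates[0] if candidates else None


# Hypothetical links for one dataset: two cubes that differ only in qualifier.
rows = [
    {"access_url": "https://example.org/continuum.fits", "semantics": "#this",
     "content_type": "application/fits", "content_qualifier": "continuum"},
    {"access_url": "https://example.org/line.fits", "semantics": "#this",
     "content_type": "application/fits", "content_qualifier": "line"},
]
chosen = pick_link(rows, "#this", "application/fits", "line")
```

Because semantics and content_type agree for both rows, only content_qualifier lets the client keep showing the line cube as the user moves between datasets.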

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The autodescription motivation may be explained earlier in the section. For "adhoc:this" I remember Pat advocating for this. If we motivate earlier then we can restrict to a pure example here. FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
DataLink is used, and its usage will increase, for external web services such as simulated data and for output formats that are not IVOA formats (HAPI Timeseries, OGC formats, ...).

  • Maybe update the DataLink page with examples of implementations.
  • Refer to the DataLink page in the document.
  • Encourage working/interest groups to add examples, as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point much. IMHO both forms would be acceptable: simple terms are enough to associate the results, but a local vocabulary URI + tag allows the term to be linked to definitions and relationships, so it is richer. Examples will be given in the text. -- FrancoisBonnarel - 2023-08-18
  1. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done for consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (name="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this; see the next PR. -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps *      
DAL *      
DM        
GWS        
Registry        
Semantics        
DCP *      
Edu        
Changed:
< KDIG
> KDIG *
 
Ops        
Radio        
SSIG        
Theory        
TD        
<nop>StdProc        



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 21 - 2023-11-08 - GillesLandais

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at:

Detailed discussion towards 1.1 can also be found on this IVOA TWiki page (last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, plus additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, a couple of links responses are available for various ObsCore/SIA/SSA tables on the GAVO server.

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see link_authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in Maven Central with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in the ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links that they are not authorized to use and whose requests would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When a datalink links response is hooked to table rows outside the context of ObsTAP/SIA2, how do we generate or recognize the DataLink URL?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it is a fairly correct use of the VOTable standard. The FIELD itself should not be described by a datalink UCD because it is probably generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD, i.e. where the URL is ad hoc and should be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is, for example, suited to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID from its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest replacing the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"dereferenceable" was used in the sense that the URL can be fully accessed by HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. -- FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The self-description motivation could be explained earlier in the section. As for "adhoc:this", I remember Pat advocating for it. If we give the motivation earlier, we can restrict this section to a pure example. -- FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
DataLink is in use, and usage will increase for external web services such as simulated data and for output formats that are not IVOA formats (HAPI time series, OGC formats, ...).

Suggestions: perhaps update the DataLink page with examples of implementation, refer to that page in the document, and encourage working/interest groups to contribute examples as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is: is it just the tag, or the URI of the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point much. IMHO both forms would be acceptable. Simple terms are enough to associate the results, but local vocabulary URI + tag allows linking the term to definitions and relationships, so it is richer. Examples will be given in the text. -- FrancoisBonnarel - 2023-08-18
  1. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Deleted:
<
<
Some details:
  • in section 2, I guess that the IVOA type of the link is not free text: is it possible to add a link to the authorized words?
  • in section 3, can you explain the parameter "nodeIDfile" in more detail?
  • section5 is not clear for me.
 

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done for consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (type="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this; see next PR. -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15
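
As an illustration of the two standardID points above, the following Python sketch looks for an INFO element named standardID that is a direct child of the RESOURCE with type="results" and reports the version suffix it carries. It is only a sketch under assumptions: the sample document is made up, the "#links-1.1" value reflects the change proposed in this comment rather than a settled decision, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up links response skeleton; the standardID value is the one proposed above.
SAMPLE = """<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" version="1.3">
  <RESOURCE type="results">
    <INFO name="standardID" value="ivo://ivoa.net/std/DataLink#links-1.1"/>
    <TABLE/>
  </RESOURCE>
</VOTABLE>"""

PREFIX = "ivo://ivoa.net/std/DataLink#links-"


def local_name(tag):
    """Strip the XML namespace from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def links_version(votable_xml):
    """Return the version suffix of the standardID INFO, or None if absent.

    Only INFO elements that are direct children of the results RESOURCE
    are considered, matching the placement shown in the document's example.
    """
    root = ET.fromstring(votable_xml)
    for res in root.iter():
        if local_name(res.tag) != "RESOURCE" or res.get("type") != "results":
            continue
        for child in res:
            if local_name(child.tag) == "INFO" and child.get("name") == "standardID":
                value = child.get("value", "")
                # The ivoid prefix is compared case-insensitively.
                if value.lower().startswith(PREFIX.lower()):
                    return value[len(PREFIX):]
    return None


print(links_version(SAMPLE))
```

A validator could use the returned suffix to choose which minor version to validate against, which is the benefit argued for above.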

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps *      
DAL *      
DM        
GWS        
Registry        
Semantics        
DCP *      
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
StdProc        




META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 202023-11-07 - GillesLandais

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, a couple of links responses for various ObsCore/SIA/SSA tables on the GAVO server:

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later display the additional service descriptor features and make use of the additional links table FIELDs such as content_qualifier, the authorization fields and local_semantics; the tool's behavior is adapted to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for a link depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.
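
The idea of using local_semantics to follow a user's previous selection can be sketched as below. This is not TOPCAT's actual algorithm, only a minimal illustration under assumptions: links rows are represented as plain dictionaries, the function name is hypothetical, and the fallback to matching semantics and content_type is a choice made for this sketch.

```python
def pick_matching_link(previous_choice, candidate_rows):
    """Pick the row in a new links response that best matches a prior choice.

    previous_choice and candidate_rows use links-table column names as keys.
    Returns the matching row, or None if nothing plausible is found.
    """
    wanted = previous_choice.get("local_semantics")
    if wanted:
        for row in candidate_rows:
            if row.get("local_semantics") == wanted:
                return row
    # Fallback (assumption of this sketch): same semantics and content_type.
    key = (previous_choice.get("semantics"), previous_choice.get("content_type"))
    for row in candidate_rows:
        if (row.get("semantics"), row.get("content_type")) == key:
            return row
    return None


# Hypothetical rows: both links share semantics and content_type,
# and only local_semantics tells them apart.
previous = {"semantics": "#this", "content_type": "image/fits",
            "local_semantics": "line-cube"}
new_rows = [
    {"ID": "ds2", "semantics": "#this", "content_type": "image/fits",
     "local_semantics": "continuum-cube"},
    {"ID": "ds2", "semantics": "#this", "content_type": "image/fits",
     "local_semantics": "line-cube"},
]
print(pick_matching_link(previous, new_rows))
```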

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links for which they are not authorized and whose requests would be rejected later.
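
A client-side decision of the kind described for the AdvancedSearch portal could look roughly like the following Python sketch. It is not CADC code; rows are assumed to be dictionaries keyed by links-table column names, and the choice to still offer the link when link_authorized is absent is an assumption of this sketch.

```python
def offer_download(row):
    """Decide whether to show a download option for one links row.

    link_authorized is an optional column, so it may be missing (None);
    in that case this sketch lets the user try the link.
    """
    authorized = row.get("link_authorized")
    if authorized is None:
        return True
    return bool(authorized)


# The two situations described for the CADC service above.
public_iris = {"link_auth": "optional", "link_authorized": True}
proprietary_cfht_anonymous = {"link_auth": "optional", "link_authorized": False}

print(offer_download(public_iris))
print(offer_download(proprietary_cfht_anonymous))
```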

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

Changed:
<
<
(1) The standard ID: I'm pretty sure we discussed that before, but I'm
>
>
(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.
Deleted:
<
<
really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.
 
Changed:
<
<
Yes, we have to do crazy stuff like that for the schema URIs due to the
>
>
Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.
Deleted:
<
<
way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.
 
Changed:
<
<
Does anyone remember why we went for links-1.0 here? If not, I'd
>
>
Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.
Deleted:
<
<
suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.
 
Changed:
<
<
(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns".
>
>
(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:
Deleted:
<
<
And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:
 
   <LINK whatever="blabla"/>"
Changed:
<
<
And the second paragraph I'd say doesn't belong here at all (it could go
>
>
And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).
Deleted:
<
<
to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).
  There are use cases behind this. When a DataLink links response is attached to table rows outside the context of ObsTAP/SIA2, how do we generate or recognize the DataLink URL?
Changed:
<
<
Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it conforms to the VOTable standard. The FIELD itself should not be described by a datalink UCD because it is probably generally an id.
>
>
Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it conforms to the VOTable standard. The FIELD itself should not be described by a datalink UCD because it is probably generally an id.
 
Changed:
<
<
The second paragraph refers to use cases where the URL is not built from the content of one FIELD but is ad hoc and should itself be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is, for example, suited to SIA1 or SSA responses. I think this has nothing to do with recursive DataLink.
>
>
The second paragraph refers to use cases where the URL is not built from the content of one FIELD but is ad hoc and should itself be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is, for example, suited to SIA1 or SSA responses. I think this has nothing to do with recursive DataLink.
  We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

Changed:
<
<
>
>
If an error occurs while processing an ID value, there \rfcshould\ be at least one row for that ID value and an error\_message
Deleted:
<
<
If an error occurs while processing an ID value, there \rfcshould\ be at least one row for that ID value and an error\_message
 
Changed:
<
<
The way the pyvo datalink client is written, we have to make that an
>
>
The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID of its list of ids to query if it gets at least one row back for it. Perhaps:
Deleted:
<
<
unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID of its list of ids to query if it gets at least one row back for it. Perhaps:
 
Changed:
<
<
>
>
A service MUST return at least one row for each ID passed in.
Deleted:
<
<
A service MUST return at least one row for each ID passed in.
 
Changed:
<
<
[ceterum censeo we should have let ID be single-valued; it would have
>
>
[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]
Deleted:
<
<
made everything soo much simpler and nothing really much harder/slower]
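
To illustrate why a missing row matters for a multi-ID client, here is a deliberately simplified Python sketch of the behaviour described in (3). It is not pyvo's actual code: the function names and the two fake services are hypothetical, and max_rounds is a guard added here only so that the sketch terminates.

```python
def fetch_all(ids, query_service, max_rounds=10):
    """Query links for all ids, re-requesting any id that got no row back.

    Returns (rows, still_pending). An id is only dropped from the pending
    list once at least one row with that ID has been returned.
    """
    pending = list(ids)
    rows = []
    rounds = 0
    while pending and rounds < max_rounds:
        batch = query_service(pending)
        rows.extend(batch)
        answered = {row["ID"] for row in batch}
        pending = [i for i in pending if i not in answered]
        rounds += 1
    return rows, pending


def compliant_service(ids):
    """Fake service: returns an error_message row for the failing ID."""
    return [{"ID": i, "error_message": "NotFoundFault" if i == "bad" else None}
            for i in ids]


def silent_service(ids):
    """Fake service: silently drops the failing ID."""
    return [{"ID": i, "error_message": None} for i in ids if i != "bad"]


print(fetch_all(["a", "bad"], compliant_service))
print(fetch_all(["a", "bad"], silent_service))
```

With the silent service, "bad" never leaves the pending list, so without the guard the client would keep asking for it; a row carrying an error_message lets the client stop.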
 
Changed:
<
<
(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this,
>
>
(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."
Deleted:
<
<
starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."
  "Dereferenceable" was used in the sense that the URL can be fully accessed over HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter, the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that, I agree with your rephrasing. -- FrancoisBonnarel - 2023-08-14

Changed:
<
<
This will also drop the "No other additional parameters or client
>
>
This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.
Deleted:
<
<
handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.
  In version 1.0 we could read:
Changed:
<
<
"The access_url column contains a URL to download a single resource. The URL
>
>
"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."
Deleted:
<
<
in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."
  This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14
Changed:
<
<
(5) In my editorial PR, I've dropped a paragraph on semantics for
>
>
(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.
Deleted:
<
<
error_message rows. This is now sufficiently addressed above that passage.
 
Changed:
<
<
(6) sect. 3.2.9 content_qualifier: I think we should at least name the
>
>
(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content\_type columns agree for both types of data." Or so.
Deleted:
<
<
motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content\_type columns agree for both types of data." Or so.
  OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14
Changed:
<
<
(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a
>
>
(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.
Deleted:
<
<
section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.
 
Changed:
<
<
Me, I've frankly never really understood where you want to go with this,
>
>
Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.
Deleted:
<
<
and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.
 
Changed:
<
<
When dropping adhoc:this, don't forget that it is referenced in sect
>
>
When dropping adhoc:this, don't forget that it is referenced in sect 4.1.
Deleted:
<
<
4.1.
 
Changed:
<
<
The self-description motivation could be explained earlier in the section. As for "adhoc:this", I remember Pat advocating for it. If we give the motivation earlier, we can restrict this section to a pure example. -- FrancoisBonnarel - 2023-08-14
>
>
The self-description motivation could be explained earlier in the section. As for "adhoc:this", I remember Pat advocating for it. If we give the motivation earlier, we can restrict this section to a pure example. -- FrancoisBonnarel - 2023-08-14
 
Changed:
<
<
(8) I have not looked at the DataLinkImp source that's also present in
>
>
(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.
Deleted:
<
<
the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.
  You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

No comment on the document, we appreciate the presence of examples that clarify the usage and implementation

--
DataLink is in use, and usage will increase for external web services such as simulated data and for output formats that are not IVOA formats (HAPI time series, OGC formats, ...).

Suggestions: perhaps update the DataLink page with examples of implementation, refer to that page in the document, and encourage working/interest groups to contribute examples as Markus did.

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is: is it just the tag, or the URI of the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point much. IMHO both forms would be acceptable. Simple terms are enough to associate the results, but local vocabulary URI + tag allows linking the term to definitions and relationships, so it is richer. Examples will be given in the text. -- FrancoisBonnarel - 2023-08-18
  1. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Changed:
<
<
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18
>
>
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18
  -- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Added:
>
>
Some details:
  • in section 2, I guess that the IVOA type of the link is not free text: is it possible to add a link to the authorized words?
  • in section 3, can you explain the parameter "nodeIDfile" in more detail?
  • section5 is not clear for me.
 

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done for consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (type="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this. See next PR. -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
 
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15
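
To make the Section 3.3.1 placement question above concrete, here is a minimal sketch of how a client or validator might look for the standardID INFO as a direct child of the results RESOURCE. The attribute name name="standardID", the sample VOTable and the "#links-1.1" value (the reviewer's proposal, not a settled value) are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical links response, reduced to the elements under discussion.
SAMPLE = """<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" version="1.3">
  <RESOURCE type="results">
    <INFO name="standardID" value="ivo://ivoa.net/std/DataLink#links-1.1"/>
    <TABLE/>
  </RESOURCE>
  <RESOURCE type="meta"/>
</VOTABLE>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def standard_ids_in_results(votable_xml):
    """Return standardID values found as direct children of results RESOURCEs."""
    root = ET.fromstring(votable_xml)
    found = []
    for element in root.iter():
        if local_name(element.tag) == "RESOURCE" and element.get("type") == "results":
            for child in element:
                if local_name(child.tag) == "INFO" and child.get("name") == "standardID":
                    found.append(child.get("value"))
    return found


print(standard_ids_in_results(SAMPLE))
```

If the INFO were allowed anywhere in the VOTable, the inner loop would have to become a search over the whole tree, which is why the location rule matters for validators.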

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps *      
DAL *      
DM        
GWS        
Registry        
Semantics        
Changed:
<
<
DCP        
>
>
DCP *      
 
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
StdProc        




META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 19 2023-09-11 - PierreLeSidaner

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, plus additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As examples, the GAVO server provides a couple of links responses for various ObsCore/SIA/SSA tables.

All of these are combined with various semantics or content_type values.

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics. The tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too (see prototype screenshot).

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links when they are not authorized and the request would be rejected later.
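
As an illustration of the kind of decision described for the AdvancedSearch portal, the following sketch maps link_auth and link_authorized to a display choice. It is not the CADC code: the function name and the three UI states are hypothetical, and the set of link_auth values ("false", "optional", "true") is assumed from the DataLink 1.1 draft.

```python
def download_option(link_auth, link_authorized):
    """Map the two DataLink 1.1 columns to a hypothetical UI state.

    link_auth: "false", "optional" or "true" (None if the column is absent)
    link_authorized: True, False or None (None if absent or null)
    """
    if link_authorized is True:
        # The caller, as currently identified, may use the link.
        return "show"
    if link_authorized is False and link_auth in ("optional", "true"):
        # Not authorized now, but authenticating might grant access.
        return "prompt-login"
    if link_authorized is False:
        return "hide"
    # No information at all: behave like a DataLink 1.0 client.
    return "show"


# The two CADC cases quoted above:
print(download_option("optional", True))   # public IRIS image
print(download_option("optional", False))  # proprietary CFHT data, anonymous caller
```

With this mapping the public IRIS link is shown directly, while the proprietary CFHT link seen by an anonymous caller leads to a login prompt instead of a download that would be rejected later.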

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document.

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.
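
A minimal sketch of what "disregard the minor version" could mean for a client comparing standardIDs. The parsing rule (key and version separated by the last hyphen in the fragment) and the case-insensitive comparison are assumptions made for this illustration, not text from the standard.

```python
def parse_standard_id(standard_id):
    """Split e.g. 'ivo://ivoa.net/std/DataLink#links-1.0' into its parts.

    Returns (base, key, major, minor); minor is None for a form like '#links-1'.
    """
    base, _, fragment = standard_id.partition("#")
    key, _, version = fragment.rpartition("-")
    major, _, minor = version.partition(".")
    return base.lower(), key.lower(), int(major), (int(minor) if minor else None)


def same_major_version(id_a, id_b):
    """True if both IDs name the same standard key and major version."""
    return parse_standard_id(id_a)[:3] == parse_standard_id(id_b)[:3]


print(same_major_version("ivo://ivoa.net/std/DataLink#links-1.0",
                         "ivo://ivoa.net/std/DataLink#links-1.1"))
```

A validator, by contrast, could read the fourth element of the parsed tuple to learn which minor version it is supposed to check against.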

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When a datalink links response is hooked to table rows outside the context of ObsTAP/SIA2, how do we generate/recognize the DataLink URL?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. It also conforms to the VOTable standard. The FIELD itself should not be described by a datalink UCD because it is probably generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD but is ad hoc and should itself be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is suited, for example, to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID of its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]
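
The retry problem described in (3) can be sketched as follows. This is not the pyvo implementation; the function names, the batch size and the two toy services are hypothetical and only illustrate a client that drops an ID from its pending list once at least one row comes back for it.

```python
def fetch_all_links(ids, query_service, batch_size=2, max_rounds=10):
    """Query IDs in batches until every ID has produced at least one row.

    query_service(batch) is assumed to return a list of (id, row) pairs.
    """
    pending = list(ids)
    rows = []
    rounds = 0
    while pending:
        if rounds == max_rounds:
            raise RuntimeError(f"no rows ever returned for: {pending}")
        rounds += 1
        result = query_service(pending[:batch_size])
        rows.extend(result)
        answered = {row_id for row_id, _ in result}
        # Only IDs that produced at least one row leave the pending list.
        pending = [i for i in pending if i not in answered]
    return rows


def compliant_service(batch):
    """Returns a row for every ID, using error_message for failures."""
    return [(i, {"error_message": "NotFoundFault"} if i == "bad"
             else {"access_url": "https://example.org/" + i}) for i in batch]


def silent_service(batch):
    """Drops failing IDs without any row, which a SHOULD permits."""
    return [(i, {"access_url": "https://example.org/" + i}) for i in batch if i != "bad"]


print(len(fetch_all_links(["a", "bad", "c"], compliant_service)))
```

Against silent_service the same call never clears "bad" from the pending list and only stops because of the max_rounds guard, which is the behavior the proposed MUST is meant to rule out.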

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest replacing the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"Dereferenceable" was used in the sense that the URL can be fully accessed over HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter, the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. -- FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14
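
The distinction discussed in (4) can be shown with a short sketch: the fragment is not sent to the server, so a client splits it off before the HTTP request and interprets it afterwards according to the media type. The URLs are hypothetical; the query string in the second one is kept as part of the request, in line with the objection above to forbidding query strings.

```python
from urllib.parse import urldefrag


def split_access_url(access_url):
    """Return (url_to_request, fragment) for an access_url value."""
    result = urldefrag(access_url)
    return result.url, result.fragment


# A fragment that might designate a FITS extension (interpretation depends on media type).
print(split_access_url("https://example.org/data/cube.fits#2"))
# A dynamic resource with a query string and a fragment.
print(split_access_url("https://example.org/get?id=42#spectrum"))
```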

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.

OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14
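
The use case in (6) could look like this on the client side. It is a sketch only: the rows are plain dictionaries, and the content_qualifier values "line-cube" and "continuum-cube" are hypothetical placeholders, not terms taken from any vocabulary.

```python
MATCH_KEYS = ("semantics", "content_type", "content_qualifier")


def pick_same_kind(rows, preferred):
    """Return the first row matching the user's earlier choice on all keys."""
    for row in rows:
        if all(row.get(key) == preferred.get(key) for key in MATCH_KEYS):
            return row
    return None


# What the user picked for the previous dataset.
previous_choice = {"semantics": "#this", "content_type": "application/fits",
                   "content_qualifier": "line-cube"}

# Links for the next dataset: semantics and content_type agree for both rows.
next_dataset_rows = [
    {"access_url": "https://example.org/ds2/continuum.fits", "semantics": "#this",
     "content_type": "application/fits", "content_qualifier": "continuum-cube"},
    {"access_url": "https://example.org/ds2/line.fits", "semantics": "#this",
     "content_type": "application/fits", "content_qualifier": "line-cube"},
]

print(pick_same_kind(next_dataset_rows, previous_choice)["access_url"])
```

Without content_qualifier in MATCH_KEYS the first row would match, and the user would be shown the continuum cube instead of the line cube.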

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The self-description motivation may be explained earlier in the section. For "adhoc:this" I remember Pat advocating for it. If we give the motivation earlier, then we can restrict this section to a pure example. -- FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Added:
>
>
No comment on the document; we appreciate the presence of examples that clarify usage and implementation.

DataLink is used, and its usage will increase, for external web services such as simulated data and output formats that are not IVOA formats (HAPI time series, OGC formats, ...).

Suggestions:
  • maybe update the DataLink page with examples of implementations
  • refer to the DataLink page in the document
  • encourage working/interest groups to contribute examples as Markus did

 

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than is possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is: is it just the tag, or the URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
I don't think the authors discussed this point much. IMHO both forms would be acceptable. Simple terms are enough to associate the results, but local vocabulary URI + tag allows linking the term to definitions and relationships, so it's richer. Examples will be given in the text. -- FrancoisBonnarel - 2023-08-18
 
  2. Product Type vocabulary: This directly affects the DM group; it would be used in the Dataset model, and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Yes, the product-type IVOA vocabulary is what should be used from now on in the Dataset DM and the next version of ObsCore, as well as for DataLink content_qualifier and maybe also in registry standards. -- FrancoisBonnarel - 2023-08-18

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
This is done for consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (type="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this. See next PR. -- FrancoisBonnarel - 2023-08-18
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15
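
For the MIME type mentioned in the last bullet above, a client that wants to recognize a links response from its Content-Type header could proceed roughly as follows. This is a sketch using the Python standard library; the function name is hypothetical, and whether clients should rely on the content parameter at all is exactly the open question noted in the bullet.

```python
from email.message import Message


def is_datalink_votable(content_type_header):
    """True if the header is a VOTable type carrying content=datalink."""
    message = Message()
    message["Content-Type"] = content_type_header
    return (message.get_content_type() == "application/x-votable+xml"
            and message.get_param("content") == "datalink")


print(is_datalink_votable("application/x-votable+xml;content=datalink"))
print(is_datalink_votable("application/x-votable+xml"))
```

Parsing the header instead of comparing whole strings means that optional whitespace or quoting around the parameter value does not affect the result.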

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Changed:
<
<
Apps        
>
>
Apps *      
 
DAL *      
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
StdProc        




META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 18 2023-08-18 - FrancoisBonnarel

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, plus additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As examples, the GAVO server provides a couple of links responses for various ObsCore/SIA/SSA tables.

All of these are combined with various semantics or content_type values.

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics. The tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too (see prototype screenshot).

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links when they are not authorized and the request would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document.

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When a datalink links response is hooked to table rows outside the context of ObsTAP/SIA2, how do we generate/recognize the DataLink URL?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. It also conforms to the VOTable standard. The FIELD itself should not be described by a datalink UCD because it is probably generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD but is ad hoc and should itself be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is suited, for example, to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

-- FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID of its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest replacing the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"Dereferenceable" was used in the sense that the URL can be fully accessed over HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter, the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. -- FrancoisBonnarel - 2023-08-14

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

In version 1.0 we could read:

"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."

This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. -- FrancoisBonnarel - 2023-08-14

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.

Changed:
<
<
OK for this change. I will adopt it in the next PR. FrancoisBonnarel - 2023-08-14
>
>
OK for this change. I will adopt it in the next PR. -- FrancoisBonnarel - 2023-08-14
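As an illustration of the use case in (6), the following sketch shows how a client could keep presenting the same kind of link when semantics and content_type agree for several rows. Rows are modelled as plain dictionaries, and the qualifier values "line" and "continuum" are made-up labels, not terms taken from any vocabulary.

```python
def pick_link(rows, semantics, preferred_qualifier):
    """Pick the row matching `semantics`, preferring a given content_qualifier.

    Falls back to the first row with matching semantics when no row
    carries the preferred qualifier; returns None when nothing matches.
    """
    candidates = [row for row in rows if row.get("semantics") == semantics]
    for row in candidates:
        if row.get("content_qualifier") == preferred_qualifier:
            return row
    return candidates[0] if candidates else None


# Hypothetical links rows: same semantics and content_type, different qualifier.
rows = [
    {"semantics": "#this", "content_type": "application/fits",
     "content_qualifier": "continuum", "access_url": "https://example.org/c.fits"},
    {"semantics": "#this", "content_type": "application/fits",
     "content_qualifier": "line", "access_url": "https://example.org/l.fits"},
]
chosen = pick_link(rows, "#this", "line")
```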
  (7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

The autodescription motivation may be explained earlier in the section. For "adhoc:this" I remember Pat advocating for this. If we motivate earlier then we can restrict to a pure example here. FrancoisBonnarel - 2023-08-14

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

Added:
>
>
You are right. The note repo will be created in github.com/ivoa. -- FrancoisBonnarel - 2023-08-18
 I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore 's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHs ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
Added:
>
>
I don't think the authors discussed this point much. IMHO both forms would be acceptable. Simple terms are enough to associate the results, but a local vocabulary URI + tag makes it possible to link the term to definitions and relationships, so it's richer. Examples will be given in the text. FrancoisBonnarel - 2023-08-18
 
  1. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
Added:
>
>
Yes, the product-type ivoa vocabulary is what should be used from now onwards in dataset DM, next version of ObsCore as well as DataLink content_qualifier or maybe also registry standards. -- FrancoisBonnarel - 2023-08-18
  -- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
Added:
>
>
This is done for consistency with DALI. DALI seems to insist that INFO elements should be in the primary RESOURCE (name="results"), and that other RESOURCEs may be in the VOTable. We may be more explicit on this. See the next PR. -- FrancoisBonnarel - 2023-08-18
 
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
Added:
>
>
See my answer to Markus above. And next PR. -- FrancoisBonnarel - 2023-08-18
 
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15
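For the question about where the standardID INFO lives, a validator-style check could look like the sketch below. It only accepts an INFO that is a direct child of the RESOURCE with type="results", which is one possible reading of the requirement. The INFO attribute name "standardID", the sample document and the "...#links-1.1" value are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix of the form '{uri}name'."""
    return tag.rsplit("}", 1)[-1]


def find_standard_id(votable_xml):
    """Return the standardID INFO value from the results RESOURCE, or None."""
    root = ET.fromstring(votable_xml)
    for element in root.iter():
        if local_name(element.tag) != "RESOURCE" or element.get("type") != "results":
            continue
        # Only direct children of the results RESOURCE are considered.
        for child in element:
            if local_name(child.tag) == "INFO" and child.get("name") == "standardID":
                return child.get("value")
    return None


# Hypothetical, heavily trimmed links response.
SAMPLE = """
<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" version="1.3">
  <RESOURCE type="results">
    <INFO name="standardID" value="ivo://ivoa.net/std/DataLink#links-1.1"/>
    <TABLE/>
  </RESOURCE>
</VOTABLE>
"""
standard_id = find_standard_id(SAMPLE)
```

If the standard ends up allowing the INFO elsewhere in the VOTable, the inner loop would need to search more widely.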

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
DAL *      
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
StdProc        




META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 17 2023-08-14 - FrancoisBonnarel

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, here are a few links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see link_authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool behavior is adapted to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links when they are not authorized and the request would be rejected later.
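The portal behaviour described above can be sketched roughly as follows. This is not the CADC code; the function name, the returned labels and the treatment of a missing link_authorized value are assumptions, and rows are modelled as dictionaries with link_authorized already parsed into a Python bool (or None when the optional field is absent or null).

```python
def download_action(row):
    """Decide what a client UI might offer for one links row.

    Returns "download" when the caller is known (or not known not) to be
    authorized, and "login" when the service reported link_authorized=false.
    """
    authorized = row.get("link_authorized")
    if authorized is False:
        # The request would be rejected later, so do not offer the link as-is.
        return "login"
    # True, or unknown (field absent/null): assumed here to be worth trying.
    return "download"


# The two situations from the CADC examples above.
public_row = {"link_auth": "optional", "link_authorized": True}
proprietary_row = {"link_auth": "optional", "link_authorized": False}
actions = [download_action(public_row), download_action(proprietary_row)]
```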

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.
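A possible shape for the "disregard the minor version" rule is sketched below: the standardID is split into its registry part, the key, and major/minor version, and two IDs are considered equivalent for normal operations when everything but the minor version agrees. The regular expression and function names are illustrative only, and the case-insensitive comparison is an assumption about how ivoids should be compared.

```python
import re

# ivo://authority/path#key-MAJOR[.MINOR]; the minor version is optional.
_STANDARD_ID = re.compile(
    r"^(?P<base>ivo://[^#]+)#(?P<key>[A-Za-z_]+)-(?P<major>\d+)(?:\.(?P<minor>\d+))?$"
)


def parse_standard_id(standard_id):
    """Return (base, key, major, minor); minor is None when absent."""
    match = _STANDARD_ID.match(standard_id.strip())
    if match is None:
        raise ValueError(f"not a versioned standardID: {standard_id!r}")
    minor = match.group("minor")
    return (
        match.group("base").lower(),
        match.group("key").lower(),
        int(match.group("major")),
        None if minor is None else int(minor),
    )


def same_major_version(id_a, id_b):
    """Compare two standardIDs, ignoring the minor version."""
    return parse_standard_id(id_a)[:3] == parse_standard_id(id_b)[:3]
```

With this, a client treats links-1.0, links-1.1 and links-1 alike, while a validator can still read the minor version from the fourth tuple element when it is present.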

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When a datalink links response is hooked to table rows outside the context of ObsTAP/SIA2, how do we generate/recognize the DataLink URL?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would be doing exactly the same as the LINK element proposed here, which is included in the appropriate FIELD and is much less verbose. And it's perfectly valid according to the VOTable standard. The FIELD itself should not be described by a datalink ucd because it's probably generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD, and where the URL is ad hoc and should be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is for example suited to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there \rfcshould\ be at least one row for that ID value and an error\_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID of its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]
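To show why a row per ID matters, here is a simplified model of a multi-ID client loop of the kind described in (3). It is not the pyvo implementation; `query` is a hypothetical callable that takes a list of IDs and returns a list of row dictionaries with an "ID" key, possibly truncated by an overflow.

```python
def fetch_all_links(ids, query, max_rounds=10):
    """Query a links service repeatedly until every ID has been answered.

    An ID is dropped from the pending list only once at least one row
    (a link or an error_message row) has come back for it.
    Returns (rows, still_pending).
    """
    pending = list(ids)
    rows = []
    for _ in range(max_rounds):
        if not pending:
            break
        batch = query(pending)
        rows.extend(batch)
        answered = {row["ID"] for row in batch}
        pending = [id_ for id_ in pending if id_ not in answered]
    return rows, pending
```

An ID for which the service silently returns nothing stays in `still_pending` and is requested again in every round until `max_rounds` is exhausted, which is the behaviour an unconditional MUST would rule out.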

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

"dereferenceable" was used in the sense that the URL can be fully accessed over HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter, the client is expected to interpret the fragment itself. See: https://en.wikipedia.org/wiki/URI_fragment

Changed:
<
<
apart from that I agree with your rephrasing. FrancoisBonnarel - 2023-08-14"
>
>
Apart from that I agree with your rephrasing. FrancoisBonnarel - 2023-08-14"
 
Deleted:
<
<
 This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.
Added:
>
>
In version 1.0 we could read:
 "The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."
Added:
>
>
This statement was not consistent with the allowance of fragments, hence the new statement. I can rephrase it in the upcoming PR. FrancoisBonnarel - 2023-08-14
 (5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content\_type columns agree for both types of data." Or so.

Added:
>
>
OK for this change. I will adopt it in the next PR. FrancoisBonnarel - 2023-08-14
 (7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

Added:
>
>
The autodescription motivation may be explained earlier in the section. For "adhoc:this" I remember Pat advocating for this. If we motivate earlier then we can restrict to a pure example here. FrancoisBonnarel - 2023-08-14
  (8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore 's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHs ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
  2. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
DAL *      
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
StdProc        




META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 16 2023-08-14 - FrancoisBonnarel

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, here are a few links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see link_authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool behavior is adapted to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links when they are not authorized and the request would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

There are use cases behind this. When a DataLink links response is attached to table rows outside the context of ObsTAP/SIA2, how do we generate/recognize the DataLink URL?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would do exactly the same as the LINK element proposed here, included in the appropriate FIELD, which is much less verbose. It is also correct according to the VOTable standard. The FIELD itself should not be described by a datalink ucd because it's probably generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD and where the URL is ad hoc and should be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is, for example, suited to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

FrancoisBonnarel - 2023-08-12

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID from its list of IDs to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]
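
The following sketch illustrates the client behavior described in (3). It is not pyvo's code: resolve_all, the two stand-in services and the example identifiers are hypothetical. The client drops an ID from its pending list only once some row comes back for it, so a service that silently omits a failing ID would be queried forever unless the client adds a guard like the one shown.

```python
def resolve_all(ids, query_links):
    """Query a multi-ID links service until every ID has been answered.

    query_links(batch) returns a list of row dicts, each with an 'ID' key.
    """
    pending = list(ids)
    rows = []
    while pending:
        batch_rows = query_links(pending)
        answered = {row["ID"] for row in batch_rows}
        if not answered & set(pending):
            # No progress: without this guard the loop would never end.
            raise RuntimeError(f"no rows returned for {pending}")
        rows.extend(batch_rows)
        pending = [i for i in pending if i not in answered]
    return rows


def compliant_service(batch):
    """Stand-in service: returns an error_message row for the failing ID."""
    out = []
    for i in batch:
        if i == "ivo://example/bad":
            out.append({"ID": i, "error_message": "NotFoundFault: " + i})
        else:
            out.append({"ID": i, "access_url": "https://example.org/data"})
    return out


def silent_service(batch):
    """Stand-in service: drops failing IDs without any row."""
    return [row for row in compliant_service(batch) if "error_message" not in row]


ids = ["ivo://example/good", "ivo://example/bad"]
```

With compliant_service the loop ends after one request; with silent_service it ends only because of the guard.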

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

Added:
>
>
"Dereferenceable" was used in the sense that the URL can be fully accessed over HTTP, which is not the case for URNs in general or for URLs with fragments. For the latter, the client is supposed to interpret the fragment. See: https://en.wikipedia.org/wiki/URI_fragment

Apart from that I agree with your rephrasing. FrancoisBonnarel - 2023-08-14

 This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.
Added:
>
>
"The access_url column contains a URL to download a single resource. The URL in this column must be usable as-is with no additional parameters or client handling; it can be a link to a dynamic resource (e.g. preview generation)."
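
As a small illustration of the fragment discussion in (4), the sketch below separates the part of an access URL that is sent to the server from the fragment that the client must interpret itself. The URL is made up; treating the fragment "2" as a FITS extension would depend on the media type and is not something the code decides.

```python
from urllib.parse import urldefrag, urlsplit


def split_access_url(access_url):
    """Return (fetch_url, fragment) for an access_url value.

    The fragment is never sent to the server; the query string is kept
    untouched, so the URL remains usable as-is.
    """
    fetch_url, fragment = urldefrag(access_url)
    return fetch_url, fragment


# Hypothetical access URL with both a query string and a fragment.
fetch_url, fragment = split_access_url("https://example.org/cutout?ID=abc&BAND=1#2")
query = urlsplit(fetch_url).query
```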
  (5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.
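
The use case in (6) could look roughly like this on the client side. This is a sketch under assumptions: the qualifier values "line-cube" and "continuum-cube", the row layout and the function name pick_link are invented for illustration and are not taken from any vocabulary.

```python
def pick_link(rows, preferred_qualifier):
    """Choose the '#this' link matching the user's preferred qualifier.

    Falls back to the first '#this' link when no qualifier matches,
    and returns None when there is no '#this' link at all.
    """
    candidates = [row for row in rows if row["semantics"] == "#this"]
    for row in candidates:
        if row.get("content_qualifier") == preferred_qualifier:
            return row
    return candidates[0] if candidates else None


# Two links that agree in semantics and content_type but differ in qualifier.
rows = [
    {"access_url": "https://example.org/cont.fits", "semantics": "#this",
     "content_type": "application/fits", "content_qualifier": "continuum-cube"},
    {"access_url": "https://example.org/line.fits", "semantics": "#this",
     "content_type": "application/fits", "content_qualifier": "line-cube"},
]
chosen = pick_link(rows, "line-cube")
```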

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
  2. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15
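
The location question raised for Section 3.3.1 can be made concrete with a short sketch that reports every standardID INFO in a VOTable and whether it is a direct child of the results RESOURCE. The INFO attribute layout (name="standardID" with the identifier in value), the sample document and the "#links-1.1" value proposed above are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal links response.
SAMPLE = """<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" version="1.3">
  <RESOURCE type="results">
    <INFO name="standardID" value="ivo://ivoa.net/std/DataLink#links-1.1"/>
    <TABLE/>
  </RESOURCE>
</VOTABLE>"""


def local_name(tag):
    """Strip the XML namespace from an element tag."""
    return tag.rsplit("}", 1)[-1]


def find_standard_ids(votable_xml):
    """Return (value, is_child_of_results_resource) for each standardID INFO."""
    found = []

    def walk(elem, in_results):
        for child in elem:
            name = local_name(child.tag)
            if name == "INFO" and child.get("name") == "standardID":
                found.append((child.get("value"), in_results))
            is_results = name == "RESOURCE" and child.get("type") == "results"
            walk(child, is_results)

    walk(ET.fromstring(votable_xml), False)
    return found


ids_found = find_standard_ids(SAMPLE)
```

A validator could use the second element of each pair to warn when the INFO sits elsewhere in the document, once the standard settles where it has to go.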

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

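| Group | Yes | No | Abstain | Comments |
| TCG | | | | |
| Apps | | | | |
| DAL | * | | | |
| DM | | | | |
| GWS | | | | |
| Registry | | | | |
| Semantics | | | | |
| DCP | | | | |
| Edu | | | | |
| KDIG | | | | |
| Ops | | | | |
| Radio | | | | |
| SSIG | | | | |
| Theory | | | | |
| TD | | | | |
| StdProc | | | | |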




META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 15 - 2023-08-12 - FrancoisBonnarel

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

For example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server:

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal calls the above caom2ops/datalink service to find previews and download info for each row (publisherID). It uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links that they are not authorized to use and whose requests would be rejected later.

Implementation Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

Changed:
<
<
(2) I am entriely unhappy with section 3.1.1, starting with its title,
>
>
(2) I am entirely unhappy with section 3.1.1, starting with its title,
 which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results

Changed:
<
<
if we think we need to be explicit about this).
>
>
if we think we need to be explicit about this).
 
Added:
>
>
There are use cases behind this. When a DataLink links response is attached to table rows outside the context of ObsTAP/SIA2, how do we generate/recognize the DataLink URL?

Of course we can use the Service Descriptor with the single ID parameter if the DataLink can be parametrized by an "id" from one of the columns. But in that case the descriptor would do exactly the same as the LINK element proposed here, included in the appropriate FIELD, which is much less verbose. It is also correct according to the VOTable standard. The FIELD itself should not be described by a datalink ucd because it's probably generally an id.

The second paragraph refers to use cases where the URL is not built from the content of one FIELD and where the URL is ad hoc and should be the content of a FIELD. Using the same utypes as those used in ObsCore responses seems reasonable. This is, for example, suited to SIA1 or SSA responses. I think this has nothing to do with recursive datalink.

We may try to rephrase all this if this is unclear, but the intent has to be kept.

FrancoisBonnarel - 2023-08-12

 (3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message
Added:
>
>
 

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID from its list of IDs to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
  2. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

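| Group | Yes | No | Abstain | Comments |
| TCG | | | | |
| Apps | | | | |
| DAL | * | | | |
| DM | | | | |
| GWS | | | | |
| Registry | | | | |
| Semantics | | | | |
| DCP | | | | |
| Edu | | | | |
| KDIG | | | | |
| Ops | | | | |
| Radio | | | | |
| SSIG | | | | |
| Theory | | | | |
| TD | | | | |
| StdProc | | | | |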




META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 14 - 2023-07-28 - MarkCresitelloDittmar

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):
Deleted:
<
<
 

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

For example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server:

All these are combined with various semantics or content_type values

-- MarkusDemleitner - 2023-07-10


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal calls the above caom2ops/datalink service to find previews and download info for each row (publisherID). It uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links that they are not authorized to use and whose requests would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

Added:
>
>
 
   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID from its list of IDs to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]
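
The client behaviour described in (3) can be illustrated with a small Python sketch. It is not the pyvo implementation: resolve_all, the two fake services, the example identifiers and the max_rounds guard are hypothetical names introduced here to show why a missing row for a failing ID keeps that ID in the request list.

```python
def resolve_all(ids, query, max_rounds=5):
    """Query until every ID has produced at least one row, or give up."""
    pending = list(ids)
    rows = []
    for _ in range(max_rounds):
        if not pending:
            break
        batch = query(pending)
        rows.extend(batch)
        # An ID is only dropped once the service has returned a row for it.
        answered = {row["ID"] for row in batch}
        pending = [i for i in pending if i not in answered]
    return rows, pending


def compliant_service(ids):
    """Fake service: returns an error_message row for the failing ID."""
    out = []
    for i in ids:
        if i == "ivo://example/bad":
            out.append({"ID": i, "error_message": "NotFoundFault: " + i})
        else:
            out.append({"ID": i, "access_url": "https://example.org/get?id=" + i})
    return out


def silent_service(ids):
    """Fake service: silently omits rows for the failing ID."""
    return [row for row in compliant_service(ids) if "error_message" not in row]


ids = ["ivo://example/good", "ivo://example/bad"]
_, left_ok = resolve_all(ids, compliant_service)
_, left_silent = resolve_all(ids, silent_service)
print(left_ok)      # nothing left to ask for
print(left_silent)  # the failing ID is still pending after every round
```

Without the max_rounds guard, the second call would never terminate, which is the situation the proposed MUST is meant to rule out.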

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.
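
As an illustration of the distinction drawn in (4), the following Python sketch separates the fragment from the rest of an access URL while leaving a query string untouched. The URL is made up for the example; only the standard library's urllib.parse is used.

```python
from urllib.parse import urldefrag, urlsplit

# Hypothetical access URL with both a query string and a fragment.
access_url = "https://example.org/data/cube.fits?band=K#SCI"

# The fragment is never sent to the server; the client interprets it
# according to the media type of the retrieved document.
base, fragment = urldefrag(access_url)
query = urlsplit(base).query

print(base)      # URL to dereference, query string included
print(fragment)  # e.g. an extension name within a FITS file
print(query)
```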

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.
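
The use case in (6) could be sketched in Python as below. This is a hypothetical client heuristic, not taken from TOPCAT or any other implementation: pick_similar, the row dicts and the column values "line" and "continuum" are placeholders invented for the example, and the sketch simply compares whichever of the descriptive columns are present.

```python
KEYS = ("semantics", "content_type", "content_qualifier", "local_semantics")


def pick_similar(rows, previous):
    """Return the row that agrees with the previously selected link on most columns."""
    def score(row):
        return sum(
            1 for k in KEYS
            if row.get(k) is not None and row.get(k) == previous.get(k)
        )
    return max(rows, key=score) if rows else None


# What the user selected for the previous dataset (placeholder values).
previous = {"semantics": "#this", "content_type": "application/fits",
            "local_semantics": "line"}

# Links rows for the next dataset: semantics and content_type agree for both.
rows = [
    {"access_url": "https://example.org/ds2/continuum.fits", "semantics": "#this",
     "content_type": "application/fits", "local_semantics": "continuum"},
    {"access_url": "https://example.org/ds2/line.fits", "semantics": "#this",
     "content_type": "application/fits", "local_semantics": "line"},
]

choice = pick_similar(rows, previous)
print(choice["access_url"])
```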

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
  2. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?
Added:
>
>
    1. 2023-07-28: The referenced vocabulary now resolves to a version dated 2023-06-26 (though the event-list discussion was just going on this week). The elements and definitions in this list appear compatible with DM group usage in the ObsCore and Dataset models. -- MarkCresitelloDittmar - 2023-07-28
  -- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.
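
The standardID point raised here and in community comment (1) can be made concrete with a short Python sketch of how a client or validator might read the version out of the identifier. It is an assumption-laden illustration: links_version and same_major are hypothetical helpers, the regular expression covers only the three spellings discussed on this page (links-1.0, links-1.1, links-1), and the case-insensitive match is an assumption about how ivoids are compared.

```python
import re

# Matches the {links} standardID with an optional minor version.
_LINKS_ID = re.compile(
    r"^ivo://ivoa\.net/std/datalink#links-(\d+)(?:\.(\d+))?$", re.IGNORECASE
)


def links_version(standard_id):
    """Return (major, minor) for a links standardID; minor is None if absent."""
    m = _LINKS_ID.match(standard_id.strip())
    if m is None:
        return None
    minor = int(m.group(2)) if m.group(2) is not None else None
    return int(m.group(1)), minor


def same_major(a, b):
    """True if both identifiers are links standardIDs with the same major version."""
    va, vb = links_version(a), links_version(b)
    return va is not None and vb is not None and va[0] == vb[0]


v10 = links_version("ivo://ivoa.net/std/DataLink#links-1.0")
v11 = links_version("ivo://ivoa.net/std/DataLink#links-1.1")
v1 = links_version("ivo://ivoa.net/std/DataLink#links-1")
print(v10, v11, v1)
```

A client that disregards the minor version for normal operations would use something like same_major, while a validator could use the minor component, when present, to choose which version to validate against.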

-- MarkTaylor - 2023-05-15

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

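| Group | Yes | No | Abstain | Comments |
| TCG |  |  |  |  |
| Apps |  |  |  |  |
| DAL | * |  |  |  |
| DM |  |  |  |  |
| GWS |  |  |  |  |
| Registry |  |  |  |  |
| Semantics |  |  |  |  |
| DCP |  |  |  |  |
| Edu |  |  |  |  |
| KDIG |  |  |  |  |
| Ops |  |  |  |  |
| Radio |  |  |  |  |
| SSIG |  |  |  |  |
| Theory |  |  |  |  |
| TD |  |  |  |  |
| <nop>StdProc |  |  |  |  |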



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 13 2023-07-28 - MarkusDemleitner

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):
Deleted:
<
<
-- MarkusDemleitner - 2023-07-10
 

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values

Added:
>
>
-- MarkusDemleitner - 2023-07-10
 

CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

New CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call, they will see link_authorized=true. That case is hard to demonstrate for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, the authorization fields (link_auth, link_authorized) and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links for which they are not authorized and whose requests would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

Changed:
<
<
>
>
   <LINK whatever="blabla"/>"

Deleted:
<
<
   <LINK whatever="blabla"/>"

 

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID from its list of IDs to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
  2. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

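| Group | Yes | No | Abstain | Comments |
| TCG |  |  |  |  |
| Apps |  |  |  |  |
| DAL | * |  |  |  |
| DM |  |  |  |  |
| GWS |  |  |  |  |
| Registry |  |  |  |  |
| Semantics |  |  |  |  |
| DCP |  |  |  |  |
| Edu |  |  |  |  |
| KDIG |  |  |  |  |
| Ops |  |  |  |  |
| Radio |  |  |  |  |
| SSIG |  |  |  |  |
| Theory |  |  |  |  |
| TD |  |  |  |  |
| <nop>StdProc |  |  |  |  |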



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 12 2023-07-10 - MarkusDemleitner

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):
Added:
>
>
-- MarkusDemleitner - 2023-07-10
 

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

New CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call, they will see link_authorized=true. That case is hard to demonstrate for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, the authorization fields (link_auth, link_authorized) and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links for which they are not authorized and whose requests would be rejected later.

Added:
>
>
 

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

Added:
>
>

Community Comment by Markus Demleitner

(1) The standard ID: I'm pretty sure we discussed that before, but I'm really unsure how we came to the conclusion that even Datalink 1.1 still has the ivoid of ivo://ivoa.net/std/DataLink#links-1.0.

Yes, we have to do crazy stuff like that for the schema URIs due to the way XML element names are compared. But there is in general no analogous need with ivoids, because we control the rules how to compare them in what situations.

Does anyone remember why we went for links-1.0 here? If not, I'd suggest links-1. I volunteer for adding a brief explanation about how clients should disregard the minor version for normal operations.

(2) I am entirely unhappy with section 3.1.1, starting with its title, which probably should be something like "Datalinks in VOTable columns". And then the first paragraph should probably say something more concrete like perhaps "Columns containing datalinks SHOULD be marked with a UCD of X.Y.Z and a LINK-typed child in its FIELD like this:

   <LINK whatever="blabla"/>"

And the second paragraph I'd say doesn't belong here at all (it could go to, perhaps, 1.2.7 or a use case discussing datalinks as primary results if we think we need to be explicit about this).

(3) In 3.2, it says:

If an error occurs while processing an ID value, there SHOULD be at least one row for that ID value and an error_message

The way the pyvo datalink client is written, we have to make that an unconditional MUST, or pyvo will keep requesting any failing ID (and frankly I'm unsure how else to implement this given multi-ID and overflows): it will only remove an ID of its list of ids to query if it gets at least one row back for it. Perhaps:

A service MUST return at least one row for each ID passed in.

[ceterum censeo we should have let ID be single-valued; it would have made everything soo much simpler and nothing really much harder/slower]
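
To make the problem in (3) concrete, here is a minimal sketch of the kind of multi-ID loop described above. It is not the pyvo code; `fetch_all_links`, the two stand-in services and the `max_rounds` safety limit are hypothetical, and rows are plain dictionaries.

```python
def fetch_all_links(ids, query_service, max_rounds=10):
    """Query until every ID has been answered, or give up after max_rounds."""
    pending = list(ids)
    collected = []
    rounds = 0
    while pending and rounds < max_rounds:
        rounds += 1
        rows = query_service(pending)
        collected.extend(rows)
        # An ID only leaves the pending list once at least one row came back for it
        answered = {row["ID"] for row in rows}
        pending = [i for i in pending if i not in answered]
    return collected, pending, rounds


def compliant_service(ids):
    # Returns an error row for the ID it cannot resolve
    return [
        {"ID": i, "error_message": "NotFoundFault: " + i} if i == "bad"
        else {"ID": i, "access_url": "https://example.org/" + i}
        for i in ids
    ]


def silent_service(ids):
    # Silently drops the ID it cannot resolve
    return [{"ID": i, "access_url": "https://example.org/" + i} for i in ids if i != "bad"]


_, pending_ok, rounds_ok = fetch_all_links(["a", "bad"], compliant_service)
_, pending_silent, rounds_silent = fetch_all_links(["a", "bad"], silent_service)
print(pending_ok, rounds_ok)
print(pending_silent, rounds_silent)
```

With the compliant service the loop ends after one round. With the silent one, the failing ID stays pending until the artificial round limit stops the loop, which is the repeated-request behavior the MUST is meant to rule out.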

(4) 3.2.2, second paragraph: I had to puzzle quite a bit about this, starting with wondering what a "dereferenceable URL" might be. I'd suggest to replace the entire paragraph with "Access URLs may have fragment parts, which could, for instance, refer to id-ed elements within XML documents or extensions within FITS files. As in URIs in general, the interpretation of a fragment identifier depends on the media type."

This will also drop the "No other additional parameters or client handling are allowed." -- if this forbids query strings on access URIs, I'd strongly disagree. If this means something else, we'd have to write that something else.
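
For what it's worth, a query string and a fragment coexist without trouble in ordinary URL handling, as this small sketch with Python's standard library shows. The URL is made up, and reading the fragment as a FITS extension name is only one possible media-type-dependent interpretation.

```python
from urllib.parse import parse_qs, urldefrag, urlsplit

# Hypothetical access URL carrying both a query string and a fragment
access_url = "https://example.org/getcube?band=K#SCI"

# The fragment is never sent to the server; the client strips it before the request
request_url, fragment = urldefrag(access_url)

# The query string stays part of the URL that is actually requested
query = parse_qs(urlsplit(request_url).query)

print(request_url)
print(fragment)
print(query)
```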

(5) In my editorial PR, I've dropped a paragraph on semantics for error_message rows. This is now sufficiently addressed above that passage.

(6) sect. 3.2.9 content_qualifier: I think we should at least name the motivating use case a bit more precisely here, as in, perhaps: "It aids clients in presenting to the user the same sort of link as they go from one dataset to another within a service. For instance, suppose a service serves both continuum and line cubes. Using content_qualifier, users can configure their clients such that, as they change to a new data set, they always see the line cube even when the semantics and content_type columns agree for both types of data." Or so.

(7) Sect. 4.8: Sorry, you cannot introduce a utype ("adhoc:this") in a section called "Example: X". If you are really, really sure these "self-describing" things are useful, put them into a section of their own.

Me, I've frankly never really understood where you want to go with this, and I think there's no implementation doing any of this, so perhaps we should drop the whole thing. But if we don't drop it and somehow nonchalantly mention it in an example, at least don't introduce a new utype here. What's wrong with the name="this" you had before? You see, having two different mechanisms for what to my knowledge hasn't been implemented even once seems a bit excessive.

When dropping adhoc:this, don't forget that it is referenced in sect 4.1.

(8) I have not looked at the DataLinkImp source that's also present in the repo. If you think this ought to become a document, please extract it to a different repo; ivoatex is not designed to support two documents in one repo.

I've also collected a few rather editorial changes in https://github.com/ivoa-std/DataLink/pull/108

 

Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Some minor edits only, otherwise this update looks sound.

  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
  2. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.
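
The two standardID points above could be checked by a validator along these lines. This is only a sketch under stated assumptions: it assumes the INFO carries name="standardID", it accepts the INFO anywhere in the document because the required location is exactly what is unclear, and it uses the "#links-1.1" key proposed above rather than the "#links-1.0" key the document currently defines. The helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal links response, using the proposed #links-1.1 key
VOTABLE = """<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" version="1.4">
  <RESOURCE type="results">
    <INFO name="standardID" value="ivo://ivoa.net/std/DataLink#links-1.1"/>
    <TABLE/>
  </RESOURCE>
</VOTABLE>"""


def find_standard_ids(votable_xml):
    """Collect the values of all standardID INFO elements, wherever they sit."""
    root = ET.fromstring(votable_xml)
    found = []
    for elem in root.iter():
        # Compare local names so that any VOTable namespace version is accepted
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "INFO" and elem.get("name") == "standardID":
            found.append(elem.get("value"))
    return found


def links_version(standard_id):
    """Return (major, minor) from a DataLink links standardID, or None if it is not one."""
    # ivoids are compared case-insensitively here (assumption)
    prefix = "ivo://ivoa.net/std/datalink#links-"
    lowered = standard_id.lower()
    if not lowered.startswith(prefix):
        return None
    major, _, minor = lowered[len(prefix):].partition(".")
    return int(major), (int(minor) if minor else None)


standard_ids = find_standard_ids(VOTABLE)
print(standard_ids)
print([links_version(s) for s in standard_ids])
```

With the minor version in the key, a validator can pick the rule set to apply from the response itself instead of relying on a command-line parameter.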

-- MarkTaylor - 2023-05-15

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
DAL *      
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
<nop>StdProc        



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 112023-07-03 - JamesDempsey

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server:

All these are combined with various semantics or content_type values


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see link_authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of additional links table FIELDs such as content_qualifier, authorization and local_semantics. The tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal calls the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links for which they are not authorized and whose requests would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Added:
>
>
Some minor edits only, otherwise this update looks sound.
  1. Section references would be useful in the v1.1 changes list - PR#105 raised and merged
  2. Some minor grammar updates - PR #107 raised

-- JamesDempsey - 2023-07-03

 

Data Model Working Group

I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
  2. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?

-- MarkCresitelloDittmar - 2023-06-21

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
Changed:
<
<
DAL        
>
>
DAL *      
 
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
<nop>StdProc        



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 102023-06-21 - MarkCresitelloDittmar

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server:

All these are combined with various semantics or content_type values


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see link_authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of additional links table FIELDs such as content_qualifier, authorization and local_semantics. The tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal calls the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links for which they are not authorized and whose requests would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Data Model Working Group

Added:
>
>
I have a rather rudimentary understanding of DataLink, VOSI and DALI, so there are some details that I'm glossing over in my read.

I don't see any real issues/conflicts with the DM group work. However, I have 2 points/questions to raise:

  1. local_semantics: This is an identifier from a local vocabulary to help identify/select rows at a finer level than possible with just the other tags (semantics, content_type, content_qualifier). I'm guessing this is for something like ObsCore's dataproduct_subtype. My question is that I don't really understand what the value is... is it just the tag? or URI for the local vocabulary + tag? The example serializations are no help since the DaCHS ones seem to resolve into a pretty format and I can't see the actual datalink content, and the CADC examples don't populate this field. I'd like to see, either in the document or examples, something more concrete.
  2. Product Type vocabulary: This directly affects the DM group, it'd be used in the Dataset model and ObsCore could be updated to use it as well. The link in the standard resolves to a 2021 version of the vocabulary. At the interop, a 2023 version was discussed which looked like it had some issues. Which vocabulary would support this REC?

-- MarkCresitelloDittmar - 2023-06-21

 

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

Changed:
<
<
-- MarkTaylor - 2023-05-15
>
>
-- MarkTaylor - 2023-05-15
 

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
DAL        
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
<nop>StdProc        


Changed:
<
<

<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->
>
>

<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->
 
META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 9 - 2023-05-15 - MarkTaylor

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at: Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

For example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links that they are not authorized to use and that would be rejected later.
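The link_authorized check described above can be sketched as follows. This is not the AdvancedSearch code: the LinkRow class, the should_offer_download function and the example.org URLs are hypothetical, and the sketch assumes that a missing link_authorized value means the service made no prediction.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinkRow:
    """Hypothetical, minimal view of one row of a links response."""
    access_url: str
    link_auth: Optional[str]          # e.g. "optional"; None if the FIELD is absent
    link_authorized: Optional[bool]   # None if the service gives no prediction


def should_offer_download(row: LinkRow) -> bool:
    """Hide a link only when the service says the caller is not authorized."""
    return row.link_authorized is not False


def visible_links(rows: List[LinkRow]) -> List[str]:
    """Return the URLs a portal would show to the current caller."""
    return [row.access_url for row in rows if should_offer_download(row)]


rows = [
    # Public data: authentication possible but not needed.
    LinkRow("https://example.org/public", "optional", True),
    # Proprietary data seen by an anonymous caller.
    LinkRow("https://example.org/proprietary", "optional", False),
    # Service that does not populate the new FIELDs.
    LinkRow("https://example.org/legacy", None, None),
]
shown = visible_links(rows)
```

Treating a missing value as "show the link" keeps the behavior unchanged for services that have not populated the optional FIELDs.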

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Data Model Working Group

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

Added:
>
>
I don't strictly speaking speak for Operations IG as of this week, but since I did most of the review before my term expired, I'll fill it in here; the TCG can decide whether this counts as an Ops endorsement or not.

As one of the authors I'm basically happy with this document, but I will draw attention to one or two issues.

  • Section 2.2 defines the standardID for this standard as ivo://ivoa.net/std/DataLink#links-1.0, followed by the comment "Note this is applicable to endpoints following any version 1.* of the DataLink standard, to avoid backward compatibility problems." In my opinion the backward compatibility problems are not sufficient to justify this choice, and the minor version should be reflected in this standard ID, i.e. it should be "...#links-1.1". This has been discussed in the open github Issue #96, and other authors seem to agree. A fix will require at least an update to the StandardsRegExt record, and also changes in the document to places where the key is referenced, including Section 2.2 and, especially, Section 3.3.1 as well as related example text. This change would amongst other things make it possible for validators to check which minor version they are supposed to be validating against. PatDowler has volunteered to write a Pull Request addressing this issue, but I can have a go if he doesn't.
  • Section 3.3.1 REQUIRES an INFO defining a suitable standardID for links response tables. The example shows such an INFO element as a child of the RESOURCE/@type="results" element, but it's not clear what restrictions there are on the location - does it have to go there, or can it be elsewhere in the VOTable? This should be clarified. If it's not required to be a child of the results resource, the example text in this section should probably be cut down.
  • Section 3.2.2: The final sentence says "No other additional parameters or client handling are allowed." I don't understand what is meant here. Should this sentence be removed?
  • As mentioned in Issue #82, the recommended MIME type application/x-votable+xml;content=datalink is using a content-type parameter for VOTable not endorsed in the VOTable standard, which is a bit questionable; but this is not new in this version of DataLink and it will hopefully be addressed in VOTable 1.5, so turning a blind eye is probably OK.

-- MarkTaylor - 2023-05-15
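To make the questions about the standardID INFO more concrete, here is a minimal sketch of how a validator might locate that INFO wherever it sits in a links response, report whether it is a direct child of the results RESOURCE, and read a version from the key. The function names and the sample document are illustrative assumptions; the sample uses the #links-1.0 key quoted in the comment.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Illustrative sample only, not taken from a real service.
SAMPLE = """<VOTABLE xmlns="http://www.ivoa.net/xml/VOTable/v1.3" version="1.4">
  <RESOURCE type="results">
    <INFO name="standardID" value="ivo://ivoa.net/std/DataLink#links-1.0"/>
    <TABLE/>
  </RESOURCE>
</VOTABLE>"""


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def find_standard_id_infos(votable_xml: str) -> List[Tuple[str, bool]]:
    """Return (value, is_child_of_results_resource) for each standardID INFO."""
    found: List[Tuple[str, bool]] = []

    def walk(parent: ET.Element) -> None:
        for child in parent:
            if local_name(child.tag) == "INFO" and child.get("name") == "standardID":
                in_results = (
                    local_name(parent.tag) == "RESOURCE"
                    and parent.get("type") == "results"
                )
                found.append((child.get("value", ""), in_results))
            walk(child)

    walk(ET.fromstring(votable_xml))
    return found


def version_from_key(standard_id: str) -> str:
    """Take the text after the last '-' in the key, e.g. '1.0'."""
    return standard_id.rsplit("-", 1)[-1]


infos = find_standard_id_infos(SAMPLE)
version = version_from_key(infos[0][0])
```

If the key carried the minor version, a check like version_from_key would be enough for a validator to pick the rules to apply.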

 

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
DAL        
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
<nop>StdProc        



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 8 - 2023-05-04 - FrancoisBonnarel

 
META TOPICPARENT name="IvoaTCG"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at:
Added:
>
>
Detailed discussion towards 1-1 can also be found on this ivoa twiki page ( last update by FrancoisBonnarel - 2023-05-04):
 

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

For example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links that they are not authorized to use and that would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Data Model Working Group

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
DAL        
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
<nop>StdProc        



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 7 - 2023-04-24 - JamesDempsey

Changed:
<
<
META TOPICPARENT name="DataLink-1_0-Next"
>
>
META TOPICPARENT name="IvoaTCG"
 

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at:

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

For example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links that they are not authorized to use and that would be rejected later.

Implementations Validators

The following validators are available for DataLink

  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
  • Show my DataLink which is part of DaCHS



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Data Model Working Group

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
DAL        
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
<nop>StdProc        



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 6 - 2023-04-21 - MarkTaylor

 
META TOPICPARENT name="DataLink-1_0-Next"

DataLink 1.1 Proposed Recommendation: Request for Comments


Introduction


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

The GitHub repository for issues and source can be found at:

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.

For example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

new CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; if an authorized user makes the call they will see authorized=true. It's hard to demonstrate that for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

Changed:
<
<
TopCAT prototype (http://andromeda.star.bristol.ac.uk/releases/topcat/pre/topcat-full_datalink11.jar) displays additional features in service descriptor and makes use of additional links table FIELD such as content_qualifier, authorization and local_semantics; The tool behavior is adapted to the content of these new FIELDS. For example Actions suggested by TopCat for the links not only depend on content_type but also from content_qualifier.
>
>
TOPCAT v4.8-8 and later displays the additional service descriptor features and makes use of the additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the Activation Actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier, and local_semantics is used to guess which link a user is interested in based on previous selections.
  AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink 1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it uses link_authorized to decide whether to display the download options, which prevents users from selecting downloads/links that they are not authorized to use and that would be rejected later.

Implementations Validators

The following validators are available for DataLink

Changed:
<
<
>
>
  • datalinklint which is part of STILTS. From STILTS v3.4-8 there is a version parameter that can be set to 1.0 or 1.1 (effectively defaults to v1.1)
 



Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document



Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Data Model Working Group

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2023-05-19 - 2023-06-01

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
DAL        
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
<nop>StdProc        



<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 5 - 2023-04-21 - JamesDempsey

 
META TOPICPARENT name="DataLink-1_0-Next"

DataLink 1.1 Proposed Recommendation: Request for Comments

Changed:
<
<
>
>

Introduction

 
Changed:
<
<
DataLink describes the linking of data discovery metadata to access
>
>
DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.
Deleted:
<
<
to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.
  The main changes in v1.1 are
  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore

Latest version of DataLink can be found at:

Added:
>
>
The GitHub repository for issues and source can be found at:
 

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, plus additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

New CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; an authorized user making the same call will see link_authorized=true. This is hard to demonstrate for a general audience.
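The two CADC examples above suggest how a client might read the link_auth and link_authorized pair. A minimal Python sketch follows, assuming link_auth arrives as one of the strings false, optional or true and link_authorized as a boolean; the function name access_hint and the labels it returns are hypothetical and not part of DataLink.

```python
def access_hint(link_auth, link_authorized):
    """Suggest what a client could do with one row of a {links} response.

    Assumed inputs (not confirmed by this page): link_auth is one of
    "false", "optional", "true"; link_authorized is a bool.
    """
    if link_authorized:
        # The current caller (anonymous or not) may use the link.
        return "use"
    if link_auth in ("optional", "true"):
        # Not authorized as called, but authenticating might help.
        return "authenticate"
    # No authentication route is advertised, so the link is unusable here.
    return "skip"


# IRIS case from the text: public data, authentication optional.
print(access_hint("optional", True))
# CFHT case from the text: proprietary data, anonymous caller.
print(access_hint("optional", False))
```

Whether a client should actually prompt for credentials in the second case is a design choice left to the application.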

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT prototype (http://andromeda.star.bristol.ac.uk/releases/topcat/pre/topcat-full_datalink11.jar) displays the additional service descriptor features and makes use of additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier.

Changed:
<
<
AladinDesktop is going to adapt to those new features too. (see prototype screenshot)
>
>
AladinDesktop is going to adapt to those new features too. (see prototype screenshot)
  The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink -1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it makes use of link_authorized to decide to display the download options (or not), which prevents users from selecting downloads/links when they are not authorized and the request will be rejected later.
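The AdvancedSearch behavior described above can be sketched roughly as follows. This is an illustrative Python fragment, not the CADC code: rows are modelled as plain dicts, the helper name visible_downloads is hypothetical, and treating a missing link_authorized value as "show the link" is an assumption made here because the field is optional.

```python
def visible_downloads(rows):
    """Keep only rows whose download option should be offered.

    A row without link_authorized (None or absent) is kept, on the
    assumption that a service not providing the optional field gives
    the client no reason to hide the link.
    """
    return [row for row in rows if row.get("link_authorized") is not False]


# Hypothetical rows; the URLs are placeholders, not real services.
rows = [
    {"access_url": "https://example.org/a", "link_authorized": True},
    {"access_url": "https://example.org/b", "link_authorized": False},
    {"access_url": "https://example.org/c"},
]
for row in visible_downloads(rows):
    print(row["access_url"])
```

Hiding a link is only a convenience for the user; the service still has to enforce authorization when the link is followed.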

Implementations Validators

The following validators are available for DataLink



Changed:
<
<

Comments from the IVOA Community during RFC/TCG review period: RFC_start_date - RFC_end_date

>
>

Comments from the IVOA Community during RFC/TCG review period: 2023-04-21 - 2023-05-18

  The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document



Changed:
<
<

Comments from TCG member during the RFC/TCG Review Period: TCG_start_date - TCG_end_date

>
>

Comments from TCG member during the RFC/TCG Review Period: 2023-04-21 - 2023-05-18

  WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Data Model Working Group

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


Changed:
<
<

TCG Vote : Vote_start_date - Vote_end_date

>
>

TCG Vote : 2023-05-19 - 2023-06-01

  If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
DAL        
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
Changed:
<
<
StdProc        
>
>
<nop>StdProc        
 
Changed:
<
<

>
>

<!--
* Set ALLOWTOPICRENAME = TWikiAdminGroup
-->
Deleted:
<
<
<--  
-->
 
META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 42023-04-20 - JamesDempsey

 
META TOPICPARENT name="DataLink-1_0-Next"

DataLink 1.1 Proposed Recommendation: Request for Comments

Changed:
<
<
>
>
 

DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore
Deleted:
<
<
 Latest version of DataLink can be found at:

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, plus additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

New CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; an authorized user making the same call will see link_authorized=true. This is hard to demonstrate for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT prototype (http://andromeda.star.bristol.ac.uk/releases/topcat/pre/topcat-full_datalink11.jar) displays the additional service descriptor features and makes use of additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier.

AladinDesktop is going to adapt to those new features too. (see prototype screenshot)

Changed:
<
<
The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink-1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it makes use of link_authorized to decide to display the download options (or not), which prevents users from selecting downloads/links when they are not authorized and the request will be rejected later.
>
>
The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink -1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it makes use of link_authorized to decide to display the download options (or not), which prevents users from selecting downloads/links when they are not authorized and the request will be rejected later.
 

Implementations Validators

The following validators are available for DataLink



Comments from the IVOA Community during RFC/TCG review period: RFC_start_date - RFC_end_date

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document



Comments from TCG member during the RFC/TCG Review Period: TCG_start_date - TCG_end_date

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Data Model Working Group

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : Vote_start_date - Vote_end_date

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
DAL        
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
StdProc        



<--  
-->

META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 32023-04-19 - FrancoisBonnarel

 
META TOPICPARENT name="DataLink-1_0-Next"

DataLink 1.1 Proposed Recommendation: Request for Comments


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
Changed:
<
<
  • Service descriptors can include exampleURL and contentType param(s)
>
>
  • Service descriptors can include exampleURL and contentType param(s), as well as DESCRIPTION, name, etc...
 
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response
Added:
>
>
  • Added content_qualifier FIELD to inform on the nature of the link target
  • Added local_semantics to identify similar links in the same DataLink service for different IDs
  • Mechanisms to recognize {links} endpoints outside ObsCore
 
Added:
>
>
 Latest version of DataLink can be found at:

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, plus additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values


CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:

  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

New CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; an authorized user making the same call will see link_authorized=true. This is hard to demonstrate for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

Client side

TOPCAT prototype (http://andromeda.star.bristol.ac.uk/releases/topcat/pre/topcat-full_datalink11.jar) displays the additional service descriptor features and makes use of additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier.

Changed:
<
<
AladinDesktop is going to adapt to those new features too.
>
>
AladinDesktop is going to adapt to those new features too. (see prototype screenshot)
  The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink-1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it makes use of link_authorized to decide to display the download options (or not), which prevents users from selecting downloads/links when they are not authorized and the request will be rejected later.

Implementations Validators

The following validators are available for DataLink



Comments from the IVOA Community during RFC/TCG review period: RFC_start_date - RFC_end_date

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document



Comments from TCG member during the RFC/TCG Review Period: TCG_start_date - TCG_end_date

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Data Model Working Group

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : Vote_start_date - Vote_end_date

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
DAL        
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
StdProc        



<--  
-->
Added:
>
>
META FILEATTACHMENT attachment="AladinDesktopDataLink1-1.png" attr="" comment="" date="1681920946" name="AladinDesktopDataLink1-1.png" path="AladinDesktopDataLink1-1.png" size="1238442" user="FrancoisBonnarel" version="1"

Revision 22023-04-14 - PatrickDowler

 
META TOPICPARENT name="DataLink-1_0-Next"

DataLink 1.1 Proposed Recommendation: Request for Comments


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s)
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response

Latest version of DataLink can be found at:

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, plus additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values

Added:
>
>

 
Added:
>
>
CADC has implemented the following in https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops:
  • INFO element with standardID in links response
  • new optional fields in links response: local_semantics (no content yet but can be populated with default vocab in most cases), content_qualifier (no content, not likely to use), link_auth, link_authorized
  • contentType param in service descriptors (where applicable)
IRIS image: https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/IRIS?f212h000/IRAS-25um shows link_auth=optional and link_authorized=true because one can authenticate but the data is public.

New CFHT data: anonymous use of https://ws.cadc-ccda.hia-iha.nrc-cnrc.gc.ca/caom2ops/datalink?ID=ivo://cadc.nrc.ca/CFHT?2773629/2773629o shows link_auth=optional and link_authorized=false because the data is still proprietary and the caller is anonymous; an authorized user making the same call will see link_authorized=true. This is hard to demonstrate for a general audience.

The core CADC implementation is available as a library (cadc-datalink-server) in MavenCentral with source code at https://github.com/opencadc/dal.git; the caom2-specific logic is available in a library (caom2-datalink-server) with source at https://github.com/opencadc/caom2service.git -- the core lib is also used in ALMA DataLink service but may not yet be released with the latest features.

 

Client side

TOPCAT prototype (http://andromeda.star.bristol.ac.uk/releases/topcat/pre/topcat-full_datalink11.jar) displays the additional service descriptor features and makes use of additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier.

AladinDesktop is going to adapt to those new features too.

Added:
>
>
The CADC DownloadManager (https://github.com/opencadc/apps.git) includes a simple DataLink client class so it can resolve publisherID values into 1..* URLs for download; this code hasn't changed as a result of DataLink-1.1. The CADC AdvancedSearch web portal makes calls to the above caom2ops/datalink service to find previews and download info for each row (publisherID): it makes use of link_authorized to decide to display the download options (or not), which prevents users from selecting downloads/links when they are not authorized and the request will be rejected later.
 

Implementations Validators

The following validators are available for DataLink



Comments from the IVOA Community during RFC/TCG review period: RFC_start_date - RFC_end_date

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document



Comments from TCG member during the RFC/TCG Review Period: TCG_start_date - TCG_end_date

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Data Model Working Group

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : Vote_start_date - Vote_end_date

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
DAL        
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
StdProc        



<--  
-->
Deleted:
<
<

Revision 12023-04-14 - JamesDempsey

 
META TOPICPARENT name="DataLink-1_0-Next"

DataLink 1.1 Proposed Recommendation: Request for Comments


DataLink describes the linking of data discovery metadata to access to the data itself, further detailed metadata, related resources, and to services that perform operations on the data.

The main changes in v1.1 are

  • Generalize by adding use cases for links to content other than data files
  • VOSI-availability and VOSI-capabilities endpoints are now optional
  • Service descriptors can include exampleURL and contentType param(s)
  • Added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
  • INFO element with standardID mandatory in {links} response

Latest version of DataLink can be found at:

Reference Interoperable Implementations

Server side

GAVO implemented the following changes: content_qualifier and local_semantics, plus additional service descriptor features such as DESCRIPTION, name, content_type, etc.

As an example, a couple of links responses for various obscore/sia/ssa tables on the GAVO server

All these are combined with various semantics or content_type values

Client side

TOPCAT prototype (http://andromeda.star.bristol.ac.uk/releases/topcat/pre/topcat-full_datalink11.jar) displays the additional service descriptor features and makes use of additional links table FIELDs such as content_qualifier, authorization and local_semantics; the tool's behavior adapts to the content of these new FIELDs. For example, the actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier.

AladinDesktop is going to adapt to those new features too.

Implementations Validators

The following validators are available for DataLink



Comments from the IVOA Community during RFC/TCG review period: RFC_start_date - RFC_end_date

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document



Comments from TCG member during the RFC/TCG Review Period: TCG_start_date - TCG_end_date

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their input is not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Data Model Working Group

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : Vote_start_date - Vote_end_date

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
DAL        
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
StdProc        



<--  
-->
 
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.