This topic collects proposals for modifications of the DataLink-1.0 specification in order to improve the next revision of the specification.
Errata to the DataLink-1.0 recommendation can be found on the dedicated DataLink-1.0 Errata page.
The following are acknowledged mistakes in DataLink-1.0. Errata could be pushed through, or they could just get fixed at the next version.
These points were (mostly) taken from a presentation in Victoria 2018, written up here as requested by the DAL chair. They can be taken into account when preparing a subsequent version of the standard, though not all of them necessarily should lead to changes.
They have to be treated differently in the different cases (neither is a special case of the other), since in the standalone case the service descriptor applies equally to all rows, while in the {links}-response case it only applies to those rows from which it is explicitly referenced. This makes it somewhat complicated to handle them since you need to determine the context first, but there's probably nothing that can be done about that while maintaining backward compatibility. However, it would be useful to spell out this distinction in the document; it took me quite a while to work it out. See mailing list.
name attribute and DESCRIPTION child of the service descriptor RESOURCE element. That is permitted by the VOTable schema, but not mentioned in this document. I suggest including such usages in the examples given here, and encouraging service descriptors to add these items where appropriate. I further suggest adding an (optional) name="contentType" PARAM alongside the existing ones in table 3 to supply the MIME type where known.
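As an illustration of the suggestion above, here is a minimal Python sketch that builds a service descriptor RESOURCE carrying a name attribute, a DESCRIPTION child and a contentType PARAM. The service name, description text, access URL and MIME type are all made-up placeholders, and contentType is only the PARAM name proposed here, not something defined in DataLink-1.0.

```python
import xml.etree.ElementTree as ET

# A service descriptor is a RESOURCE with type="meta" and utype="adhoc:service".
# The name attribute is the suggested human-readable label (hypothetical value).
resource = ET.Element("RESOURCE", {
    "name": "cutout",
    "type": "meta",
    "utype": "adhoc:service",
})

# DESCRIPTION is added first so it precedes the PARAMs, as the VOTable schema expects.
description = ET.SubElement(resource, "DESCRIPTION")
description.text = "Extract a sub-image around a position."  # hypothetical text


def add_param(parent, name, value):
    """Append a string-valued PARAM to the given element."""
    return ET.SubElement(parent, "PARAM", {
        "name": name,
        "datatype": "char",
        "arraysize": "*",
        "value": value,
    })


add_param(resource, "accessURL", "http://example.org/cutout")  # placeholder URL
add_param(resource, "contentType", "image/fits")  # the suggested optional PARAM

xml_text = ET.tostring(resource, encoding="unicode")
print(xml_text)
```

A client could then show the name and DESCRIPTION to the user when offering the service, and use contentType to decide how to handle the result.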
semantics and description columns, but it's a bit haphazard. I suggest a new (optional?) column named something like link_code that can be compared for equality in order to identify corresponding rows.
-- MarkTaylor - 2018-06-13
Suggestions for the revision of DataLink-1.0, in terms of new features.
Notes by MarcoMolinaro and FrancoisBonnarel from a splinter session held during the Paris Interop meeting on May 16th, 5:30-7 PM
Around 15 IVOA partners discussed DataLink evolution proposals. Among them were Pat Dowler, Markus Demleitner, Laurent Michel, Mark Taylor, Tom McGlynn, Alberto Micol, Marco Molinaro, Anais Oberto, Gregory Mantelet, François Bonnarel
Apologies to anyone we forgot. Please add your name above.
The starting points were the feedback discussions we had during the last years in the DAL working group.
The main issues have been summarized in this IVOA note : http://www.ivoa.net/documents/Notes/RecentDALProtocolsFeedback/index.html
A proposal for changes has been presented at College Park : https://wiki.ivoa.net/internal/IVOA/InterOpNov2018DAL/DataLink-next.pdf
An attempt for a new draft is now available here
These are the changes which have been discussed (the item numbers are those used in the College Park presentation):
1 and 2) - Extension of the scope of DataLink {links} response to items which are different from datasets discovered in whatever way.
This point comes from data providers willing to use DataLink for attaching datasets or additional information to sources in catalogues or other items in service responses. This new usage has to be reflected in introduction and use cases.
The discussion on this has been moved to two github/ivoa-std/DataLink issues : https://github.com/ivoa-std/DataLink/issues/6 and https://github.com/ivoa-std/DataLink/issues/7
3 ) - The extension of the scope makes the linkage to the {links} response occur in contexts not planned by the original spec. Besides the access.format/access.reference pair of FIELDs/PARAMs, which can be used in ObsCore-like contexts, the only generic solution previously proposed to address this response in a VOTable was to use a service descriptor RESOURCE to define the URL of the {links} service, with a reference for the ID param. There is a proposal to also use the LINK element inside a FIELD with a new content-type="application/x-votable+xml;content=datalink", where the FIELD directly contains the URL of the {links} endpoint. A new section for that seems too much. This will come in an appendix.
The discussion on this has been moved to the following github/ivoa-std/DataLink issue: https://github.com/ivoa-std/DataLink/issues/31
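To make the LINK proposal in item 3 more concrete, here is a rough Python sketch of how a client might find the columns that hold {links} URLs. The sample table and its field names are invented for illustration, and the sample omits the VOTable XML namespace for brevity; a real client would have to take the namespace into account.

```python
import xml.etree.ElementTree as ET

# Content type proposed in item 3 for marking a column of {links} URLs.
DATALINK_TYPE = "application/x-votable+xml;content=datalink"

# Hypothetical, namespace-free table fragment used only for this sketch.
SAMPLE = """<TABLE>
  <FIELD name="obs_id" datatype="char" arraysize="*"/>
  <FIELD name="dl_url" datatype="char" arraysize="*">
    <LINK content-type="application/x-votable+xml;content=datalink"/>
  </FIELD>
</TABLE>"""


def datalink_fields(table_xml):
    """Return the names of FIELDs whose LINK child marks them as {links} URLs."""
    root = ET.fromstring(table_xml)
    names = []
    for field in root.iter("FIELD"):
        for link in field.findall("LINK"):
            # Ignore case and stray spaces when comparing the media type.
            declared = link.get("content-type", "").replace(" ", "").lower()
            if declared == DATALINK_TYPE:
                names.append(field.get("name"))
                break
    return names


print(datalink_fields(SAMPLE))
```

With this kind of marking, the client does not need a service descriptor to know that each cell of the column is a ready-to-use {links} URL.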
4 ) - The DataLink {links} response can be discovered and used outside a service query. It can be useful to make it recognizable as a {links} response by means of an INFO tag.
The discussion on this has been moved to the following github/ivoa-std/DataLink issue: https://github.com/ivoa-std/DataLink/issues/17
5 ) - Allowing fragments in the access_url seems to be a sensible thing to do considering multi-extension FITS, tar files, HDF5 and other structured data available. The issues to be solved, however, relate to providing the client with enough information to consume this solution. Prototyping on direct use cases could help. It is an open question whether the {links} response is used to provide one row, with specific semantics and description, for each subpart that is described but not separately retrievable, or whether each subpart can be retrieved via an extension of SODA. (See SODA 1.1)
The discussion on this has been moved to the following github/ivoa-std/DataLink issue: https://github.com/ivoa-std/DataLink/issues/15
6 ) - The "description" column in the {links} response needs to be a SHOULD in order to properly label the various links made available, especially when they share the same semantics. This is very useful for the end user of a {links} response table.
The discussion on this has been moved to the following github/ivoa-std/DataLink issue: https://github.com/ivoa-std/DataLink/issues/16
7 ) - There is an obvious need for new vocabulary. See: https://wiki.ivoa.net/twiki/bin/view/IVOA/UpdateDatalinkTerms But the semantics/vocabulary discussion is detached from the DataLink specification revision. I.e. it's fine to discuss it, but not within the scope of the document revision.
The discussion actually occurred in the following threads
http://mail.ivoa.net/pipermail/dal/2019-October/008191.html
http://mail.ivoa.net/pipermail/dal/2019-October/008200.html
http://mail.ivoa.net/pipermail/dal/2019-October/008202.html
8 ) - In order to connect a resource table to a resource service descriptor, Mark Taylor proposed adopting a nested resource schema. Considering that we are currently mainly in the situation of one table response per query, it doesn't seem critical at this stage, but it needs more discussion and testing. Tom McGlynn proposed another solution, which was not well captured by the editor. Tom, could you please add your proposal here?
The discussion on this has been moved to the following github/ivoa-std/DataLink issue: https://github.com/ivoa-std/DataLink/issues/20
9 ) - We propose to add a free-text name of the service descriptor resource to help identify the offered services. With a SHOULD requirement.
The discussion on this has been moved to the following github/ivoa-std/DataLink issue: https://github.com/ivoa-std/DataLink/issues/21
10 ) - We propose to add an optional "content-type" resource descriptor PARAMETER to identify the expected media type of the offered linked dataset/resource. This can also be considered as a new SODA-1.1 input parameter for driving format conversion.
The discussion on this has been moved to the following github/ivoa-std/DataLink issue: https://github.com/ivoa-std/DataLink/issues/22
11 ) - It COULD be useful to provide a human-readable description of a service descriptor. This will be done by using a
The discussion on this has been moved to the following github/ivoa-std/DataLink issue: https://github.com/ivoa-std/DataLink/issues/23
12 ) - A self-describing service provides a service descriptor when queried with no input parameter. If queried with only the identifier PARAMETER, the provided service descriptor restricts parameter ranges (MIN/MAX) or OPTIONS to values adapted to the queried dataset or item.
The discussion on this has been moved to the following github/ivoa-std/DataLink issue: https://github.com/ivoa-std/DataLink/issues/25
13 ) - ReST Interface descriptors. These could be useful for VOSpace or any URL with variable sections. It may be better to refer to the existing Recommendation (https://tools.ietf.org/html/rfc6570 ) discussing this than to reinvent a ReST descriptor on our own. Actual implementation may be postponed until use cases/prototypes are made available.
The discussion on this has been moved to the following github/ivoa-std/DataLink issue: https://github.com/ivoa-std/DataLink/issues/27
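For readers unfamiliar with RFC 6570, the following Python sketch shows only its simplest form (Level 1, plain {var} substitution with percent-encoding). It is not a full URI Template implementation, and the template URL and variable names are invented for the example.

```python
import re
from urllib.parse import quote


def expand_simple(template, variables):
    """Expand {name} expressions, percent-encoding everything but unreserved characters."""

    def substitute(match):
        value = variables.get(match.group(1))
        if value is None:
            # In RFC 6570 simple expansion, an undefined variable contributes nothing.
            return ""
        return quote(str(value), safe="")

    return re.sub(r"\{([A-Za-z0-9_]+)\}", substitute, template)


# Hypothetical template for a URL with variable path sections.
template = "http://example.org/vospace/nodes/{container}/{name}"
url = expand_simple(template, {"container": "my data", "name": "cube.fits"})
print(url)  # http://example.org/vospace/nodes/my%20data/cube.fits
```

A descriptor built on such templates would only need to declare the template string and the meaning of each variable, leaving the expansion rules to the RFC.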
14 ) - DataLink recognition outside a response from a protocol. There was some discussion of the new solution proposed in the Note (a new content-type in the LINK element), especially since the identification of this link column (sort of) replicates the content that can be provided by a proper service descriptor.
The discussion on this has been moved to the following github/ivoa-std/DataLink issue: https://github.com/ivoa-std/DataLink/issues/29
Extra#0 : Should we suppress the availability endpoint in Section 2 (which, according to Markus, never works)? Should we focus only on {links} capabilities attached to other services (and not on standalone DataLink services)? This point is still controversial.
The discussion on this has been moved to the two github/ivoa-std/DataLink issues: https://github.com/ivoa-std/DataLink/issues/13 and https://github.com/ivoa-std/DataLink/issues/14
Extra#1 : from A. Micol. Addition of a "category" column to identify the different datasets offered. Isn't that tackled by new semantics terms? There is reluctance to add too many columns belonging to other protocols (ObsCore data_product_type). Alberto should add his proposal to the -Next page. Discussion to follow.
The discussion actually occurred in the following threads
http://mail.ivoa.net/pipermail/dal/2019-October/008191.html
http://mail.ivoa.net/pipermail/dal/2019-October/008200.html
http://mail.ivoa.net/pipermail/dal/2019-October/008202.html
Extra#2 : use case for an additional boolean column to quickly identify link elements that require authorization (see below, PatrickDowler).
The discussion on this has been moved to the following github/ivoa-std/DataLink issue: https://github.com/ivoa-std/DataLink/issues/33
-- FrancoisBonnarel - 2019-07-20
We could add non-normative advice to include a column in the {links} resource named "readable" with values true|false. The values predict whether the client will (using the current anonymous or authenticated identity) be allowed to use the link (e.g. download the data). This saves clients the annoyance of trying and getting a 403 Permission Denied.
-- PatrickDowler - 2019-05-16
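A client-side sketch of how the suggested "readable" column might be used, in Python. The rows, URLs and semantics values below are invented, and rows without a readable value are treated as "unknown, so still worth trying", which is an assumption of this sketch rather than part of the proposal.

```python
# Hypothetical rows of a {links} response, already parsed into dictionaries.
rows = [
    {"access_url": "http://example.org/data/1.fits", "semantics": "#this", "readable": True},
    {"access_url": "http://example.org/data/1-raw.fits", "semantics": "#progenitor", "readable": False},
    {"access_url": "http://example.org/data/1.png", "semantics": "#preview"},  # no readable value
]


def usable_links(link_rows):
    """Keep the links the service does not predict to be forbidden for this identity."""
    return [row for row in link_rows if row.get("readable", True)]


for row in usable_links(rows):
    print(row["semantics"], row["access_url"])
```

The links flagged as not readable can still be listed to the user, for instance greyed out, without the client having to request them and receive a 403 first.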
_As I see it, the things we are discussing concerning Datalink fall into 4 independent levels or categories:
Level 0 - Data-format (fits, VOTable, PDF, png, …)
Level 1 - Data-type (tabular, image, spectrum, cube, text, …)
Level 2 - Data-information (Documentation, Calibration, Log, Preview, …)
Level 3 - Data-relation (Derived from, Progenitor of, Sibling of, ...)
I see these as orthogonal levels since a list of links can be of any type (level 1) with any kind of format (level 0), any kind of relation (level 3) and could have any type of associated information to describe it (level 2).
Today the list of links returned by datalink is described in the columns content-type and semantics. These two columns cover the above levels only up to some degree.
Content-type: covers level 0 mainly, with some exceptions such as VOTable (which is also level 1). Semantics: covers level 2 mainly (e.g. preview), but also level 3 (e.g. derivation, progenitor).
Datalink at the moment has no field properly covering level 1 and applications (—> users) would benefit from having that well covered.
So, in my opinion, if I had to redo Datalink I would keep these different levels separated instead of putting everything into the semantics field. But applications might have a different point of view here —> Shouldn't we add Apps to this discussion?
Timeseries would be in level 3, since it is a relation. And I don’t think we would need the use of sibling or progenitor or anything like that for timeseries. What we need is to be able to say:
"This list of links are time-series of tabular type"
"This list of links are time-series of spectrum type"
…
But if we were to add terms such as sibling and so on, there is already an IVOA relationship vocabulary: http://ivoa.net/rdf/voresource/relationship_type/2016-08-17/relationship_type.html_
" If the client submits more ID values than a service is prepared to process, the service should process ID values up to the limit and must include an overflow indicator in the output as described in DALI. The service must not truncate the output within the set of rows (links) for a single ID value if the request exceeds such an input limit."
Control by MAXREC or by telling the client there is an overflow ?
None of these is in 1.0, nor in 1.1 at the moment. Mark Taylor seems to be OK with QUERY_STATUS OVERFLOW. Alberto and ESO have another solution for tuning the number of output lines.
Thoughts?
-- MarkusDemleitner - 2020-05-13 answered
" If the client submits more ID values than a service is prepared to process, the service should process ID values up to the limit and must include an overflow indicator in the output as described in DALI. The service must not truncate the output within the set of rows (links) for a single ID value if the request exceeds such an input limit."
For the record, that's not my text, that's Datalink REC-1.0, p. 10. The context is this thread on the DAL list: http://mail.ivoa.net/pipermail/dal/2020-March/008318.html.
Control by MAXREC or by telling the client there is an overflow ?
Thoughts ?
I think I agree with Mark's assessment in the cited thread: QUERY_STATUS=OK and QUERY_STATUS=ERROR aren't useful in Datalink, and hence we shouldn't put them in. Also, it doesn't seem there's much of a place for MAXREC in Datalink.
Hence, I think there's no immediate need for changes in the spec text, let alone the spec content from this issue.
One might argue for writing something like:
No QUERY_STATUS INFOs with values other than OVERFLOW should be produced by datalink services.
That's probably benign, since we can't change the overflow indication in DALI anyway when it is directly referenced by implemented standards, and thus we can hard-code QUERY_STATUS here, too. It would perhaps have saved me a bit of bafflement. On the other hand: has this ever baffled anyone else? And so badly as to justify more spec text?
Similarly, perhaps it is worth saying somewhere that DALI MAXREC doesn't apply to Datalink, but I couldn't say where that text would fit without seeming odd itself.
So... I think my vote would be for closing this issue without action.
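To illustrate the behaviour described in the REC-1.0 passage quoted above, here is a small server-side sketch in Python. The function name, the lookup callable and the limit are hypothetical; the point is only that the cut is made between ID values, never inside the rows of a single ID, and that an overflow flag is raised when IDs are dropped. A service would then translate the flag into the DALI overflow indicator (an INFO with name QUERY_STATUS and value OVERFLOW).

```python
def links_for_ids(ids, lookup, max_ids):
    """Return (rows, overflow) for the first max_ids ID values.

    lookup is a callable returning all link rows for one ID value; every
    accepted ID contributes its complete set of rows.
    """
    accepted = ids[:max_ids]
    overflow = len(ids) > max_ids
    rows = []
    for id_value in accepted:
        rows.extend(lookup(id_value))
    return rows, overflow


def fake_lookup(id_value):
    """Stand-in for a real service: two links per dataset identifier."""
    return [(id_value, "#this"), (id_value, "#preview")]


rows, overflow = links_for_ids(["ivo://x/a", "ivo://x/b", "ivo://x/c"], fake_lookup, max_ids=2)
print(len(rows), overflow)  # 4 True
```

Note that MAXREC plays no role here: the limit applies to the number of input ID values, not to the number of output rows.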
• added optional content_qualifier to describe link target content with terms from the product-type vocabulary
• added optional link_auth and link_authorized to signal whether authentication is necessary to use the link
• clarified use of multiple ID values and possible OVERFLOW
• clarified use of utype for self-describing service descriptors
• clarified use of semantics
• generalized by adding use cases for links to content other than data files
• added using LINK to convey when datalink request URL is in a table column
• service descriptors can include a contentType param to describe service output and should include a name and description
• service descriptors can include exampleURL param(s) with working example and description
• VOSI-availability and VOSI-capabilities endpoints are now optional
* List of changes of version 1.1 with respect to version 1.0 (2015) and corresponding DataLink issues on github repository.
see : https://github.com/ivoa-std/DataLink for details
-- FrancoisBonnarel - 2023-01-17
List of changes following the RFC period and TCG vote
-- FrancoisBonnarel - 2024-03-14
* discussed, but not integrated in 1.1 yet :
* Implementation : server side.
GAVO implemented the following changes: content_qualifier and local_semantics, and additional service descriptor features such as DESCRIPTION, name, content_type, etc.
As examples, here are a few {links} responses for various ObsCore/SIA/SSA tables on the GAVO server:
rosat.images : http://dc.zah.uni-heidelberg.de/rosat/q/dl/dlmeta?ID=ivo%3a%2f%2forg.gavo.dc%2f~%3frosat%2fimage_data%2frda_4%2fwg400138p_n1_p1_r2_f2_p1%2frp400138n00_im3.fits.gz where some links have content_qualifier = #image, others have content_qualifier = #event or #timeseries
lamost6.lrs : http://dc.zah.uni-heidelberg.de/lamost6/q/sdl_lrs/dlmeta?ID=ivo%3a%2f%2forg.gavo.dc%2f~%3flamost6%2flrs%2f373006237 where some links have content_qualifier = #spectrum
califadr3.cubes : http://dc.zah.uni-heidelberg.de/califa/q3/dl/dlget?ID=ivo%3A//org.gavo.dc/~%3Fcalifa/datadr3/COMB/ARP220.COMB.rscube.fits where some links have content_qualifier = #cube
ppakm31.maps : http://dc.zah.uni-heidelberg.de/ppakm31/q/cdl/dlmeta?ID=ivo%3a%2f%2forg.gavo.dc%2f~%3fppakm31%2fdata%2fPPAK_M31_F5_cube.fits where some links have local_semantics filled in and content_qualifier equals #image or #cube
All these are combined with various semantics or content_type values
* implementation : client side
A TOPCAT prototype (http://andromeda.star.bristol.ac.uk/releases/topcat/pre/topcat-full_datalink11.jar) displays the additional features in service descriptors and makes use of additional links table FIELDs such as content_qualifier, authorization and local_semantics. The tool's behavior adapts to the content of these new FIELDs. For example, the actions suggested by TOPCAT for the links depend not only on content_type but also on content_qualifier.
AladinDesktop is going to adapt to those new features too.
-- FrancoisBonnarel - 2023-01-06
Attachment | History | Size | Date | Who | Comment |
---|---|---|---|---|---|
DataLink-20200505.pdf | r1 | 382.0 K | 2020-05-14 - 16:58 | FrancoisBonnarel | DataLink 1.1 version For virtual interop May 2020 |
DataLink.pdf | r1 | 402.2 K | 2019-07-22 - 06:43 | FrancoisBonnarel | DataLink 1.1 preliminary internal working draft |