IVOA Identifiers Version 2 Proposed Recommendation: Request for Comments
IVOA Identifiers describes the syntax and semantics of the IVOA's special URIs. Such URIs are being used in the Registry to identify records, but IVOIDs are also used to denote datasets of all kinds, reference standards, and more.
The latest version of Identifiers 2 can be found at:
See also
Examples of the validator.
SVN revision references in the text refer to the repository at
https://volute.g-vo.org/svn/trunk/projects/registry/Identifiers, where you'll also find bleeding edge versions not yet published in the repository.
Reference Interoperable Implementations
Doesn't really apply here, as no protocol is defined. However, IVOA identifiers are in daily use in the Registry, and we preserve the Version 1 properties that make that work.
The special form of Dataset ids has been in use in the GAVO data center for a while. We will try to have other data centers on board for TCG review and provide a reference service implementing the recommended procedure for resolving these things.
The special form of standard ids has, essentially, been used for VOSI URLs and RegTAP. TAP 1.1 will probably be the first standard to fully use the patterns proposed here.
Implementations Validators
There is a validator for IVOIDs at
http://dc.g-vo.org/ivoidval/q/val/form
As a somewhat-validator for PubDIDs -- it checks whether a PubDID is globally resolvable according to the recipe in the document --, there's also
http://dc.g-vo.org/ivoidval/q/didresolve/form.
RFC Review Period: 2015-07-22 - 2015-09-01
TCG Review Period: 2015-10-12 - 2015-11-20
Comments from the IVOA Community during RFC period: 2015-07-22 - 2015-09-01
In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.
Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document.
Basically clear, it's good to have a detailed definition of Identifiers for reference. However I have some comments:
- Sec 2: The term "resource key" is used in several places starting in sec 2.1 before it is defined, even informally. A note of what it means should be added early in sec 2.1; maybe add it as an annotation on the graphic that already illustrates "IVORN" and "local part".
- Text change in Rev. 3068 -- MD
- Sec 2.1: The term "IVORN" is introduced here to mean the ivo://<authority>/<path> part of the Identifier. I'm not sure if a formal definition of IVORN has ever existed up till now, the term didn't appear in the previous version (1.12) of the Identifiers standard, and certainly I wouldn't have been able to tell you the difference between an IVORN and an ivo-id. I do think the definition proposed here is useful, but historically that term doesn't seem to have been used consistently in that sense. A couple of examples of historical usage contrary to the current definition are "The IVORN for the only language mandatory for TAP services, ADQL 2.0, is ivo://ivoa.net/std/ADQL#v2.0" (TAPRegExt sec 2.3), and "... declaring support for the data model Registry 1.0 with the IVORN ivo://ivoa.net/std/RegTAP#1.0" (RegTAP sec 7); in both these cases the strings are not IVORNs in the proposed sense since they contain fragments. There are probably other examples out there. So, I raise the question of whether we can redefine this term now in the face of contrary historical usage. If we do (which I think would be OK), there should at least be a note in this document to explain how previous contrary usage should be treated - either say it's just wrong, or say that it used to have a more sloppy definition but when used with explicit reference to this document it has the more specific meaning.
- Thanks for doing my homework. I finally could be bothered to actually ask a search engine for all existing uses of IVORN on ivoa.net, and it turned out there's a large body of usage for that term in the VOEvent community. Of course, that usage conflicts with usage in Registry. So, it's a mess, and as IVORN really hasn't been a good term in the first place and the W3C has deprecated URN, I'm now proposing to deprecate IVORN and say "registry reference" or "registry part" instead. Rev. 3069 -- MD
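To make the distinction discussed in this item concrete, here is a minimal sketch of how the two strings quoted above split into a registry part and a fragment. The function name `split_ivoid` is hypothetical and not part of any IVOA library; the sketch relies only on generic URI splitting from Python's standard library and performs no IVOID validation.

```python
from urllib.parse import urlsplit


def split_ivoid(ivoid):
    """Split an IVOID into (registry_part, query, fragment).

    Hypothetical helper for illustration only: it uses generic URI
    splitting and does not check the IVOID grammar.
    """
    parts = urlsplit(ivoid)
    if parts.scheme != "ivo":  # urlsplit lowercases the scheme
        raise ValueError("not an ivo:// URI: %r" % ivoid)
    # Registry part: scheme, authority, and path, without query/fragment.
    registry_part = "ivo://" + parts.netloc + parts.path
    return registry_part, parts.query, parts.fragment


print(split_ivoid("ivo://ivoa.net/std/ADQL#v2.0"))
print(split_ivoid("ivo://ivoa.net/std/RegTAP#1.0"))
```

Both quoted strings carry a non-empty fragment, which is why they are not registry parts in the sense proposed in the document.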
- Sec 2.3.3: There seems to be some inconsistency about whether the resource key and/or <path> (which are equated to each other in the first sentence of this section) contains a leading solidus ("/"). The illustration in sec 2.1 suggests not, but the restriction to <path-abempty> would seem to require that it does.
- The grammar is right, and I fixed the IVORN sketch. I think it's a bit less instructive now, but, sure, instructiveness that leads to confusion is a bad deal. Rev. 3070 -- MD
- Sec 4.2: "Registry interfaces will in general offer features for comparing such identifiers with regular expressions ... For instance, with RegTAP an exampleProto 1.0 client would look for capabilities for which standard_id LIKE 'ivo://ivoa.net/std/exampleProto#query-1.%'". It's a nitpick, but I wouldn't say the pattern matching language used in SQL is a regular expression; to me (and wikipedia) regexp refers to that syntax used by grep(1) and friends. Replace by "Registry interfaces will typically offer some pattern matching capability for comparing such identifiers ..." ?
- Went in in Rev. 3037 -- MD
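The difference between SQL LIKE patterns and regular expressions raised in this item can be illustrated with a small sketch that translates the two LIKE wildcards into regexp syntax. The helper names `like_to_regex` and `sql_like` are hypothetical, the identifier `exampleProto` is the placeholder from the quoted text, and the sketch ignores ESCAPE clauses and any case folding a real database might apply.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into a compiled regular expression.

    Illustration only: '%' matches any run of characters, '_' matches
    exactly one character; everything else is matched literally.
    """
    out = []
    for ch in pattern:
        if ch == "%":
            out.append(".*")
        elif ch == "_":
            out.append(".")
        else:
            out.append(re.escape(ch))  # e.g. '.' must not act as a wildcard
    return re.compile("".join(out), re.DOTALL)


def sql_like(value, pattern):
    """Return True if value matches the LIKE pattern as a whole."""
    return like_to_regex(pattern).fullmatch(value) is not None


pattern = "ivo://ivoa.net/std/exampleProto#query-1.%"
print(sql_like("ivo://ivoa.net/std/exampleProto#query-1.2", pattern))
print(sql_like("ivo://ivoa.net/std/exampleProto#query-2.0", pattern))
```

Note that the dot in the LIKE pattern is a literal character, whereas an unescaped dot in a grep-style expression would match anything; this is one reason the two notations should not be conflated.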
Plus a few spellings and typos:
- sec 1.3: "necessiates" -> "necessitates"
- sec 2.3.3: "semanitcs" -> "semantics"
- sec 2.6: "fragement" -> "fragment"
- sec 3: "resrouces" -> "resources"
- sec 4.1: "affected by these requirement" -> "affected by these requirements"
- Went in in Rev. 3037. And thanks for the feedback, much appreciated -- MD
--
MarkTaylor - 2015-08-20
I'm not clear that the term <unreserved> is defined anywhere in the document. Presumably it refers to some range of characters defined somewhere (in RFC 3986 perhaps?) but that's not stated. I'd suggest the unreserved characters be defined explicitly here rather than requiring the user to find yet another document.
- The trouble is that I really want to avoid copying the grammar from RFC 3986, partly because it's fairly long, partly because I don't want to have to update Identifiers when they change/fix things, partly because there are plenty of implementations of RFC 3986 out there and people shouldn't need to check them for compliance. The document says "The rules from RFC 3986 are assumed throughout" in the "Usage of ABNF" section, which, true, lies in the "oh, boring boilerplate text" section of the document. Where would you have expected this statement? -- MD
While this is a relatively straightforward standard, I think it could be even clearer if there were examples given throughout the document and not just in 2.1. This is true of almost all IVOA standards. In particular I believe there should be labeled examples of both legal and illegal usage where at least one example is given for each class of violation and where we give legal examples for both normal usage and to illustrate the limits of valid code.
E.g., Close 2.3.2 Authority with:
Examples:
Legal:
  nasa.heasarc
  n_1a.alph0.02
  123.sub [Can start with a number]
Illegal:
  a2 [Not three characters]
  .mydata.xxx [Does not start with alphanumeric character]
  _mydata.xxx [Does not start with alphanumeric; if _ is not considered such I'd put this in because in some systems you can start with an _ when you cannot with a number]
  data%20.space [Includes percent encoded character]
  data^2.info [Includes reserved character]
- Good point. Some went in in SVN rev. 3071. If someone has further interesting cases, please let me know -- MD
Note that if % is not one of the unreserved characters, then the first MUST NOT is redundant.
- It's not really, as you can, in principle, percent-encode unreserved characters. You can't in IVOIDs, though. -- MD
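The legal and illegal authority examples listed in this comment can be checked mechanically. The following is a sketch under stated assumptions: the pattern is inferred from the reasons given in the comment (at least three characters, an alphanumeric first character, only RFC 3986 unreserved characters, no percent-encoding), and the names `AUTHORITY_RE` and `is_valid_authority` are hypothetical. The grammar in the Identifiers document remains the authoritative definition.

```python
import re

# Assumption inferred from the examples above: one alphanumeric character
# followed by at least two RFC 3986 unreserved characters
# (letters, digits, '-', '.', '_', '~'). '%' is not in the set, so
# percent-encoded sequences are rejected as well.
AUTHORITY_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._~-]{2,}")


def is_valid_authority(authority):
    """Return True if the string matches the assumed authority pattern."""
    return AUTHORITY_RE.fullmatch(authority) is not None


for candidate in ["nasa.heasarc", "n_1a.alph0.02", "123.sub",
                  "a2", ".mydata.xxx", "_mydata.xxx",
                  "data%20.space", "data^2.info"]:
    print(candidate, is_valid_authority(candidate))
```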
If there are additional rules in RFC 3986 that could be broken then those violations should be illustrated too.
Typo: third example in 2.1 where there is a space between .org/ and the following ?
- Uh -- that should be a tilde character, not a space. If there's a user agent that doesn't display that, there's a bug somewhere. In case it's ivoatex: what user agent is that? (let's use email to work that out). Thanks a lot for reviewing the document! -- MD
--
TomMcGlynn - 2015-09-14
Comments from TCG member during the TCG Review Period: 2015-10-12 - 2015-11-20
WG chairs or vice chairs must read the Document, provide comments if any and formally indicate if they approve or not the Standard.
IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.
TCG Chair & Vice Chair ( Matthew Graham, Pat Dowler )
Applications Working Group ( Pierre Fernique, Tom Donaldson )
We approve this document.
--
PierreFernique - 2015-10-31
Data Access Layer Working Group ( François Bonnarel, Marco Molinaro )
It is a very good standard, exactly what's needed for DAL standards: Standard ids and Dataset ids.
Minimal request on architecture diagram change (non blocking request, anyway).
We approve this document.
--
FrancoisBonnarel / MarcoMolinaro - 2015-11-18
Data Model Working Group ( Mark Cresitello-Dittmar, Laurent Michel )
Very minor comments..
- Section 2.1:
- The following: ..."URI reserved characters (essentially, anything except alphanumeric characters, dashes, dots, underscores, and tildes is forbidden)"... is confusing to me. It might be clear for a native, but I don't see to what "anything except" and "forbidden" refer?
- Section 2.3.4: Examples for valid query parts
- typo "but outsize of ASCII" -> "but outside of ASCII"
- Section 4.1 Dataset Identifiers
- "DAL standards standards like Obscore" .. cut extra word
We approve this document. --
MarkCresitelloDittmar - 2015-11-16
- Thanks for catching those; I've replaced the double negation with simpler logic ("only [...] underscores, and tildes are allowed"). Fixes went in in volute rev. 3157 -- MD
Grid & Web Services Working Group ( Brian Major, Giuliano Taffoni )
- Typo: Section A.1: "and an IVOID hat only" (hat => that)
- Section A.2 "Changes from 1.3": Should this be "1.2"? I don't see a 1.3 version in ivoa.net/documents.
I approve this document. --
BrianMajor - 2015-10-30
- Uh.... You're (almost) right, it's 1.12 throughout; I've also fixed things in the document history in the preamble, where REC-1.12 was listed as PR-something. Fix went in in rev. 3158. Thanks. -- MD
Registry Working Group ( Markus Demleitner, Theresa Dower )
I take issue with any case-sensitivity in the part of identifiers, especially given the move via the RegTAP interface to lowercase ivoids and many other columns during ingest for simpler search. I otherwise approve and am not going to hold up the document on it.
--
TheresaDower - 2015-11-19
- Trust me, I'm deeply unhappy about the rules for comparison of ivoids, but as I said we can't really back out of case-insensitive registry parts without breaking a lot of things (and scheme and authority are case-insensitive by RFC 3986 anyway), and allowing case changes in local parts in turn would make the thing very hard to deal with for data providers. So, I guess all I can say is that the world sometimes sucks. -- MD
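As an illustration of the comparison behaviour described in this response, the sketch below lowercases the registry part (everything before the first `?` or `#`) and leaves the local part untouched. This is one reading of the response, stated here as an assumption; the function names are hypothetical, and the Identifiers document itself is the authority on the exact comparison rules.

```python
def normalize_ivoid(ivoid):
    """Lowercase the registry part of an IVOID, keep the local part as is.

    Assumption for illustration: the registry part ends at the first
    '?' or '#', and only that part compares case-insensitively.
    """
    cut = len(ivoid)
    for sep in "?#":
        pos = ivoid.find(sep)
        if pos != -1:
            cut = min(cut, pos)
    return ivoid[:cut].lower() + ivoid[cut:]


def ivoids_equal(a, b):
    """Compare two IVOIDs under the assumed rules."""
    return normalize_ivoid(a) == normalize_ivoid(b)


# Registry parts differ only in case: treated as equal.
print(ivoids_equal("ivo://IVOA.net/std/RegTAP#1.0",
                   "ivo://ivoa.net/STD/regtap#1.0"))
# Local parts differ in case: treated as different.
print(ivoids_equal("ivo://org.gavo.dc/~?a/B",
                   "ivo://org.gavo.dc/~?a/b"))
```

A service that lowercases complete IVOIDs on ingest, as mentioned in the comment above, would treat the second pair as equal, which is where the two approaches diverge.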
Semantics Working Group ( Mireille Louys, Alberto Accomazzi )
Just a few remarks, but the document is very clear and I approve the final repository version (2015, Oct 12).
- p. 17: Writing "octothorpe (the # symbol)" would spare people from having to look up the meaning of this word, given that the symbol is used so often in hyperlinks.
- p. 17, "potential interoperability issue as explained in Gray, 2012": Could we summarize in just 3 sentences what the point was? The reference was a Note.
- 4.2, Standard identifiers: Would it not be clearer to say 'Identifiers for IVOA standards', as a special case of what is being defined in this specification?
This RFC page (top) points to examples of real resources on the GAVO portal. This is very helpful to play with real examples. However, I feel uncomfortable seeing IVORN used in some places on this page, while IVOA Identifiers mentions in 1.1 Definitions (page 5): 'we now deprecate the term IVORN'. Any chance to update the vocabulary for this example page?
--
MireilleLouys - 2015-11-25
Education Interest Group ( Massimo Ramella, Sudhanshu Barway )
Time Domain Interest Group ( John Swinbank, Mike Fitzpatrick )
Data Curation & Preservation Interest Group ( Francoise Genova )
Knowledge Discovery in Databases Interest Group ( George Djorgovski )
Theory Interest Group ( Franck Le Petit, Carlos Rodrigo )
Standards and Processes Committee ( Françoise Genova )
Operations ( Tom McGlynn, Mark Taylor )
Thanks for responding to our RFC comments. Ops IG is pleased to recommend acceptance (modulo one or two typos communicated offline).
--
MarkTaylor and TomMcGlynn - 2015-10-28