Registry Interface Proposed Recommendation: Request for Comments

This document will act as RFC center for the Proposed Recommendation entitled "IVOA Registry Interfaces, version 1.01". The specification can be found at http://www.ivoa.net/Documents/cover/RegistryInterface-1.0-20090522.html (HTML, PDF, Word).

The original version presented for RFC was http://www.ivoa.net/Documents/cover/RegistryInterface-20080929.html (PDF, DOC). The above version includes changes resulting from the RFC and TCG review.

Review period: 30 September 2008 to 29 November 2008*
    *extended from original end date.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your WikiName so authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the Resource Registry mailing list, registry@ivoa.net. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document.

Examples of Independent Implementations

Full Registries (implementing both the harvesting interface and the searching interface):

Publishing Registries (implementing just the harvesting interface):

Client Software:

Comments from the TCG during the TCG Review (08 June 2009 - 03 July 2009)

Applications (Tom McGlynn, Mark Taylor)

I approve this document. [TAM]

Data Access Layer (Keith Noddle, Jesus Salgado)

I approve this document which furthers the excellent work of the Registry WG. (Keith Noddle)

Data Model (Mireille Louys, AnitaRichards)

I approve this document.

Grid & Web Services (Matthew Graham, Paul Harrison)

I approve this document (MatthewGraham).

Registry (Ray Plante, Aurelien Stebe)

I approve this document.

Semantics (Sebastien Derriere, Norman Gray)

Approved. (SebastienDerriere)

VOEvent (Rob Seaman, Alasdair Allan)

Approved.

VO Query Language (Pedro Osuna, Yuji Shirasaki)

I approve this document.

VOTable (François Ochsenbein)

I approve the document (FrancoisOchsenbein)

Standard and Processes (Francoise Genova)

Approved

Astro RG (Masatoshi Ohishi)

I approve this document.

Data Curation & Preservation (Bob Hanisch)

Approved.

Theory (Herve Wozniak, Claudio Gheller)

Approved.

TCG (ChristopheArviset, Severin Gaudet)

I approve the document. Appendix A.5 "ADQL for Querying Registries" clarifies very well the decision. Thanks !

A few minor comments

  • as per the new convention, the document name should be PR-RegistryInterfaces-1.0-20090522.html
  • in page 61, "ADIL" should be replaced by "ADQL"
  • apparently Aurelien was forgotten in the authors list

> RayPlante: The author oversight is certainly a historical artifact; I will post a revised version with these corrections.

Comments from the Community during RFC period (ending 29 Nov 2008)

NoelWinstanley 7 Oct 2008

  • The Search operation requires the query to be in ADQL/x format. However, recent versions of the ADQL specification (including the PR) make no mention of ADQL/x, only the ADQL/s form. The reference given by the Registry Interface document for ADQL/x is to a 2004 working draft, which must have been superseded by the ADQL PR (the URL is broken, BTW).
    • Is it sensible to base the main querying interface to registry on a 'dead' language?
    • Or is KeywordSearch the function that's recommended to be actually used in practice?
    • Should the Search function be marked as deprecated / liable to change ?
    • Should the Search function be marked as optional, and support described in the registry, as for XQuerySearch?
    • Should the 'Search' function be altered to accept adql/s instead?

Is it sensible to base the main querying interface to registry on a 'dead' language?
This is a good question that I would like to hear more comments on and which the TCG should specifically take up. While not ideal, our rationale is that, as this specification contains a number of other components that are critical to registry interoperability, we thought it important not to slow the document by trying to upgrade the ADQL part (which has an impact on implementations). It was intended that RI be tied specifically to ADQL v1.0 so as not to be confused with the later ADQL revamp; however, I see that the body of the text is not explicit and the reference is incorrect; only the WSDL is correct. Making the ADQL interface optional like XQuerySearch may be a reasonable alternative.

Additional Note: As suggested and endorsed at the October Interop, we will inline the relevant bits of ADQL v1.01 into the RI spec. This may not get added in until after the RFC; see the ADQL spec until then.

AurelienStebe - 05 Dec 2008

I followed the evolution of the standard until that point, so I agree with everything stated in it. I just gave it a final careful review, just in case.
I found the following typos and leftovers from previous edits, but I have nothing to comment on regarding the content.

  • 2.1.3 : the KeywordSearch schema still has the old "orValues" definition. It should be "minOccurs='1'" and no default value. The explanatory table after it is correct, as is the WSDL document in the PR and on the ivoa.net server.
  • 2.1.3 : the paragraph right before the list of metadata to search writes "For each active or inactive resource records [...]". It should be just "For each active resource [...]" as agreed in the Registry WG and stipulated everywhere else in the document.
  • 2.2.3 : the first paragraph refers to "section 2.7" which does not exist.
  • 2.1.2 : the "xpathName" bullet point refers to "section 2.2.1", it should be "2.1.2.1"
  • 2.1.1 : the VOResources example is misspelled. It opens with "VOResource" and closes with "VOResourced".
  • 4.3 : the RegCapRestriction schema restricts from "vg:Capability" instead of "vr:Capability". The schema in the Appendix is correct, as is the XSD on the ivoa.net server
  • 4.3.2 : the Registry Sample Doc uses "vg:OAIHTTPGet" in the "interface" instead of "vg:OAIHTTP".
  • 4.3 : the paragraph before the RegCapRestriction schema writes "Both extension types extension types extend [...]"
  • 2.1.3 : the first paragraph writes "status='acive'", missing a "T".


Comments by TCG during the RFC period (ending 29 Nov 2008)

Chairs should add their comments under their name.

Data Curation and Preservation (BobHanisch)

I defer to the registry specialists on this one. The document certainly looks complete and well thought out.

The document is correctly labeled as Proposed Recommendation, but the text in the paragraph Status of the Document still says Working Draft.

Under Conformance-Related Definitions, "shall" is defined as part of the restricted vocabulary, but it is not actually used in the document. I suggest removing "shall" from this list. The IETF RFC 2119 equates "shall" with "must" in any case, so we might as well use just the one term.

> RayPlante: Done (shall removed)

AstroRG (MasatoshiOhishi)

I appreciate the hard work done by the Registry WG members and their collaborators, and I trust the outcome produced by these experts.

TCG (ChristopheArviset)

I trust the specialists on these issues so I basically agree with the document, but I have a few comments/questions for clarification.

  • page numbers should be added on each page
  • section 2.1.2, the ADQL schema referenced is 1.0. Is that correct or should it be 2.0 ?
  • appendix 4 : should we also add new prefixes such as VOSpace? As a more general question, how do we add more prefixes in the future when more IVOA standards get approved (e.g. TAP, SLAP)? Does that mean we need to update this document each time? Some clarification of this process would be useful in the document.
  • references : ADQL reference should be updated and STC reference should be added

> RayPlante: ADQL v1.0 explained in appendix as recommended; VOSpace document should recommend prefix; page numbers and STC reference added.

Applications (Tom McGlynn)

I've got lots of comments but few of them are really substantive, they are mostly requests for clarification.

> RayPlante: As noted below, I made a number of changes to either correct mistakes or clarify meaning. A few general comments: I would prefer to introduce the use of footnotes in a future version as I'm afraid that inconsistent use of them now will make things more confusing. I gather the purpose is to make that text less intrusive (than the non-normative Note boxes). I've not moved any text from note boxes to the main body as suggested: while the proper location is debatable, it sounds like it's not a super big deal in any case.

Document says it's a working draft, but it is actually a Proposed Recommendation -- that seems to be a general problem.

> RayPlante: fixed

1.1 I don't believe that it is ever clearly stated that the functional distinction between a publishing registry and a searchable registry is that a publishing registry does not support the search functions of section 2. The statement is made that "it does not need to support...". I think a clearer statement would be something like:

Only searchable registries support the protocols discussed in section 2. All VO compliant registries must support the protocols discussed in section 3.

> RayPlante: I've added a paragraph at the end of 1.1 that summarizes what is meant by an "IVOA-compliant registry"; in short, it must support either searching or harvesting or both.

2.1 I don't see what advantage there is in making the distinction between search and resolve operations.

Note that getResource presumably returns 0 resources if the ID is not known (i.e., it is not "one and only one")

> RayPlante: s2.2.2, para. 2 says that it must return a "NotFound" fault message, not an empty list; thus, it is in effect "one and only one".

2.1.1 It seems to me that the error response should be something that is standardized at a higher level than a specific protocol. Too late for that though I suppose.

It was unclear to me whether the paging of results is done purely at the request of the user or can be initiated by the server if some constraint is exceeded (as in the OAI interface). Could this be clarified?

> RayPlante: Done with sentence added to the last paragraph on p. 9.

2.1.2 The phrase "hitherto referred to using the "adql:" prefix" doesn't make sense, since as far as I can tell the prefix was never used up to that point. Do you mean "from now on..."? That is the opposite of "hitherto".

> RayPlante: yes. changed accordingly.

I don't like the idea of a required attribute being set to a null value with Table. Personally I'd prefer some string, e.g., "Registry".

I have no idea what the adql:Column elements look like and I think you should give an example here, or replicate the appropriate item in the schema if it's short enough.

> RayPlante: I have added an example. This comment also reveals another needed correction: the element is actually adql:Arg

Why must there be an implicit addition of the status requirement? This is redundant with a statement in 2.1.1 I think. [Personally I think this is not necessarily a good idea. What if I want to know how many records in a registry have been deleted? There is no way to do this using the search interface even though it is a trivial query to formulate. Maybe some kind of flag allowing old records should be available.]

> RayPlante: perhaps redundant, but consistent
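
To make the concern about the implicit status requirement concrete, here is a minimal sketch, not taken from the specification. It assumes records are plain dictionaries with a "status" field; the `search` and `count_deleted` functions and the example identifiers are hypothetical. It only illustrates why a query for deleted records comes back empty once the active-status condition is always added.

```python
# Hypothetical records; only the "status" field and its values
# (active, inactive, deleted) come from the discussion above.
RECORDS = [
    {"id": "ivo://example/a", "status": "active"},
    {"id": "ivo://example/b", "status": "deleted"},
    {"id": "ivo://example/c", "status": "inactive"},
]


def search(records, constraint):
    """Apply the caller's constraint plus the implicit status='active' filter."""
    return [r for r in records if r["status"] == "active" and constraint(r)]


def count_deleted(records):
    """The reviewer's question: how many records have been deleted?"""
    # The implicit filter and the explicit constraint can never both hold,
    # so this count is always zero regardless of the registry's content.
    return len(search(records, lambda r: r["status"] == "deleted"))
```

Under these assumptions, answering the question would need something like the flag suggested in the comment, which the sketch deliberately leaves out.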

2.1.2.1 is very confusing to me. Move the examples up and put them in the text rather than a box even if they are non-normative.

The first note in 2.1.2.1 is mystifying to me. I'd make it a footnote.

In the note about harvesting... I'm not sure what this means? Am I forbidden to harvest using these methods? I don't think so -- I certainly hope not. We don't tell users what they cannot do. We tell them what they can. I think this note is meaningless. We can recommend use of the OAI protocols for harvesting but that's as far as we should go.

The last note in this section -- which seems to be mostly notes! -- is also somewhat mysterious. I think the issues raised in the first and last notes should be discussed more clearly in a non-normative appendix.

2.1.3 I'm concerned about white space. If the registry contains a field which has two consecutive blanks then I gather that I cannot match against it since consecutive white space on the input is to be compressed to a single white space.

The document suggests that all white-space characters in the search field match all white-space characters in the registry text. E.g., space in the input matches a carriage return in the text.

How about just saying something like: The keyword value is a string where leading and trailing white space is discarded. The string is broken into keyword tokens using white space delimiters except when the white space is enclosed in quotes (e.g. ...). The characters ... are considered white space. A token that begins and ends with "'s has them removed.

and leave off any mention of compression.

By the by, is there an escape mechanism for quotes? Currently I don't think I can look for the string " can" versus "can" or for "can not" versus "can not". I don't see why these are explicitly precluded.

> RayPlante: Your recommendation is consistent with the spirit of allowing the implementer to determine the details of how the keyword search is conducted. While we can't require that the implementation honor quoted white spaces, we can allow them to. The user may need to experiment with spaces to get the desired behavior or just use the constraint-based search. I have removed the explicit requirement for space normalization.
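
The tokenization rule proposed in the 2.1.3 comment above could be sketched roughly as follows. This is one possible reading, not what the specification requires (the response leaves the details to the implementer). The function name `tokenize_keywords` is hypothetical, the set of white-space characters is elided in the comment so Python's default `\s` class is assumed, and quotes are only recognized when they open a token.

```python
import re

# Assumption: a token is either a double-quoted run (which may contain
# white space) or a run of non-white-space characters.
_TOKEN = re.compile(r'"[^"]*"|\S+')


def tokenize_keywords(value):
    """Split a keyword string into tokens, preserving quoted white space."""
    tokens = []
    # Leading and trailing white space is discarded before splitting.
    for match in _TOKEN.finditer(value.strip()):
        token = match.group(0)
        # A token that begins and ends with a double quote has them removed.
        if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
            token = token[1:-1]
        tokens.append(token)
    return tokens
```

Nothing is compressed inside quotes in this sketch, so a quoted phrase containing two consecutive blanks stays distinct from the same phrase with one. It has no escape mechanism for a literal quote character, which is the gap the comment goes on to point out.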

I'd put the note in the main text.

2.2.1

Personally I'd just reuse the output format from the search.

2.2.2

If "Not found" is thought to be an error, it should give an error response. If it isn't an error, then don't say an "error message" should be added. The use of the "fault" seems to be an effort to sidestep the word error but makes everything less clear to me.

Personally I'd suggest treating NOT FOUND as a perfectly natural response to GetResource but I think the corresponding action for getIdentity is an error.

> RayPlante: A fault is like an exception: it need not reflect a failure, just a condition that is inconsistent with the semantics of the call--i.e. there is the assumption that the identifier does exist (e.g. because it was returned in a search with identifiersOnly=true). The chosen semantics provides a very clear response when this assumption is not correct.
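
The "fault is like an exception" semantics described in the response can be illustrated with a small sketch. The names `NotFoundFault`, `get_resource`, and the in-memory record store are hypothetical stand-ins, not the specification's SOAP interface. The point is only that the call either yields exactly one record or signals the condition explicitly, and never returns an empty result.

```python
class NotFoundFault(Exception):
    """Signals that the identifier has no matching resource record."""


# Hypothetical in-memory stand-in for a registry's record store.
_RECORDS = {"ivo://example/registry": "record for ivo://example/registry"}


def get_resource(identifier, records=_RECORDS):
    """Return exactly one record, or raise NotFoundFault; never an empty list."""
    try:
        return records[identifier]
    except KeyError:
        # The caller assumed the identifier exists; report that it does not.
        raise NotFoundFault(identifier) from None
```

A caller that obtained the identifier from an earlier search can treat the fault as an unexpected condition, while a caller probing for existence can catch it and carry on.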

2.2.3 I suspect "hitherto" is again being misused -- though "vg:" does appear once previously.

3.1.1 Is the last item in the list of OAI-PMH useful features "deleted records not returned" a requirement of OAI? I'm not sure if the HEASARC is compliant if so.

> RayPlante: it is an OAI requirement

3.1.2 What is the reason for the "strongly recommended"? It seems to me that this relates to one of the notes in 2.1. I don't think there is any status called strongly recommended. There is MUST and SHOULD. Require it if it's really necessary.

> RayPlante: Some XML tools make it difficult to choose the prefixes used; thus, we stopped short of a requirement. While consistent use is not required for interoperability, it is a big convenience for users.

3.1.3 Make the note a footnote. It's irrelevant to use within the VO.

3.2 I believe the discussion of the validation level belongs in the text not in notes. Perhaps a whole section to itself!

Theory (Herve Wozniak)

No constructive comments. I have appreciated the pedagogical efforts to make this (rather technical) document legible.

GWS (Matthew Graham)

I approve this document.

Data Models (Mireille Louys)

Data Access Layer (Keith Noddle)

VOTable (Francois Ochsenbein)

I approve the document (FrancoisOchsenbein)

VOQL (Pedro Osuna)

Resource Registries (Ray Plante)

I approve this document.

Semantics (Sebastien Derriere)


Topic revision: r38 - 2009-09-22 - FrancoisOchsenbein