RegTAP 1.1 Proposed Recommendation: Request for Comments

Summary

Registries provide a mechanism with which VO applications can discover and select resources - first and foremost data and services - that are relevant for a particular scientific problem. The RegTAP specification defines an interface for searching this resource metadata based on the IVOA's TAP protocol. It specifies a set of tables that comprise a useful subset of the information contained in the registry records, as well as the tables' data content in terms of the XML VOResource data model. The general design of the system is geared towards allowing easy authoring of queries.

RegTAP 1.1, as a minor version increment over RegTAP 1.0, is backward compatible. The main differences between 1.1 and 1.0 are as follows:

  • new allowed res_detail values for testQueryStrings
  • more generic type information in schema tables
  • mapped terms in fields for dates and resource relationships to a vocabulary for DataCite compatibility
  • case-insensitive query support from ADQL 2.1
  • new columns for service mirrors, rights, and authentication and authorization of protected data and services
  • new table for alternate identifiers, supporting DOIs, ORCIDs, bibcodes, and future identification schemes
The latest version of RegTAP 1.1 can be found at: A slightly updated, unofficial version including the changes made after RFC comments from Ops, GWS, and Semantics is available from http://docs.g-vo.org/RegTAP.pdf.

Reference Interoperable Implementations

Two separate reference implementations of the server side exist, at GAVO (and other archives using GAVO's codebase) and at STScI.

The TOPCAT client is interoperable with both reference implementations, though it does not use any of the new 1.1 features yet. Likewise, Aladin's registry bundle is being generated from a RegTAP 1.1 query.

Implementation Validators

The RegTAP validator (currently at http://docs.g-vo.org/regtap-val; should this move into the VCS for the standard?) has been updated to cover the main new features.

For reviewers, here's a set of RegTAP queries exercising the main user-visible new features (TAP access URLs above):

alt_identifiers – find VO resources and their titles that have DOIs:

select ivoid, res_title, alt_identifier
from rr.resource
natural join rr.alt_identifier
where alt_identifier like 'doi:%'

rights_uri – find VO resources that have a CC license declared:

select ivoid, res_title, rights_uri
from rr.resource
where rights_uri like 'http://creativecommons.org/%'

or find what license URIs are already in use:

select distinct rights_uri from rr.resource

mirror_url – find mirrors available for a known access url (in this case, indicating that the service is available through https, too):

select ivoid, mirrors.mirror_url
from rr.interface as intfs
join rr.interface as mirrors
using (ivoid,intf_index, cap_index)
where intfs.access_url='http://dc.zah.uni-heidelberg.de/antares/q/cone/form'

authenticated_only – find resources unavailable without authentication (note that we do not claim that's enough to actually operate them; the use case at this point is filtering them out with a view to a VO that has more of them):

select distinct ivoid from rr.interface where authenticated_only=1

vocabulary mapping – use just a single term to find the data collections served by a service:

select res_title
from rr.resource as res
natural join rr.relationship as rel
where relationship_type='isservedby'
  and rel.related_id='ivo://nasa.heasarc/services/xamin'
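
For reviewers who would rather script these checks than paste them into a client, here is a minimal Python sketch that wraps one of the queries above in a TAP synchronous request. The base URL `http://example.org/tap` and the helper names `build_sync_request` and `run_sync_query` are placeholders invented for this illustration; substitute the TAP access URL of the RegTAP service under test. Only the request parameters (REQUEST, LANG, QUERY, MAXREC) and the `/sync` endpoint come from the TAP protocol.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical base URL; replace with the TAP access URL of the service under test.
TAP_BASE_URL = "http://example.org/tap"

# The alt_identifier query from this page, unchanged.
DOI_QUERY = """
select ivoid, res_title, alt_identifier
from rr.resource
natural join rr.alt_identifier
where alt_identifier like 'doi:%'
"""


def build_sync_request(base_url, adql, maxrec=None):
    """Return a POST request for the TAP sync endpoint (nothing is sent yet)."""
    params = {"REQUEST": "doQuery", "LANG": "ADQL", "QUERY": adql.strip()}
    if maxrec is not None:
        params["MAXREC"] = str(maxrec)
    return Request(
        base_url.rstrip("/") + "/sync",
        data=urlencode(params).encode("ascii"),
        method="POST",
    )


def run_sync_query(base_url, adql, maxrec=None, timeout=30):
    """Send the query and return the raw response body as bytes."""
    request = build_sync_request(base_url, adql, maxrec)
    with urlopen(request, timeout=timeout) as response:
        return response.read()


# Building the request does not touch the network, so it can be inspected first.
request = build_sync_request(TAP_BASE_URL, DOI_QUERY, maxrec=10)
print(request.full_url)
```

Calling `run_sync_query(TAP_BASE_URL, DOI_QUERY, maxrec=10)` against a real service would return the response body for parsing; the sketch stops short of that so it can be checked offline.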



Comments from the IVOA Community during RFC/TCG review period: 2019-06-15 - 2019-07-31

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document.



Comments from TCG members during the RFC/TCG Review Period: TCG_start_date - TCG_end_date

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.

TCG Chair & Vice Chair

Applications Working Group

(Review based on the unofficial updated version at http://docs.g-vo.org/RegTAP.pdf)

The changes since v1.0 seem reasonable and well-described, and the document overall seems to be in good shape.

One comment/suggestion:

OAI-PMH is mentioned several times, but I didn’t see a reference to official documentation on that protocol. Should we include such a reference, perhaps being specific about the protocol version if that is important?

-- TomDonaldson - 2019-08-15

Data Access Layer Working Group

Data Model Working Group

Grid & Web Services Working Group

Some minor comments

  • On pages 4 and 5, you refer to "amount" of records or rows. I would avoid this kind of sentence in a standard doc.
    • Hm – not quite sure what you're referring to here, as the text doesn't say "amount" anywhere (at least back to rev. 5428). Or are you objecting to "At the time of writing, there are roughly 20000..."? There, I'd say giving an idea of the order of magnitude this was written for might be a nice service. This and the following replies: -- MarkusDemleitner - 2019-08-13
  • Figure 1. The Caption refers to some tagging that is not clear to me. I would say that this picture is the architectural diagram for this specific standard.
    • Ah, yes – the caption referred to the old arch diagram. Good catch. I'm now writing “IVOA Architecture diagram with the IVOA Registry Relational Specification (shown as ``RegTAP'') and the related standards.”
  • Page 4. About the SOAP implementation, the sentence is quite generic; maybe you can drop it.
    • Are you referring to “Built on SOAP and an early draft of...”? If so, I'd rather keep it, since it explains why we abandoned RegTAP's (incompatible) predecessor, which might otherwise seem just a whim.
  • page 5 terminology: relational registry == rr. It is clear but better to specify the prefix.
    • Now mentioning the schema name (rr) in the first paragraph of the section.
  • As you discuss SecurityMethod with some details, I would include a reference to the "standard."
    • Now saying “usually taken from the SSO document \citet{2017ivoa.spec.0524T}”; the trouble is that it's still not clear where we'll say what the content of SecurityMethod actually is, so the reference is only half pertinent.
  • Page 9, first line: the formatting splits "xs:token" across two lines, which could be misleading (xs:token becomes xs:to-ken). This is a "nano" comment, but it would be better to keep it on one line.
    • Frankly, I'm happy that the beast hyphenates this (rather than letting it stick out). And with token, I'd consider the risk of misunderstandings minimal.
  • Page 12: some prefixes refer to two versions of the same schema (e.g., ssap and vs); maybe it is not necessary to cite both.
    • Well, the situation that a single canonical prefix corresponds to two different namespace URIs is ugly enough to be explicit about it. Incidentally, we won't have any more of that thanks to the XML versioning policies. Let's hope the old namespace URIs with shared prefixes will die out soon (so we can finally forget about this ugliness).
  • I agree with Mark that it could be useful to summarize the Appendix.
    • It's there now. All changes mentioned here went in in Volute rev. 5571.
-- GiulianoTaffoni - 2019-08-12

Registry Working Group

Semantics Working Group

Remarks from Carlo Maria Zwölf

  • Page 4
    • "Even if it were, data discovery would at least be fairly time consuming if
      each client had to query dozens or, potentially, hundreds of publishing
      registries". --> If a client performs queries in parallel, the number of
      registries does not matter. In some cases a central service gathering all the
      information could be a bottleneck of the IVOA infrastructure. A discussion
      of these aspects would be welcome. It is also important to mention synchronization
      issues between the publishing registries and the RegistryOfRegistries.
    • "this first attempt which was quickly" --> check if 'which' should be removed
    • "The simplification yields 14 tables" --> I would suggest to replace by "The simplification
      yields to a schema composed of 14 tables"
    • TAP_SCHEMA is not defined. What does this stand for? Why is it in red? What is the implied
      color convention?
  • Page 5
    • "The largest table, table_column, has about a million rows at the time of
      writing". --> A standard document is neutral about the practical and specific
      implementation. To highlight this neutrality I would suggest writing
      something like 'If we use this standard for describing all the resources available in the publishing
      registries, then the table table_column would contain about a million rows'
    • "table_column" --> why in green? What is the color convention?
  • Page 6
    • Caption of figure 1 : "Relational Registry" tagging does not appear in the
      figure, as it is said in the caption. Please highlight the current described
      standard in red on the figure.
    • "...table using xpaths into the registry documents. This document should not
      in general" --> Does 'this document' refer to the registry document or the
      current document? It is ambiguous.
  • Page 15
    • "In both TAP_SCHEMA and the VODataService tableset, this schema MUST be
      associated with a utype matching the data model identifier given in sect. 7"
      --> It would be clearer to write "In both TAP_SCHEMA and the VODataService
      tableset, the rr schema MUST be
      associated with a utype matching the data model identifier given in sect. 7"
    • "On the values in the utype columns within TAP_SCHEMA except for the schema
      utype, see section 6." --> Please check this sentence. It is not clear to me.
In general --> Please define color convention and differences between red & green parts of text.

Summary response

  • As to typos and style proposals not mentioned below: Should all be fixed in Volute rev. 5574 – thanks for the careful reading.
  • I've added a few words on the typographic conventions at the end of sect. 1.1. They were, indeed, not obvious. I hope the logic is a bit clearer now.
  • As to a discussion of the general registry design, I think this document isn't quite the right place to do that; if a deeper analysis of these fundamental questions is desired, I'd say it should go to Registry Interfaces.
  • As to writing "If we use this standard..." -- given that this is an update and RegTAP has been in fairly active use for quite some time now, I'd say that'd be a bit too self-deprecating. I'd be ok with "The largest table in existing RegTAP 1.0 services at the time of writing" or something like that if that's actually preferred. On the question of whether to have size hints at all here see the response to the GWS comments.
-- MarkusDemleitner - 2019-08-14

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Operations

  • As of 2019-06-13, the "latest" version pointed at from this RFC page is http://ivoa.net/documents/RegTAP/20190503/index.html, but the current latest version is actually http://www.ivoa.net/documents/RegTAP/20190529/index.html. It's 20190529 that I'm commenting on here.
  • Fig 1: the caption says '(tagged with "Relational Registry")' . I can't work out what this note means - should it say something like '(tagged as "RegTAP")' ?
  • Appendix D: In this version (20190529) all the changes since REC-1.0 have been condensed into a single section (D.1) to avoid extra content that only serves the historical interest of VO archaeologists; I think this is good practice. By the same token, does it make sense to delete all the sections (D.2-D.9 in 20190529) that detail how it got to REC-1.0? As an issue affecting preparation of all REC-track documents, this is probably something which also ought to be discussed at TCG or SDP level.
    • I'm not really leaning one way or another, but I'm not sure what condensing the history from... well, nothing, really, to 1.0 would entail. I could condense the history from the first draft to 1.0, but is that really something that's useful to people? I'd tend to leave things as they are for now and see what TCG and/or SDP have to say for the next version. -- MarkusDemleitner - 2019-08-14
  • Appendix D.1: One change in the text which I don't see noted here is changes to the standard_id constraints in the example queries from section 10, e.g. 10.3 in REC-1.0 has WHERE standard_id='ivo://ivoa.net/std/sia' , and in PR-1.1-20190529 has WHERE standard_id LIKE 'ivo://ivoa.net/std/sia%' . I appreciate this may be a result of changes elsewhere in the standards landscape rather than in RegTAP itself, but it would be useful for people making use of these example queries if the change log summarised what's changed in recommended practice since REC-RegTAP-1.0. (There's also a correction to the argument order in the ivo_hashlist_has invocation in section 10.3 which should probably be logged for completeness).
  • This RFC page says "The TOPCAT client is aware of RegTAP 1.1 features and is interoperable with both reference implementations" . Yes it is interoperable, because of the 1.1->1.0 backward compatibility. However, can you refresh my memory about what 1.1 features it's aware of? I'm not denying it! Just can't remember what I may have claimed.
    • Ouch. I think I was thinking about the authentication experiments when I wrote that, but these have disappeared again. So, I guess we'd need to tone the language down a bit here. If you'd ask me what I think TOPCAT should take up, it would be mirror_url to provide failover for registries and GloTS. And perhaps other things, as that would perhaps make it attractive for VizieR to put in their mirror URLs. -- MarkusDemleitner - 2019-08-14
  • Plus a few typos:
    • Sec 4.1: "whereever" -> "wherever"
    • Sec 6: "specifiation" -> "specification"
    • Sec 8.1, vr:organisation: "is to be references by IVOID" -> "is to be referenced by IVOID" ?
    • Sec 8.10: "slight denormalization of the vr:Relationship type: Whereas..." -> "slight denormalization of the vr:Relationship type: whereas..." ?
      • Typos fixed in Volute rev. 5575.
-- MarkTaylor - 2019-06-13

Mark, I updated the link. Thanks and apologies.

As for the TOPCAT support, I'm unaware of specific 1.1 features and will leave that with other comments for Markus to respond to as author; possibly the intention was noting the schema additions didn't break anything, with altIdentifier queries working and no broken baked-in example queries? Those are the main points that came to my mind when testing my reference implementation with TOPCAT. -- TheresaDower - 2019-07-24

Thanks for updates - all looks OK now. I will think about mirror_url implementation. -- MarkTaylor - 2019-08-14
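
As an illustration of the standard_id point raised in the Appendix D.1 comment above, the following Python sketch uses an in-memory SQLite database as a stand-in for a RegTAP service. It shows how the equality constraint from REC-1.0 and the LIKE prefix constraint from PR-1.1 select different rows once a standard identifier carries a suffix. The three rows, the example IVOIDs, and the suffixed identifier `ivo://ivoa.net/std/sia#query-2.0` are assumptions made for this toy data set, and the helper `matching_ivoids` is hypothetical. Note also that SQLite's LIKE is case-insensitive for ASCII by default, which is not what ADQL specifies; the lowercased values used here are unaffected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database named "rr" so that the schema-qualified
# table name from the RegTAP queries can be used unchanged.
conn.execute("ATTACH DATABASE ':memory:' AS rr")
conn.execute(
    "CREATE TABLE rr.capability (ivoid TEXT, cap_index INTEGER, standard_id TEXT)"
)
conn.executemany(
    "INSERT INTO rr.capability VALUES (?, ?, ?)",
    [
        # Made-up resources; the suffixed standard_id in the second row is an assumption.
        ("ivo://example/old-sia", 1, "ivo://ivoa.net/std/sia"),
        ("ivo://example/new-sia", 1, "ivo://ivoa.net/std/sia#query-2.0"),
        ("ivo://example/spectra", 1, "ivo://ivoa.net/std/ssa"),
    ],
)


def matching_ivoids(where_clause):
    """Return the ivoids of capabilities matching the given constraint, sorted."""
    sql = "SELECT ivoid FROM rr.capability WHERE " + where_clause + " ORDER BY ivoid"
    return [row[0] for row in conn.execute(sql)]


# Constraint as quoted from REC-1.0, section 10.3.
exact_matches = matching_ivoids("standard_id='ivo://ivoa.net/std/sia'")
# Constraint as quoted from PR-1.1-20190529.
prefix_matches = matching_ivoids("standard_id LIKE 'ivo://ivoa.net/std/sia%'")

print(exact_matches)
print(prefix_matches)
```

In this toy data set the equality constraint finds only the resource with the bare identifier, while the prefix constraint also finds the one with the suffixed identifier; neither matches the ssa row.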

Standards and Processes Committee


TCG Vote : Vote_start_date - Vote_end_date

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group      Yes  No  Abstain  Comments
TCG
Apps
DAL
DM
GWS        *
Registry
Semantics
DCP
KDIG
SSIG
Theory
TD
Ops        *
StdProc
Topic revision: r14 - 2019-08-15 - TomDonaldson
 