Request for Comment: RegTAP v1.0

This document serves as the RFC center for the Proposed Recommendation entitled "Registry Relational Schema, Version 1.0". The version reviewed during the RFC can be found at http://www.ivoa.net/documents/RegTAP/20140227/index.html.

RFC Review period: March 14, 2014 - April 15, 2014
TCG Review period: July 28, 2014 - August 28, 2014
Exec Approved for REC:

To add a comment on the document, please edit this page and add your comment to the list below in the format used for the example (include your WikiName so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the Resource Registry mailing list, registry@ivoa.net. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document.

Notes on Implementations and Validators

A provisional validation suite is available at http://docs.g-vo.org/regtapval-2014-06.tar.gz

An implementation of this is available on the TAP services at http://dc.g-vo.org/tap, http://voparis-cdpp.obspm.fr/tap, and http://gavo.aip.de/tap (the basic underlying software is identical for those); there is reg.g-vo.org that does DNS-based failover between these.

Another implementation (out of date with respect to XPath/utypes and some DB fields) is available as a TAP service at http://ia2-vo.oats.inaf.it:8080/registry. Although it is not yet fully compliant, the IA2 team (VObs.it) plans to update it to the REC version and fix some other compliance problems, so that it can serve at least as an independent mirror of the GAVO service. This implementation uses both a programming language (Java) and an RDBMS (MySQL) different from the GAVO one.

Implementations are also being worked on at ESAC and STScI; these were reported on in the Registry sessions of the Heidelberg, Waikoloa, and Madrid interops and should become public fairly soon. In particular, the ESAC implementation has been internally available at ESAC since at least 2014Q1 and has been (a) validated against the validation suite mentioned above and (b) tested and accessed using the TOPCAT client.

On the client side, newer versions of TOPCAT already query against RegTAP. Also WIRR uses RegTAP as a backend.

Comments from the community

Comments from MarkTaylor

Great job Markus et al. A few minor comments:

  • Sec 1.1: 'The entire standard is now known as "IVOA relational registry schema"' - the title of the document has "IVOA Registry Relational Schema" (words in a different order) - is this an inconsistency?
  • Figure 1 caption: "(tagged with RegTAP)" - the box in the figure is labelled "Relational Registry" not "RegTAP", so I think this is wrong.
  • Sec 4: The prefix->namespace table does not appear to be in any discernable order - order it alphabetically either by prefix or by URI?
  • Sec 7.1: The description of short_name contains the text "Applications may use to refer to this resource in a compact display." - words missing?
  • Sec 7.1: It would be useful for reference if the information about field formatting presented in the text of this section could be repeated in the descriptions column of the table. Specifically:
    • content_level, content_type, rights: note values are hash-separated (this is already done for waveband)
    • creator_seq: note names are semicolon-separated
  • Sec 7.7: "The table_column table models the content of VOResource's column element" - I might be wrong, but I think that should read "VODataService's column element".
  • Sec 7.7: "hence, this column will contain one of NULL, vs:TAPType, vs:SimpleDataType and vs:VOTableType" - since these will be lowercased by the time they appear in the RegTAP table, would it be less confusing to mention them in lower case in this text?
  • Sec 7.8: "analoguos" -> "analogous"; "adviced" -> "advised".
  • Sec 7.9: In the param_description description, "column's contents" should read "parameter's contents".
  • Sec 7.10: "resoure" - "resource".
  • Sec 8: The first parameter of the ivo_hashlist_has UDF is declared with the name "haslist"; I think that should read "hashlist".
  • Appendix A: "suffient" -> "sufficient".
  • Appendix A: This is a long list, and the possibility of typos etc is not negligible. Would it be a good idea to add at the start that in the case of a discrepancy between the xpaths in this list and the XML schemas defined by the relevant standards, one or other of those should be taken as normative?
  • Appendix A: I see a /capability/testQuery/pos/lat for SIA, but no corresponding /capability/testQuery/pos/long (SSAP has both).
-- MarkTaylor - 2014-03-27

Thanks for the useful comments. I hope they're all considered in volute rev. 2497. On the question of precedence of xpaths, I'm now declaring XML schema xpaths as normative and promise Errata in case contradictions go unnoticed. -- MarkusDemleitner - 2014-04-01
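
For readers unfamiliar with the hash-separated fields and the ivo_hashlist_has UDF mentioned in the Sec 7.1 and Sec 8 comments above, the following is a minimal Python sketch of the matching behaviour. It assumes that the UDF compares whole words case-insensitively and returns 1 or 0; the function name hashlist_has and the sample values are illustrative only and are not taken from the specification.

```python
def hashlist_has(hashlist, item):
    """Return 1 if item is one of the '#'-separated words in hashlist.

    Assumption: whole words are compared case-insensitively, which is
    how the ivo_hashlist_has UDF is understood to behave.
    """
    if hashlist is None:
        # A NULL column value contains no words at all.
        return 0
    words = [word.lower() for word in hashlist.split("#")]
    return 1 if item.lower() in words else 0


# Illustrative waveband-style value; not taken from a real registry record.
waveband = "optical#infrared"
optical_match = hashlist_has(waveband, "Optical")
radio_match = hashlist_has(waveband, "radio")
```

The point of splitting on the hash sign rather than doing a substring search is that "opt" must not match "optical".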

Late comments

Some technically late comments:

  • section 4 seems to be missing the slap prefix and namespace from SimpleDALRegExt.
  • while I appreciate (like even) the fact that all the reasonable queries can be performed by natural join, I really don't like multi-column primary and foreign keys. This is especially painful when the extra column to provide the necessary uniqueness is only a smallint that has to be carefully generated while inserting content. Have you considered using the concept of surrogate primary keys? Or at least allowing the implementer to choose the form and datatype of PKs (and hence FKs) so they could implement in that way? For a pure implementation of surrogate PK the ivoid columns would have to be renamed and the implementer would simply be required to provide keys suitable for a natural join. However, just allowing the implementer to choose the datatype for that uniqueness column would enable a variety of simpler techniques to populate it and make the natural joins work correctly. Renaming the ivoid column so it was not used in the natural join would not strictly be needed for correctness, just index efficiency (probably not a concern), so that could be kept as-is... is a known/predictable datatype actually needed?
  • I now always use surrogate primary keys and recently moved to using 128-bit UUIDs, for which there are RDBMS functions (for default) and plenty of libraries with well-known algorithms... sure, 128-bit is complete overkill in the registry, but I'm not sure that smallint is actually enough for simple implementations like an RDBMS sequence or identity column (which I have come to despise for other reasons). With all those long varchar columns, the size of the uniqueness column isn't worth skimping on.
-- PatrickDowler - 2014-05-18

We did consider surrogate keys in the very earliest relational registry data models (see RelationalRegistryDM). However, it was decided that ease of writing queries was the primary goal of the design, and therefore it was useful to have the IVORN in as many tables as possible, so that fewer joins would be necessary for certain queries. The smallint part of the composite key is generally easy to generate on ingestion: it is the index, within its parent, of the main XML element modelled by that table.

-- PaulHarrison - 2014-05-23
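
The ingestion step described above can be sketched as follows. This is a minimal Python illustration, not code from any of the implementations: the record is a hypothetical, un-namespaced simplification of a VOResource document, and the function name capability_rows is made up for the example. The second element of each tuple plays the role of the smallint part of the composite key.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified resource record; real VOResource
# records are namespaced and carry far more metadata.
RECORD = """
<resource>
  <identifier>ivo://example/svc</identifier>
  <capability standardID="ivo://ivoa.net/std/TAP"/>
  <capability standardID="ivo://ivoa.net/std/VOSI#availability"/>
</resource>
"""


def capability_rows(record_xml):
    """Return (ivoid, cap_index, standard_id) tuples for one record.

    cap_index is simply the 1-based position of each capability
    element among the capability children of the resource.
    """
    root = ET.fromstring(record_xml)
    ivoid = root.findtext("identifier").strip()
    return [
        (ivoid, index, cap.get("standardID"))
        for index, cap in enumerate(root.findall("capability"), start=1)
    ]


rows = capability_rows(RECORD)
```

Because the index is derived from document order, no database sequence or other shared state is needed while inserting.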

The slap prefix went in in Volute rev. 2645 -- thanks for catching this.

In Volute rev. 2646, I dropped requirements on the artificial foreign keys. Pat (and others interested in this) -- please review http://volute.googlecode.com/svn/trunk/projects/registry/regtap/RegTAP-fmt.html#primarykeys and the diff (this necessitated changes in several places). Does this fix things as far as you are concerned?

-- MarkusDemleitner - 2014-06-03

Yes, making the type for the _index columns implementation specific is sufficient. Thanks.

-- PatrickDowler - 2014-07-24
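
As an illustration of why an implementation-specific type for the _index columns is enough, here is a small sketch using Python's sqlite3 module. The two tables are cut-down stand-ins for the capability and interface tables, the opaque string keys and URLs are invented for the example, and SQLite is used only because it is convenient; none of this reflects how the services listed above are built.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-ins for two RegTAP tables; cap_index is an
    -- opaque text key here rather than a smallint.
    CREATE TABLE capability (
        ivoid TEXT, cap_index TEXT, standard_id TEXT,
        PRIMARY KEY (ivoid, cap_index));
    CREATE TABLE interface (
        ivoid TEXT, cap_index TEXT, intf_index TEXT, access_url TEXT,
        PRIMARY KEY (ivoid, cap_index, intf_index));
    INSERT INTO capability VALUES
        ('ivo://example/svc', 'c-9f2a', 'ivo://ivoa.net/std/tap'),
        ('ivo://example/svc', 'c-03bd', 'ivo://ivoa.net/std/sia');
    INSERT INTO interface VALUES
        ('ivo://example/svc', 'c-9f2a', 'i-01', 'http://example.org/tap'),
        ('ivo://example/svc', 'c-03bd', 'i-01', 'http://example.org/sia');
""")

# The natural join matches on the shared columns (ivoid, cap_index),
# whatever type the implementer picked for cap_index.
tap_urls = conn.execute("""
    SELECT ivoid, access_url
    FROM capability NATURAL JOIN interface
    WHERE standard_id = 'ivo://ivoa.net/std/tap'
""").fetchall()
```

The query text is the same as it would be with smallint indices; only the way the key values are generated differs.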

Comments from TCG members during the TCG Review Period:

WG chairs or vice chairs must read the Document, provide comments if any and formally indicate if they approve or not the Standard.

IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.

TCG Chair & Vice Chair ( Séverin Gaudet, Matthew Graham )

-- By this time, is there a second compliant implementation? If so, approved - SeverinGaudet - 2014-10-02

Applications Working Group ( Pierre Fernique, Tom Donaldson )

Good document. Easy to read, easy to understand.
One minor point and a small incoherence - related to this version http://www.ivoa.net/documents/RegTAP/20140627/PR-RegTAP-1.0-20140627.html#changes-20140227

1) In 1.2, the document cites the RegTAP-STC IVOA standard but without providing a reference, and I did not find it on the IVOA site. Is it a future note? A future draft? At this stage, I suggest simply removing this citation of a possible (but hypothetical) future standard extension.
2) There is an incoherence in 1.2, "ADQL, v2.0 (...) we give three functions ..." and in section 9, four user functions are defined.

The standard RegTAP is approved by the chair and the vice-chair of the Application WG.

-- PierreFernique - 2014-09-19

Removed RegTAP-STC reference, fixed the three vs. four. Thanks for pointing this out.

-- MarkusDemleitner - 2014-10-08

Data Access Layer Working Group ( François Bonnarel, Marco Molinaro )

I apologize for so late a comment, which, if I understand correctly, should have come sooner according to our rules. But IF it's not too much work, I would recommend replacing the xpath namespace with the original VOResource or VODataService XML schema namespace, which is used as the underlying model. This is consistent with what we have in DAL for ssa/obscore/siav2. xpath tells us how it is written but not what the underlying model is. By the way, a diagram such as the one in RelationalRegistryDM would be very useful for understanding the articulation between the tables - FrancoisBonnarel

Approved - MarcoMolinaro

Data Model Working Group ( Jesus Salgado, Omar Laurino )

The current specification is a clear simplification of the old registry interface that allows easier implementation, a clearer interface and, almost surely, better query performance. Perhaps a totally specific REST interface could have been defined, but the reuse of IVOA TAP is also useful to guarantee a fast implementation. There is no impact on DM activities, so I approve the document -- JesusSalgado - 2014-11-11

Grid & Web Services Working Group ( André Schaaff, Brian Major )

Looks good. Just a few comments about securityMethod, section 8.8: The interface Table:

  • securityMethod will need to be associated with an interface at some point. Implementations will find that they need to provide different accessURLs for securityMethods to allow differing methods of collecting credentials.
  • The VOSpace 2.1 specification has taken a stab at defining the types of authentication methods it supports using an "authType" field. Since access control runs horizontally through most VO specifications, the securityMethod and authType fields should converge as access control matures in the VO.
Approved - BrianMajor

Agreed on associating securityMethod with interface; if multiple securityMethods will be possible on a single interface, a new table will probably be necessary. But let's wait and see what these technologies will look like in the wild. Thanks for the review.

-- MarkusDemleitner - 2014-10-08

Registry Working Group ( Markus Demleitner, Pierre Le Sidaner)

Not surprisingly, we approve.

-- MarkusDemleitner - 2014-11-10

Semantics Working Group ( Norman Gray, Mireille Louys)

This is an admirably clear and well thought-through document. I particularly like the contextualising effect of Sect. 2, 'Design Considerations'.

Sect. 4: This section discusses standard prefixes for certain namespaces. Is it possible for any other namespaces to appear in RegTAP queries? My impression is that this would be unusual but not impossible. Supposing that I wanted to use a different namespace, how would I pick a prefix that I was sure would avoid colliding with a future version of this document? Might it be reasonable to require people to pick prefixes starting 'x-' in this case?

The Semantics WG Chair/Vice-Chair can see no interactions between this PR and the Semantics WG. The Semantics WG approves this document.

Author Response: No -- extra namespace prefixes are not at all uncommon. They occur every time someone prototypes a Registry extension or distributes registry records referring to custom VOResource types. Hence, you're right that the question of collisions needs consideration. The answer I'd offer is that (a) the more typical case would probably be that prototype interfaces become standard, and with the current informal process of prefix minting no updates are necessary as a standard moves to REC and (b) a certain rate of false positives is acceptable, as real-world Registry clients will encounter (fairly large numbers of) misbehaving resources in what they discover anyway. Does that address your concerns?

-- MarkusDemleitner - 2014-10-08

Data Curation & Preservation Interest Group ( Françoise Genova )

Education Interest Group ( Massimo Ramella, Sudhanshu Barway )

Knowledge Discovery in Databases Interest Group ( George Djorgovski )

Theory Interest Group ( Franck Le Petit, Rick Wagner )

The document is fine for me. No interference with Theory. Approved.

Time Domain Interest Group ( John Swinbank, Mike Fitzpatrick )

Standards and Processes Committee ( Françoise Genova )

Topic revision: r26 - 2014-11-11 - JesusSalgado
 