
Revision 15, 2024-04-09 - MarkusDemleitner

 

RegTAP 1.2 Proposed Recommendation: Request for Comments

RegTAP, formally known as the IVOA Registry Relational Schema, is the standard way for clients to query the VO Registry. Version 1.2 adds tables to give the coverage in space, time, and spectrum and a tap_table view intended to replace GloTS. To make use of these features, we require a few optional ADQL features and the extra UDF ivo_interval_overlaps.
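As a rough illustration of what the ivo_interval_overlaps UDF is for, the following Python sketch emulates an interval-overlap test of the kind used to match a resource's declared temporal or spectral coverage against a query interval. The function name, the choice to treat touching endpoints as overlapping, and the sample rows are assumptions made for this sketch; the normative definition is the one in the RegTAP 1.2 text.

```python
def interval_overlaps(l1, h1, l2, h2):
    """Return 1 if the closed intervals [l1, h1] and [l2, h2] overlap, else 0.

    Assumption for this sketch: intervals that merely touch
    (h1 == l2 or h2 == l1) count as overlapping.
    """
    return 1 if (l1 <= h2 and l2 <= h1) else 0


# Hypothetical coverage rows: (resource name, time_start, time_end) in MJD.
coverage = [
    ("survey_a", 50000.0, 52000.0),
    ("survey_b", 56000.0, 57000.0),
]

# Select the resources whose coverage overlaps the query interval.
query_start, query_end = 51500.0, 51600.0
matches = [name for name, start, end in coverage
           if interval_overlaps(start, end, query_start, query_end) == 1]
print(matches)
```

The integer return value mirrors the usual ADQL idiom of comparing a UDF result against 1 in a WHERE clause.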

Latest version of RegTAP can be found at:

RegTAP is a fairly long standard. For review, it is probably advisable to only skim the unchanged text. For the purposes of this RFC, it is probably sufficient to closely inspect sects. 4.5, 7, 8.15 through 8.18, 9, and 10 (in particular 10.13). See also appendix E.

Reference Interoperable Implementations

Server-side implementations exist in the TAP services at http://reg.g-vo.org/tap and https://registry.euro-vo.org/regtap/tap.

Client-side code exploiting the new spatial tables is present in pyVO; WIRR also has constraints on coverage in space, time, and spectrum.

Support for tap_tables is less common; however, once the data providers fix their metadata records, the content of tap_tables essentially is equivalent to what we already have in GloTS, and hence the editor would argue that all clients using GloTS (e.g., TOPCAT) count as reference implementations of tap_tables.

A prerelease version of TOPCAT available at https://www.star.bristol.ac.uk/mbt/releases/topcat/pre/topcat-full_regtap12.jar makes use of rr.tap_tables for service discovery: in the TAP window the TAP|Service Discovery submenu has an option "RegTAP 1.2". If this is selected (instead of the default option "GloTS") then searching for services in the "By Table Properties" tab uses the rr.tap_tables table from GAVO DC rather than GloTS. -- MarkTaylor - 2024-02-28

Implementations Validators

A validator is part of the source distribution at https://github.com/ivoa-std/RegTAP. See the validator subdirectory. Note that it requires a registry seeded with the data provided, and hence the suite can only be used by the service operators.




Comments from the IVOA Community during RFC/TCG review period: 2024-02-01 - 2024-03-20

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document.

Comments by MarkTaylor

This revised standard looks in quite good shape. I have a few minor comments.

  • Section 1: "The simplification yields a schema with 14 tables." That's 18 tables now.
  • Section 3: "In the tables of columns given below, the X_index columns have '(key)' given for type." In fact they don't (and did not at RegTAP 1.1), the tables in Section 8 list them all with type "integer". That should be changed.
    • Oops. That was an implementation detail of the concrete service these descriptions are generated from. This is now fixed (if a bit hacky) in columnstotex.xsl, so we hopefully won't regress on that (but we are in trouble if we have a non-key X_index column. Let's not). -- MarkusDemleitner - 2024-03-18
  • Section 4.3: "RegTAP 1.0 required that most columns containing values not usually intended for display to be converted to lower case on ingestion." Remove "that" or "to".
  • Section 4.3: "When matching against these, queries should use case-insensitive matching, for which this specification offers the ivo_nocasematch user defined function. ADQL 2.1 has an ILIKE operator, which may be used instead." Since RegTAP 1.2 requires ILIKE, this could probably be tidied up a bit.
  • Section 4.4: "Note that with VOTable 1.3, non-ASCII in char-typed fields, while supported by most clients in TABLEDATA serialization, is technically illegal..." The reference to VOTable 1.3 should probably say something like "at least up to VOTable 1.5". You may wish to reconsider the optimistic tone of the comment about future VOTable versions later in that paragraph.
  • Section 9: "significanly" -> "significantly"
  • Section 9.1: The language feature type for COALESCE is ivo://ivoa.net/std/tapregext#features-adql-conditional not ivo://ivoa.net/std/TAPRegExt#features-adql-type
  • Section 10.1: "the new authenticated_only column" - "new" should maybe be replaced with "new at RegTAP 1.1" or similar.
  • Sections 8.16, 8.17: I tend to agree with Anne that the units for the temporal and spectral values could use some minimal explanation, given that this document may be used for reference by people trying to make queries against these tables. I don't think it would hurt to give a non-normative clue somewhere about what these values are. Perhaps tabulated column descriptions could say "Lower/Upper limit in MJD of a time interval..." and "Lower/Upper limit in Joules of a messenger energy...".
  • Various features that are standardised in ADQL 2.1 are mentioned here. Would it make sense to require that RegTAP services are ADQL 2.1 compliant?
I have also added an entry to the Implementations section above.

-- MarkTaylor - 2024-02-28

Your points should be addressed in RegTAP PR #18, https://github.com/ivoa-std/RegTAP/pull/18. Would you have a look? Otherwise, thanks for the helpful comments! -- MarkusDemleitner - 2024-03-18

  • Thanks Markus, all points addressed. -- MarkTaylor - 2024-03-19

Comments by HenrikNorman and SaraNieto

1.2 VOResource, v1.1 - Should the reference to RegTAP 1.1 be 1.2?
1.2 Other Registry Extensions - Is mentioning SimpleDALRegExt 1.0 and 1.1, but not 1.2
1.2 - Reference to TAP1.0 instead of 1.1
8.18 - Double period on end of “with references to both a full metadata record and the record of the TAP service publishing the resource..”
10 - “As discussed there” - Unclear what ‘there’ refers to. The RegTAP 1.0 standard?
10.4 - Missing period after “with references to both a full metadata record and the record of the TAP service publishing the resource”

Section 6. Xpaths (Table 1)

Add the following paths to the table:
https://www.ivoa.net/xml/SIA/SIA-v1.2.xsd
http://www.ivoa.net/xml/VODataService/v1.2

References to ADQL in sections 1.2 and 4.3, should they reference to ADQL v2.1?

Section 4.3 It states, "ADQL 2.0 has no operators for case-insensitive matching of strings. Mainly for this reason, RegTAP 1.0 required that most columns containing values not usually intended for display to be converted to lower case on ingestion."

ADQL 2.1 introduces the ILIKE operator as a means for case-insensitive matching; it can be used as an alternative to custom functions like ivo_nocasematch for case-insensitive comparisons. Thus, ADQL 2.1 provides better support for case-insensitive queries than its predecessors, and the ILIKE operator offers a form of case normalisation for string comparisons.
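To make the difference between the two approaches concrete, here is a minimal Python sketch that emulates a case-insensitive LIKE by translating the pattern into a regular expression. The helper name ilike and the sample identifier are hypothetical, and the sketch assumes the usual SQL wildcards (% for any run of characters, _ for exactly one) without escape handling; it is not taken from any RegTAP implementation.

```python
import re


def ilike(value, pattern):
    """Emulate a case-insensitive LIKE: '%' matches any run, '_' one character."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            # Everything else is matched literally.
            parts.append(re.escape(ch))
    regex = "".join(parts)
    return re.fullmatch(regex, value, flags=re.IGNORECASE | re.DOTALL) is not None


# A mixed-case identifier still matches a lower-case pattern.
print(ilike("ivo://CDS.VizieR/J/A+A/1", "ivo://cds.vizier/%"))
```

With matching of this kind available in the query language, values no longer need to be lower-cased on ingestion just to make comparisons reliable, which is the motivation the quoted passage gives for the RegTAP 1.0 rule.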

  • Thanks for your review. Most of your points, where not found by other reviewers, should be addressed by commit 705b016 in https://github.com/ivoa-std/RegTAP/pull/20. However, the v1.2 of SIA and VODataService are not namespace URIs (which is what's in this table). In both cases, their namespace URIs end in v1.1. For the background of this wretched state of affairs, see http://ivoa.net/documents/Notes/XMLVers/20180529/index.html. And I am not sure what to do with your remark on sect. 4.3: Is this a proposal for an improved wording? -- MarkusDemleitner - 2024-04-03
  • Thanks, my points have been sufficiently addressed. Extra credits for finding the issue in 10.4, even though I seem to have made a very misleading comment there. Regarding your comments on the namespace URIs and section 4.3, I refer to Sara. HenrikNorman - 2024-04-04

Comments from TCG members during the RFC/TCG Review Period: 2024-02-01 - 2024-03-20

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.

TCG Chair & Vice Chair

Applications Working Group

The document is clear and describes the background, the issues and the use.
Two comments while reading:

- We'd like the vodataservice xml schema access url to be in 1.2 for the xml scheme.
http://www.ivoa.net/xml/VODataService/v1.2
and not
http://www.ivoa.net/xml/VODataService/v1.1

- the ivo_hashlist_has function should be an integral part of the new version of ADQL, at least as a should, as is the case for ilike.

  • Thanks for the review! As to the schema access URL: No, the one with v1.1 at the end is the right one. In the VO, we want our namespace URIs to point to the current schema, and the namespace URI for all version 1 VODataService schemas (regrettably) has the v1.1. See https://ivoa.net/documents/Notes/XMLVers/20180529/ for how we try to patch the misdecision to have minor versions in the original namespace URIs. Since it keeps confusing everyone, perhaps we should think again whether there is a way to fix the namespace URIs without blowing everything up – but I, frankly, am at a loss. As to including ivo_hashlist_has as a mandatory "UDF" into ADQL 2.2: I don't think it should be a UDF then, but some in-language support for "hashlists" might be a good idea (although CAOM, I think, has similar fields using a different separator -- hm). This would be material for ADQL-2_1-Next, though -- MarkusDemleitner - 2024-04-09
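For readers unfamiliar with "hashlists", the following Python sketch shows the kind of test ivo_hashlist_has performs: a case-insensitive membership check against words joined by hash signs. The function name, the sep parameter, and the sample values are assumptions made for illustration; the sep parameter is only there to reflect the remark above that other data models may use a different separator.

```python
def hashlist_has(hashlist, item, sep="#"):
    """Return 1 if item occurs, compared case-insensitively, among the
    sep-separated words of hashlist; return 0 otherwise."""
    wanted = item.lower()
    words = (word.lower() for word in hashlist.split(sep))
    return 1 if wanted in words else 0


# Whole words match regardless of case; substrings do not.
print(hashlist_has("image#spectrum#catalog", "Spectrum"))
print(hashlist_has("image#spectrum#catalog", "spec"))
```

Splitting before comparing is what distinguishes this from a plain substring search, which would wrongly report "spec" as present.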
 

Data Access Layer Working Group

Data Model Working Group

Grid & Web Services Working Group

Registry Working Group

While reviewing this well redacted document I came across some understanding issues probably due to my own interpretation of the context; nevertheless, I make here a few remarks, mostly for clarity purposes, thinking of future implementers. Some comments are related to the quoting of old versions of related specifications, which may have been done on purpose, so I apologize if the suggestion is incorrect. This review applies to Revision c03f460-dirty, 2024-03-18 17:41:26 +0100.

In the following list S means Suggestion and C means Correction.

  • p4 “At the time of writing, there are roughly 20000 such resource records active within the VO, originating from about 40 publishing registries. “
    • S specify time of writing (“Time of writing version 1.2 of this specification”)
    • C as of 2024-04-02 there are 28733 active resources coming from about 50 publishing registries
  • p6 “RegTAP 1.1 incorporates the concepts from VOResource 1.1 but can represent VOResource 1.0 instances (within the limits laid out below) as well.“
    • S change to “RegTAP 1.2” (current version of this spec) ?
  • p7 “TAP, v1.0 (Dowler and Rixon et al., 2010)”
    • S change to “TAP, v1.1” and update citation (2019) ?
  • p7 “ADQL (Mantelet and Morris et al., 2023)”
    • C missing version number: change to ADQL-2.1
  • p9 “whitespace in them is to be normalized in the sense of XML schema”
    • S specify more accurately in parenthesis what is the sense of the XML schema ?
  • p11 “The relevant vocabulary URIs are given in the VOResource specification and its schema file.”
    • S add “in <xs:documentation> elements” at the end of this sentence
  • p12 “For the representation of QNames within the database, these recommended prefixes are now mandatory.”
    • S specify “now” (in which version of the specification: 1.2?)
  • p13 paragraph “Within the Virtual Observatory, … to denote the xpaths.”
    • S rewrite this paragraph for clarity ?
      • Utypes are never a pleasant topic, I'm afraid. If you have hints on what is particularly confusing, I can try my best to improve the presentation, but lacking that I'm afraid I don't know what to do. -- MarkusDemleitner - 2024-04-03
  • p16 “operators should ensure decent performance for queries assuming the presence of the given indexes and relationships.”
    • S define “decent” quantitatively
      • You mean, as in "Response time below X milliseconds"? I don't think that's enforceable, not the least because there may be concurrent queries but also because in complex queries, it is not easy to say which part exactly is responsible for excessive runtimes -- there is a planner in the equation after all. So, I'd plead for "not feasible" on this point. -- MarkusDemleitner - 2024-04-03
  • p17 “Table 2: The tables making up the TAP data model Registry 1.1”
    • C change to “Registry 1.2” (current specification)
  • p18 “This convention saves on tables while not complicating common queries significantly.”
    • S specify what this convention saves about tables (table sizes? table numbers?)
  • p19 “vs:datacollection A resource type intended by VODataService version 1.1 to be used for data-only resources. Data providers should use vs:CatalogResource or vs:DataResource instead”
    • Indicate that this resource type is “deprecated”, if this is the case ?
  • p25 description of column rr.res_table.utype “An identifier for a concept in a data model that the data in this table as a whole represent.”
    • S discuss what happens when several utypes are declared for one table by a harvested publishing registry (VODataService -1.3 does not forbid several utypes for a table <xs:element name="utype" type="xs:token" minOccurs="0">) (Is only the first one used? Is a list made for the value of the column?)
      • maxOccurs defaults to 1, so utype is only present 0 or 1 times. Phewy! -- MarkusDemleitner - 2024-04-03
    • Same remark for other columns named utype in this spec
  • p37 “ 3. exactly once for each actual table (i.e., there cannot be two rows in the view having the same (svcid, table_name))”
    • S start with a verb or preposition for clarity and to conform to the way points 1,2,4 of the same paragraph are redacted ?
  • p52 “Since these pieces of metadata do not seem relevant to resource discovery and are geared towards other uses of the respective VOResource extensions, a more complex model does not seem warranted just so they can be exposed.”
    • S specify what “warranted” means (a future version of the schema of this specification ?)

Thanks for your review. Everything that is not commented above should be addressed in https://github.com/ivoa-std/RegTAP/pull/19 -- MarkusDemleitner - 2024-04-03

-- RenaudSavalle - 2024-04-02

Semantics Working Group

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

Radio Astronomy Interest Group

Solar System Interest Group

I focused on the changes documented from the last version. Please note that I have yet to memorize the contents of the entire IVOA document set. Mea culpa.

Typo: Heading 9.1: "requried" should be "required" (this also appears in the Table of Contents, of course)

Chapter 4, headings (all): In every other section of the document, section headings are in title case; in this chapter, section headings have only the first word capitalized. This should be made consistent throughout the document.

Page 11, Section 4.5 (Vocabulary Considerations), paragraph 1: The last sentence (At the time of this writing..) states that some document containing needed definitions is still in "Working Draft" state. The only reference corresponding to the citation "Demleitner (2019)" in the reference list is to version 2.0 of the "Vocabularies in the VO" recommendation. Version 2.0 was accepted in May 2021, and then superseded by version 2.1 in February 2023. So the rationale in this paragraph is no longer valid. How does this affect the requirements of this recommendation?

Page 13, Table 1: The table lists two versions of the VODataService recommendation explicitly (versions 1.0 and 1.1) but not the current version (1.2). Why not? What is the point of including minor versions in this list? I would expect that the "v1.x" notation would be widely understood. I did not check the version list for the other standards mentioned that have since been updated, but someone should.

Page 19, "vs:datacollection": A reference is made to this resource type being included in VODataService v1.1 for a specific purpose, but the text recommends that other resources be used. It would be appropriate, and perhaps compelling, to note that this resource type is, in fact, deprecated in VODataService v1.2 (or earlier?) and should not be used.

Page 24, Section 8.6 (The res_table Table), paragraph 2: This paragraph makes repeated reference to "VODataService v1.0". As far as I can tell, this document is no longer accessible. It should be, for provenance at the least. If this document is lost or otherwise unavailable, the information that an implementor would need in order to follow the instructions in this paragraph needs to be supplied in some other form. Otherwise the requirement is unenforceable and somewhat ridiculous.

  • Interesting... you are right, there is no v1.0 specification on the document repository (but the schema is still there: https://ivoa.net/xml/index.html). Digging out the 1.0 internal working draft might be a service to the community at large, but I frankly cannot see myself even trying – I've not been part of the VO back then. Would it help you if I linked to the v1.0 schema? -- MarkusDemleitner - 2024-02-27
  • Well... I've made a quick census and I've convinced myself we ought to finally update or remove the few records that still use VODataService 1.0 (the schema). Hence, there's now https://github.com/ivoa-std/RegTAP/pull/16 -- MarkusDemleitner - 2024-03-05

Page 25, paragraph 1 (The table should have...): Suggest adding "also" to the final sentence ("Since table_utype is used in data discovery, it should also be indexed."). This is primarily to catch the attention of people skimming the document for workload estimation.

Page 26, paragraph 1 (The table_column table...): Should the mention of VODataService 1.1 be replaced with the current version (1.2)?

Page 35, paragraph 2 (The details of how the MOC-valued...): First, DALI 1.2 was accepted in July 2023, so the future tense is no longer appropriate. Second, I do not see that that document does provide the information indicated (how MOC coverage will be entered, specifically). This may be my misunderstanding, but I see no indication other than "It's a string value" in DALI 1.2, which falls far short of what I would need to know as an implementor. Should this instead reference some specific section of the MOC standard? If not, then some additional explanation of what is expected in this string field is in order.

  • Hu... No, for all I know DALI 1.2 is still under review – we don't even have a PR, have we? Anyway, I have reformatted that part a bit. -- MarkusDemleitner - 2024-02-27

Page 35, last sentence (The rows for time_start and...): The statement that these columns "MUST have 'd' in their unit column" is completely mysterious to me. Why? What unit is "d"? What document would I have to read to translate 'd' into something meaningful? And why on earth would you require a potential implementor to chase down some other document to find the word that would have made this sentence sensible?

Page 36, Section 8.17, last sentence (The rows for spectral_start and...): Previous rant | sed 's/d/J/g'.

  • That's basic VOUnits and TAP. This may look a bit odd in this context, but unless you insist I'd rather not quote these two standards in this particular place; I'd then logically have to explain quite a bit more in many other places, too -- MarkusDemleitner - 2024-02-27.
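For readers who want the non-normative clue asked for above, here is a small Python sketch of how values in days and Joules might be computed, under the reading suggested in the comments that temporal limits are MJD (unit d) and spectral limits are messenger energies (unit J). The function names are hypothetical, the time conversion ignores time-scale subtleties such as leap seconds, and the energy conversion assumes photons and vacuum wavelengths.

```python
from datetime import datetime, timezone

MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)  # MJD 0.0
PLANCK_J_S = 6.62607015e-34      # Planck constant in J s (exact in the SI)
LIGHT_SPEED_M_S = 299792458.0    # speed of light in m/s (exact in the SI)


def to_mjd(moment):
    """Convert a timezone-aware datetime to a Modified Julian Date in days."""
    return (moment - MJD_EPOCH).total_seconds() / 86400.0


def wavelength_to_joule(wavelength_m):
    """Convert a vacuum wavelength in metres to a photon energy in Joules."""
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m


# Noon on 2000-01-01 expressed as MJD.
print(to_mjd(datetime(2000, 1, 1, 12, tzinfo=timezone.utc)))
# Photon energy for a wavelength of 500 nm.
print(wavelength_to_joule(500e-9))
```

Because energy decreases as wavelength grows, the lower spectral limit of an interval corresponds to its longest wavelength, which is easy to get backwards when converting by hand.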

Typo: Page 36, Section 8.18, paragraph 1: The second m-dash should not be followed by a comma.

Page 36, Section 8.18, most of it: I'm having difficulty understanding what the bottom line is for the description in this section. Some reasons:

  • There is a reference to "version 1.2", but I'm not clear on which of the various recommendations in play here that version attaches to.
  • In the numbered list, items 1, 2, and 4 are criteria that seem to depend on intrinsic properties of the table - things I would expect to find within a single row, although that is NOT how they are presented; while item 3 is a criterion based on a relationship between rows.
  • Consequently, and for other more grammatical reasons, I cannot confidently interpret the sentence that begins with "Hence" and ends with the double full stop at the end of item 4.
I was going to offer a suggestion for rewriting, but I couldn't untangle it enough to make a decent effort. If you can explain it to me, I'd be happy to help wordsmith.
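One way to picture the difference between the per-row criteria and the between-rows criterion is that the latter (no two rows sharing the same (svcid, table_name), as quoted in the Registry WG comments above) can only be checked by looking at the whole result set. The Python sketch below does that for a handful of made-up rows; the helper name and the sample identifiers are hypothetical, and this is not the validator shipped with the standard.

```python
from collections import Counter


def duplicate_table_keys(rows):
    """Return the (svcid, table_name) pairs that occur more than once."""
    counts = Counter((row["svcid"], row["table_name"]) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical rows standing in for the view described in section 8.18.
rows = [
    {"svcid": "ivo://example/tap", "table_name": "cat.main"},
    {"svcid": "ivo://example/tap", "table_name": "cat.aux"},
    {"svcid": "ivo://other/tap", "table_name": "cat.main"},
    {"svcid": "ivo://example/tap", "table_name": "cat.main"},
]
print(duplicate_table_keys(rows))
```

An empty result means the uniqueness criterion holds; the same table name under two different services is not a violation.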

Page 39, Chapter 10, paragraph 3: There is a reference to "today's (2019) registry". When I checked my calendar, "today" was in 2024. If the intention is to reference a specific version of a recommendation, then please do so formally. Relative dates like "today" are problematic "tomorrow".

Page 46, "accessURL (!)": The word "legacy" is not sufficiently specific. Ideally, this should indicate a specific recommendation and version number, and also whether that is the point where some significant thing was deprecated, superseded, or no longer supported. Since this is listed as a required table entry, this needs to be clarified. Also, it seems odd that a section called "XPaths for res_detail" indicates that this "accessURL" is required, but then immediately states that it must NOT be in res_detail. So why is it listed as required for res_detail?

Page 52, Appendix C, last line of code: There appears to be mangled text on the last line, which reads, "SELECT * FROM fromes) q)". At the very least, the parentheses in the entire code block are unbalanced.

-- Anne Raugh, 2024-02-26

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : Bambi eyes - more Bambi eyes

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
DAL        
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops        
Radio        
SSIG        
Theory        
TD        
StdProc        
 