Discussion on Registry Interfaces Version 2

RegTAP material

RegTAP is now available as Working Draft from the IVOA document repository; what's in the document repo reflects the current status as of May 2013. The source is maintained in volute, http://volute.googlecode.com/svn/trunk/projects/registry/regtap -- feel free to fix things.

There is a draft set of (non-vendor-specific) CREATE TABLE statements that define the schema: createtables.sql (RP - 2012-11-16). [Warning: this reflects the draft as it was then; the current draft has some (relatively minor) changes -- MarkusDemleitner - 2013-03-05]

A script for generating the schema for MySQL servers (requires version 5.6-4 at least, due to indexing solutions) is available here. Based upon the 2013-03-05 draft, it also includes example UDF functions in MySQL scripting language. (M.Molinaro - 2013-04-10)

All this is an attempt to fulfill RestfulRegistryInterfaceReq.

See also Notes from the RegTAP talk at the Heidelberg Interop.

Related Material

There is a draft for a replacement of the Registry Interfaces specification at http://volute.googlecode.com/svn/trunk/projects/registry/RegistryInterface; RegTAP used to be part of this, and while it's in development, work on Registry Interface itself is suspended.

There is also a rough draft on representing STC coverage information alongside the more general metadata in http://volute.googlecode.com/svn/trunk/projects/registry/RegTAP-STC. This is suspended since the sentiment is that spatial coverage should use MOCs rather than bounding boxes, but we don't quite know yet how to do that.

Requests for discussion/TODO

This is a list of questions Markus would appreciate input on. Feel free to add discussion points yourself as you see fit.

Please do comment in particular if you disagree with Markus' preference.

  • Do we need rules on lowercasing values in res_detail?
    • Pro (Markus' unloved Favourite; actually, I think this is one of the few real-life TINA situations): res_detail values include URIs that must be matched case-insensitively, as well as, e.g., test query fragments that must not be destroyed by futzing with case. Unless we want to force people to remember to use case-insensitive matching (I can't remember stuff like this), there's no other way.
    • Con: There are too many rules already. People can use ivo_nocasematch if they look for IVORNs in res_detail.
    • Alternative (Markus' Utopia): One of us enters a time machine, goes back ten years, and stops people from declaring all kinds of things as "case-insensitive".
  • Schema name: rr, ivoa, or yet something else?
    • Proposal 1: Keep the schema name as it is (Markus' Favourite)
      • Pro: No changes necessary, it's short and easy to type
      • Con: Obscore uses the ivoa schema; it seems kinda wrong to usurp further essentially random names.
    • Proposal 2: Insinuate that all ivo_%-like schema names are for IVOA-standardized schemas and grab ivo_rr for the relational registry
      • Pro: the schema name makes clear that this is IVOA standardized while still keeping related tables together. This will scale to future data models that might be more complex than obscore
      • Con: proliferation risks (plus I don't like fixed schema names, but that's my opinion -- MarcoMolinaro)
      • Con: it's a fairly intrusive change that doesn't seem to buy a lot
    • Proposal 3: Pull all the tables into the ivoa schema
      • Pro: Consistent with obscore
      • Con: The ivoa schema will become terribly messy
  • Add strict non-null conditions (rather than, or in addition to, the milder conditions currently imposed by the shoulds on primary and foreign keys)?
    • Pro: In-DB validation might lead to less junk users have to cope with; also, non-null conditions might help implementors
    • Con (Markus' Favourite): don't require anything. What's valid is defined by VOResource anyway, but if people can ingest slightly invalid data without breaking, why punish them (by declaring them in violation of the spec)?
  • Should there be a required table containing VOResource XML fragments?
    • Pro: many implementations that have an OAI-PMH endpoint will have such a table anyway, and certain classes of clients could use this kind of "all info in one" representation
    • Con (Markus' Favourite): it "muddies the water", and it's a major implementation effort if you don't do OAI-PMH anyway; clients needing VOResource can always get the records using OAI-PMH. Also, there's the question of namespace management: Require namespace declarations in all records? Allow them? Forbid them?
  • Do we want creator_seq at all?
    • Pro (Markus' Favourite): Authors and author lists are, for better or worse, magic. Displaying them, and displaying them in order, is an absolute requirement for almost any UI. Not having creator_seq would imply sequence numbers in res_role and uglified queries, where there's indeed just this obsession to fix.
    • Con: Not in VOResource, needs logic in the ingestor. Also, since you can't predict what's in the creator fields (comma-separated list, semicolon-separated list, individual authors, something stranger yet), some author lists the ingestor creates will look odd in any case.
    • Alternative (Markus' Utopia): Change VOResource to actually mandate a certain format for creator (and atomic values). I'd still keep creator_seq, but at least its appearance would be predictable. Ah well.
  • Should we have a table mapping prefixes to namespace URIs, i.e., a canonical place in which clients can figure out that vr on a particular site means http://foo.bar/v1 rather than http://bar.foo/v2, and so on for vs:, vg:, etc?
    • Pro: clients can do schema discovery this way, FWIW; such data is necessary at least during OAI processing anyway. Also, it's a nice analog to the xmlns declaration in XML documents
    • Con: These mappings are largely static and are amended fairly slowly. Also note that it is intended that there are going to be multiple URIs for a prefix (e.g., vs: already has two), so such a table is less straightforward than you may think
  • Should we highlight the impacts of the RDB solution on the VOResource data model (i.e., required/optional fields may need changes when switching from an XML to an RDB implementation of the DM)?
    • Pro: This would be particularly convenient for res_role, which, since we deviate quite a bit from VOResource there, looks fairly odd in the RegTAP spec.
    • Con (Markus' Favourite): Changing VOResource is a major undertaking (all registries would have to update their records), and it seems we can do without. Let's.
  • Should we keep talking about utypes? Here, there's no Markus' Favourite since I don't like any of the proposals. Also note that currently, utypes are also used in keys in res_detail. Where's Alexander the Great when you need him?
    • Pro: They are what the VO has advertised as "pointing into data models" for quite a while now. It's significant work to change the current text to make those utypes go away. There's the utype column in TAP_SCHEMA, and it'll look odd if it is empty for an IVOA-sanctioned data model (implementation). Also, the obvious alternative, xpaths, leads to humongously long strings.
    • Con: The utypes tiger team has come up with a (fairly sensible) document laying out what utypes should be in the future, and they're quite different from what we propose here.
    • Con: Let's use xpaths. They're easier to understand for people with less than five years of VO experience, and there are no battles over their interpretation. However, the question is: Xpaths where?
      • Proposal 1: actual xpaths into instance documents
        • Pro: almost immediately usable in a wide range of implementations
        • Con: How do we represent namespaces? As prefixes? As URIs? Also, these beasts will become terribly long either way
      • Proposal 2: xpaths into the schema documents, which in turn could be identified through their canonical prefixes.
        • Pro: would work quite analogous to utypes, and they may be shorter
        • Con: xpaths into XSD are plain horrible, and they won't work anyway as soon as there are two 1:1 relations in the same object of the same type (doesn't currently happen in VOResource, but anyway)
      • Proposal 3: Introduce some shorthand convention to have xpaths into instances but still allow a compact representation (but how?)
  • Should we have a validation suite?
    • Pro (Markus' Favourite): No question at all; a fairly small set of representative resource records and a set of test queries along with their results will help a lot in validating implementations
    • Con: It's work, and quite a bit of it. But well: that just means that help is highly appreciated.
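
To make the first question in the list above (lowercasing values in res_detail) a bit more concrete, here is a minimal sketch, not taken from the draft. It uses Python's sqlite3 module as a stand-in for a real RegTAP service. The column name detail_key, the keys data_model_id and test_query, and the sample rows are all hypothetical, and LOWER() only stands in for what ivo_nocasematch would do on a real service. The "rule" is that the ingestor lowercases values only for keys known to hold URIs, so test query fragments keep their case.

```python
import sqlite3

# Hypothetical key names; the real res_detail keys are defined by the RegTAP draft.
URI_VALUED_KEYS = {"data_model_id"}

# Made-up sample rows: one URI-valued detail, one test query fragment.
ROWS = [
    ("ivo://example/svc", "data_model_id", "ivo://IVOA.net/std/ObsCore"),
    ("ivo://example/svc", "test_query", "SELECT TOP 1 * FROM Foo.Bar"),
]


def load(rows, lowercase_rule):
    """Fill an in-memory res_detail stand-in, optionally applying the rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE res_detail (ivoid TEXT, detail_key TEXT, detail_value TEXT)"
    )
    for ivoid, key, value in rows:
        # The rule: lowercase only values whose key is known to hold a URI.
        if lowercase_rule and key in URI_VALUED_KEYS:
            value = value.lower()
        conn.execute("INSERT INTO res_detail VALUES (?, ?, ?)", (ivoid, key, value))
    return conn


def find_exact(conn, value):
    """Plain equality match, as a user who forgot about case would write it."""
    cur = conn.execute(
        "SELECT ivoid FROM res_detail WHERE detail_value = ?", (value,)
    )
    return [row[0] for row in cur]


def find_nocase(conn, value):
    """Case-insensitive match; LOWER() stands in for ivo_nocasematch."""
    cur = conn.execute(
        "SELECT ivoid FROM res_detail WHERE LOWER(detail_value) = LOWER(?)",
        (value,),
    )
    return [row[0] for row in cur]


def stored_value(conn, key):
    """Return the stored value for a key, to check what ingestion did to it."""
    cur = conn.execute(
        "SELECT detail_value FROM res_detail WHERE detail_key = ?", (key,)
    )
    return cur.fetchone()[0]


if __name__ == "__main__":
    with_rule = load(ROWS, lowercase_rule=True)
    without_rule = load(ROWS, lowercase_rule=False)
    needle = "ivo://ivoa.net/std/obscore"
    print("exact match, with rule:   ", find_exact(with_rule, needle))
    print("exact match, without rule:", find_exact(without_rule, needle))
    print("nocase match, without rule:", find_nocase(without_rule, needle))
    print("test query as stored:", stored_value(with_rule, "test_query"))
```

In this sketch the Pro position corresponds to load(..., lowercase_rule=True) plus find_exact, and the Con position to load(..., lowercase_rule=False) plus find_nocase. Either way somebody has to remember something: the ingestor the list of URI-valued keys, or the user the case-insensitive match.
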
Topic attachments
| Attachment | Size | Date | Who | Comment |
| RegTAP_MySQL_schema.sql | 25.1 K | 2013-04-10 - 09:43 | MarcoMolinaro | MySQL (5.6-4+) flavoured schema creation, based on 2013-03-05 draft |
| createtables.sql | 5.1 K | 2012-11-16 - 20:22 | RayPlante | non-vendor-specific CREATE TABLE statements that implement DM's proposed model |
Topic revision: r17 - 2013-05-31 - MenelaosPerdikeas
 