How to make your TAP tables discoverable

Who should read this? This is mainly for operators of TAP services publishing more than one data collection in one service; in that situation, your TAP service's metadata (like “Who's made this?” “What's the footprint of this stuff?”) is different from your data collection's metadata.

If your TAP service publishes just a single data collection, only the section on giving a tableset is relevant to you.

Why should you read this? Aladin version 10 and (most likely) future versions of TOPCAT will pull table metadata from the Registry; that's the right thing to do for many reasons. For background, see Mark Taylor's 2017 talk on the matter and Markus Demleitner's 2015 discussion of the concept. If your table metadata aren't in the registry, your services will look bad in these clients or possibly even disappear entirely.

The Problem in Short

Clients need to discover tables in TAP services: they have to know at least names and titles, ideally much more (e.g., “This table has K-band magnitudes and proper motions”). Registry records can contain such data (in VODataService tableset elements), but these elements are technically optional, so many data providers don't give them. When they are missing, several discovery cases fail entirely and others don't work properly (see, for instance, Pierre Fernique's May 2017 mail). So, even though your resource records might be technically valid, they might not work well.

Additionally, when data centers put many tables into one TAP service, even if they give tablesets, the tables are still hard to find because they sit on a completely different level than normal resources and are missing VOResource metadata. This is fixed by registering the data separately and giving it “auxiliary capabilities” (see below).

This little text tries to guide you through fixing your records. Feel free to contact Markus (msdemlei@ari.uni-heidelberg.de) at any time if you get stuck.

Examples for resource records with tabledata and auxiliary capabilities:

(1) Creating Per-Data Collection Records

The first step is to create one Registry record per data collection. This can be per table, but when several tables form one resource (guideline: common author, common title, (largely) common description, common STC coverage), it's preferable to keep them within a single resource record whose tableset consists of the tables concerned.

These resource records should have the type vs:CatalogService (do not use vs:DataCollection). Actually, you will already have those if you're publishing a particular data collection as a cone search. If you do, do not create extra records and just skip this step.

(2) Giving a Tableset

In order for the Registry to say what tables you have, you need to tell it. You do this using the VODataService tableset element (element docs). The good news is: you already know how to produce this kind of document, because it is exactly what your TAP service writes to its table metadata (VOSI tables) endpoint, modulo the root element. Just put the relevant table(s) into your Registry records and you're done (cf. the examples above for minor technicalities).
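As a rough illustration of "same document, different root element", here is a minimal Python sketch that takes a VOSI tables response and builds a plain tableset element containing only the tables belonging to one record. The sample document, the table names and the function name registry_tableset are made up for this example. The sketch also assumes that the VOSI response uses a vosi:tableset root with unqualified schema and table children.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical, abridged VOSI tables response of a TAP service.
VOSI_TABLES = """<vosi:tableset
    xmlns:vosi="http://www.ivoa.net/xml/VOSITables/v1.0">
  <schema>
    <name>mycat</name>
    <table>
      <name>mycat.main</name>
      <title>My catalogue, main table</title>
      <column><name>raj2000</name><unit>deg</unit></column>
    </table>
    <table>
      <name>mycat.aux</name>
      <column><name>id</name></column>
    </table>
  </schema>
  <schema>
    <name>other</name>
    <table><name>other.data</name></table>
  </schema>
</vosi:tableset>"""


def registry_tableset(vosi_xml, wanted_tables):
    """Return a <tableset> element with only the tables in wanted_tables."""
    source = ET.fromstring(vosi_xml)
    # The registry record wants an unqualified <tableset> root.
    result = ET.Element("tableset")
    for schema in source.findall("schema"):
        tables = [table for table in schema.findall("table")
                  if table.findtext("name") in wanted_tables]
        if not tables:
            continue  # nothing from this schema belongs to the record
        new_schema = ET.SubElement(result, "schema")
        # Schema-level metadata (name, title, ...) comes before the tables.
        for child in schema:
            if child.tag != "table":
                new_schema.append(copy.deepcopy(child))
        for table in tables:
            new_schema.append(copy.deepcopy(table))
    return result


tableset = registry_tableset(VOSI_TABLES, {"mycat.main"})
print(ET.tostring(tableset, encoding="unicode"))
```

The resulting element would then be pasted into the record of the data collection that mycat.main belongs to. Real table metadata usually also carries xsi:type attributes on dataType elements; those depend on namespace prefixes declared in the enclosing record, which this sketch does not deal with.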

(3) Adding an Auxiliary Capability

You now need to tell the clients what TAP service your table is searchable with. This happens using an auxiliary capability (in case you are interested in the details and the rationale, refer to Discovering Data Collections). This will be constant for all the table records served by a given TAP service and looks like this:

<capability standardID="ivo://ivoa.net/std/TAP#aux">
  <interface role="std" xsi:type="vs:ParamHTTP">
    <accessURL use="base">http://dc.zah.uni-heidelberg.de/tap</accessURL>
  </interface>
</capability>

Just change the access URL to match your service. This element goes next to the other capabilities; if the record has no capabilities so far, it comes right after the content element.
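The placement rule for the auxiliary capability can be sketched in Python as follows. This is an illustration, not a finished tool: the record is a made-up, heavily abridged one (it would not validate), and add_aux_capability and the example.org URLs are names invented for this sketch.

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Hypothetical, heavily abridged record (curation etc. left out).
RECORD = """<ri:Resource
    xmlns:ri="http://www.ivoa.net/xml/RegistryInterface/v1.0"
    xmlns:vs="http://www.ivoa.net/xml/VODataService/v1.1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:type="vs:CatalogService">
  <title>My catalogue</title>
  <identifier>ivo://example.org/mycat</identifier>
  <content><description>An example catalogue.</description></content>
  <tableset><schema><name>mycat</name></schema></tableset>
</ri:Resource>"""


def add_aux_capability(record, access_url):
    """Insert a TAP auxiliary capability into a parsed record."""
    cap = ET.Element(
        "capability", {"standardID": "ivo://ivoa.net/std/TAP#aux"})
    interface = ET.SubElement(
        cap, "interface", {"role": "std", XSI_TYPE: "vs:ParamHTTP"})
    ET.SubElement(interface, "accessURL", {"use": "base"}).text = access_url

    children = list(record)
    # Next to the other capabilities if there are any ...
    anchors = [i for i, c in enumerate(children) if c.tag == "capability"]
    if not anchors:
        # ... otherwise directly after the content element.
        anchors = [i for i, c in enumerate(children) if c.tag == "content"]
    if not anchors:
        raise ValueError("record has no content element")
    record.insert(anchors[-1] + 1, cap)
    return cap


record = ET.fromstring(RECORD)
add_aux_capability(record, "http://example.org/tap")
print([child.tag for child in record])
# ['title', 'identifier', 'content', 'capability', 'tableset']
```

One caveat if you adapt this: the standard library's ElementTree does not keep prefix declarations that are only used inside attribute values (such as xmlns:vs for xsi:type="vs:ParamHTTP"), so writing the modified tree back to a file needs extra care.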

(4) Declaring the Relationship

Since, in other circumstances, the access URL on these auxiliary capabilities could be different from that of the main service, you need to explicitly reference your main service. This happens in a VOResource relationship element. Again, this is constant for all table records of a given TAP service:

<relationship>
  <relationshipType>served-by</relationshipType>
  <relatedResource ivo-id="ivo://org.gavo.dc/tap">GAVO Data Center TAP service</relatedResource>
</relationship>

You will have to adapt the ivo-id attribute and the content of the relatedResource element. Relationship elements go in as the last children of the content element.
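To see whether a record has gone through all four steps, something like the following Python sketch may help. It is only a plausibility check under stated assumptions, not a validator: the sample record is made up and abridged, missing_steps is a name invented here, and the relationship type is compared against the served-by term used on this page (adapt it if your records use a different term).

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"
TAP_AUX = "ivo://ivoa.net/std/tap#aux"  # lower-cased for comparison

# Hypothetical, abridged record (curation and most metadata left out).
RECORD = """<ri:Resource
    xmlns:ri="http://www.ivoa.net/xml/RegistryInterface/v1.0"
    xmlns:vs="http://www.ivoa.net/xml/VODataService/v1.1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:type="vs:CatalogService">
  <title>My catalogue</title>
  <identifier>ivo://example.org/mycat</identifier>
  <content>
    <subject>catalogs</subject>
    <description>An example catalogue.</description>
    <referenceURL>http://example.org/mycat</referenceURL>
    <relationship>
      <relationshipType>served-by</relationshipType>
      <relatedResource ivo-id="ivo://example.org/tap">Example TAP</relatedResource>
    </relationship>
  </content>
  <capability standardID="ivo://ivoa.net/std/TAP#aux">
    <interface role="std" xsi:type="vs:ParamHTTP">
      <accessURL use="base">http://example.org/tap</accessURL>
    </interface>
  </capability>
  <tableset>
    <schema>
      <name>mycat</name>
      <table><name>mycat.main</name></table>
    </schema>
  </tableset>
</ri:Resource>"""


def missing_steps(record_xml):
    """Return a list of the steps from this page the record still lacks."""
    record = ET.fromstring(record_xml)
    problems = []

    # Step 1: only the local name is compared; a stricter check would
    # resolve the prefix to the VODataService namespace.
    if record.get(XSI_TYPE, "").split(":")[-1] != "CatalogService":
        problems.append("type is not CatalogService")

    # Step 2: at least one table in the tableset.
    if record.find("tableset/schema/table") is None:
        problems.append("no table in tableset")

    # Step 3: an auxiliary TAP capability with a non-empty access URL.
    has_aux = any(
        cap.get("standardID", "").lower() == TAP_AUX
        and (cap.findtext("interface/accessURL") or "").strip()
        for cap in record.findall("capability"))
    if not has_aux:
        problems.append("no TAP auxiliary capability")

    # Step 4: a served-by relationship pointing at some ivo-id.
    has_served_by = any(
        (rel.findtext("relationshipType") or "").strip() == "served-by"
        and rel.find("relatedResource") is not None
        and rel.find("relatedResource").get("ivo-id")
        for rel in record.findall("content/relationship"))
    if not has_served_by:
        problems.append("no served-by relationship")

    return problems


print(missing_steps(RECORD))  # []
```

Running this over your existing cone search records would give a quick to-do list per record; it says nothing about whether the record is schema-valid.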

Questions and Answers

What do I do if I'm using the web-based registration interfaces at ESAVO or STScI?

We're talking with them right now. Essentially, if you provide VOSI capability and tableMetadata endpoints, things should be fairly automatic overall. You will not, however, get around creating per-data collection records. The easiest way to do that is to have cone search services on them where appropriate. Otherwise, please contact your registrar (and perhaps the contact address above).

Why can't I use vs:DataCollection?

While it might seem as if this registry record type were tailored for this purpose, it really isn't, for the simple reason that it doesn't support capabilities. The intent when DataCollection was designed was to work with relationships only, which in modern discovery systems would become really tedious. But don't worry, there's nothing wrong with using CatalogService even if all you have is the auxiliary capability.


Topic revision: r1 - 2017-05-19 - MarkusDemleitner
 