Historical comments on initial TAP V1 RFC period

Comments from the community

  • May I once more request that the note in 2.3.4 recommends an empty string as the system when column references are used and tells the system to insert the system actually used? Without this, it becomes virtually impossible to write spatial queries running on services using different coordinate systems (this would require an editorial change in the paragraph above to allow an empty string in addition to STC's reference frames). -- MarkusDemleitner

  • Isn't there a requirement for implementations or prototypes before a standard can go to RFC? Please can somebody post the service URLs of these, so that I can try out TAP for real? -- Roy Williams

It appears that the appearance of a formal RFC has frozen comments on TAP except for Roy's question. So let me try to restart things...

First, it seems like there was quite a bit of discussion of TAP ongoing at the time this RFC was started. I think it would have been desirable to write another draft attempting to accommodate the ongoing discussion before the RFC, but perhaps that was not possible.

Second: I've submitted two sets of detailed comments in the DAL list. I don't intend to repeat all of them here, but I would like to reiterate the overarching theme of many of both the substantive and editorial comments: the TAP standard should be disentangled from the HTTP standards it is built upon. E.g., editorially TAP should say that it uses the standard HTTP conventions for keyword names and values, but the current example practice where we use pseudo (i.e., improperly encoded) URL fragments is confusing and wrong. It's not the case that parameters are necessarily joined using &'s. Any Web page is free to use the multipart MIME encoding even if it does not upload files. This is a detail of the HTTP implementation that TAP should not expose.

The major substantive change that I (strongly) suggest -- that we not tie the idea of file uploads to a particular kind of HTTP parameter but simply establish a keyword namespace (or namespaces) for file uploads -- similarly distinguishes TAP and HTTP. TAP should use the abstraction of keyword/value pairs from the HTTP standard. If it is properly layered on top of HTTP, it does not worry about the details of how these are encoded -- that's outside of its purview. So if a keyword happens to be in the table upload namespace, then the value of that keyword is a table upload -- regardless of its encoding. In practice that will likely use an <input type=file> entry on some Web form. But there is nothing gained by our mandating that any such data is a table upload -- and very significant cost. For instance, we preclude file uploads being used compatibly for any other purpose.

I have no problem with a non-normative appendix discussing HTTP and giving examples of how TAP requests might be sent over it.
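To illustrate the layering argument above, here is a minimal sketch (not taken from the TAP draft) showing the same keyword/value pairs carried in two different HTTP encodings and recovered identically. The parameter values, the boundary string, and the helper names `encode_multipart` and `decode_multipart` are assumptions made for this example; the multipart handling is hand-rolled and deliberately simplistic, not a full parser.

```python
from urllib.parse import urlencode, parse_qsl

# Illustrative keyword/value pairs; the query text is invented for the example.
params = {
    "REQUEST": "doQuery",
    "LANG": "ADQL",
    "QUERY": "SELECT TOP 5 * FROM t WHERE name = 'a&b'",
}

# Encoding 1: application/x-www-form-urlencoded (pairs joined with '&').
urlencoded = urlencode(params)
from_url = dict(parse_qsl(urlencoded))

BOUNDARY = "tapboundary123"  # hypothetical boundary string


def encode_multipart(pairs, boundary):
    """Encode pairs as a multipart/form-data body (simplified sketch)."""
    lines = []
    for name, value in pairs.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")
        lines.append(value)
    lines.append(f"--{boundary}--")
    lines.append("")
    return "\r\n".join(lines).encode("utf-8")


def decode_multipart(body, boundary):
    """Recover the pairs from a body produced by encode_multipart."""
    pairs = {}
    for chunk in body.decode("utf-8").split(f"--{boundary}"):
        chunk = chunk.strip("\r\n")
        if not chunk or chunk == "--":
            continue  # preamble or closing delimiter
        headers, _, value = chunk.partition("\r\n\r\n")
        name = headers.split('name="', 1)[1].split('"', 1)[0]
        pairs[name] = value
    return pairs


# Encoding 2: multipart/form-data (no '&' joining at all).
multipart_body = encode_multipart(params, BOUNDARY)
from_multipart = decode_multipart(multipart_body, BOUNDARY)

# Both encodings yield the same abstract keyword/value pairs.
print(from_url == from_multipart == params)
```

A service written against the decoded dictionary never needs to know which of the two encodings the client chose, which is the separation the comment asks the standard to respect.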

  • Comments on Chapter 2, sent to dal mailing list, 12/08/09, by AlbertoMicol

Issues from the mailing list just before and during the RFC period, 2009-08-28, collated here by PatrickDowler

0. purely editorial issues not listed here

1. sec 2.3.6 arbitrarily limits the VOTable to TABLEDATA format

assessment: minor change to remove limitation

2. sec 2.2.2 says "single result" and 2.7.1 says "single table"

assessment: minor change to clarify that the query language dictates how many "tables" are produced by a single query; SQL-based languages like ADQL produce a single table

3. sec 2.3.4 specifies the ISO8601 timestamp format without the T separating the date and time parts

note: testing showed that some RDBMSs cannot parse the format with embedded T, but T is used elsewhere in IVOA

assessment: minor change to require the T; imposes possible work on implementors

4. sec 2.3.8 describes the MTIME request attribute which filters the rows returned from a query, but support is optional so this could be very dangerous

note: MTIME is intended to be used in conjunction with MAXREC to harvest content from one service to another, eg for indexing/search engine purposes, already appears in SSA. It seems poorly understood in the TAP context and can be complex underneath... also worries people when mixed with the concept that unsupported request params like MTIME can be silently ignored, leading to large (MAXREC) query results. MAXREC does protect both service and client worst case.

assessment: hard to fix (fully specify) at this time; we could specify that queries fail if MTIME is used and not supported (rather than ignored), clarify as much as possible, and promise to release a Note about implementing and using it later... or we could drop it from 1.0 and worry about it again for 1.1

5. sec 2.5 specifies use of xtype="adql:POINT" or xtype="adql:REGION" rather than using xtype="STC-S" from the VOTable 1.2 doc

note: ADQL treats point and region slightly differently in a few places and both were recognised as necessary data types; for better interoperability between services, something like xtype="stc:AstroCoords" or xtype="stc:Region" and xtype="stc:ISOTime" would be better and we would have to VOTable-specific or TAP-specific xtypes.... assuming one can extract suitable type-names from STC.

assessment: need the distinction, cannot use only VOTable values

6. the static resource structure specified by TAP is rigid

assessment: substantial change at this time since we do not have defined capabilities to describe the URLs for these endpoints (see #8)

7. inline table upload uses an element name as the intended (destination) table name and, as specified, is too tightly coupled with HTTP (also from Tom's comments above)

note: has inline table upload been implemented in any prototype? is it proven to work or proven to not work? given that uploads have to be VOTable, is this really usable as is anyway? it does avoid the need for a server (http, eg) to serve the uploaded table

assessment: substantial change to fix if the current draft is unworkable...

8. VOSI requests are done via the REQUEST parameter and the sync endpoint rather than plain resources off the base URL of the service, thus entangling the mandatory support for sync (and async) with the location of the VOSI requests... proposal was to access VOSI via resources, make both sync and async optional (must have one, of course), and describe the presence and location of sync and async endpoints via capabilities (thus also relieving part of #6 above)

note: inconsistent with DAL style as specified in SSA 1, consistent with REST and GWS style as expressed in the VOSpace 2.0 draft

assessment: modest work to change VOSI to resources from REQUEST(s), substantial change to make sync and async optional and define capabilities to describe them, but then we get the flexibility asked for in #6 for free
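Issues 3 and 4 in the list above can be sketched together. The following is a hypothetical request check, not an implementation from any TAP prototype: it requires the T separator in timestamps, converts to a space-separated literal for a database that cannot parse the T, and fails loudly when MTIME is sent to a service that does not support it instead of silently ignoring it. The names `check_request`, `parse_timestamp`, `to_db_literal`, and `UnsupportedParameter` are invented here, and the MTIME value is treated as a single timestamp purely for illustration; the draft's actual MTIME syntax is not reproduced.

```python
from datetime import datetime


class UnsupportedParameter(Exception):
    """Raised when a request uses a parameter the service cannot honour."""


def parse_timestamp(text):
    """Parse a timestamp that must use the T separator (issue 3)."""
    try:
        return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        raise ValueError(f"timestamp must look like 2009-08-28T12:00:00: {text!r}") from None


def to_db_literal(timestamp):
    """Render a timestamp without the T, for an RDBMS that cannot parse it."""
    return timestamp.strftime("%Y-%m-%d %H:%M:%S")


def check_request(params, supports_mtime):
    """Return a checked copy of params; reject MTIME if unsupported (issue 4)."""
    checked = dict(params)
    if "MTIME" in checked:
        if not supports_mtime:
            # Failing here avoids returning an unfiltered (MAXREC-sized) result.
            raise UnsupportedParameter("MTIME is not supported by this service")
        checked["MTIME"] = parse_timestamp(checked["MTIME"])
    return checked


request = {"QUERY": "SELECT * FROM t", "MTIME": "2009-08-28T12:00:00"}
checked = check_request(request, supports_mtime=True)
print(to_db_literal(checked["MTIME"]))
```

Keeping the wire format (with T) separate from the database literal confines the extra work for implementors to one conversion function.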

Par. 2.3.4 states that:

if the tables[...] contain [...] spatial coordinates 
and the services support spatial querying via the ADQL region construct,
(then) the service MUST support INTERSECTS [...], REGION, POINT, BOX [...]
That initial if means that REGION support is not mandatory. Correct? (An if...must is quite misleading). If correct, then how to specify that my service does or does not support regions? In my opinion this must be spelt out explicitly in the document.

I would find it useful if the overall philosophy on how to construct a query (first access the metadata, to gather column names, units, etc, then formulate a query) could be highlighted in the introduction.

I would find it very useful for data providers' uptake if tutorials could be provided on various aspects, for example:

  • tutorial (or just more extensive examples) on ADQL usage (including regions, utypes, etc),
  • tutorial on how to publish a TAP service in the registry (including real examples on all VOSI interfaces, etc.)

* comment on TAP_SCHEMA by PatrickDowler

The TAP_SCHEMA.columns table has a column named primary, but primary is a reserved word in (I expect) most databases since it is used to declare primary keys. ... upon reflection and discussion, the concept is useful for users and client software to help make exploratory queries that see the important columns. We just need to change the name.
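The reserved-word problem can be reproduced with a small sketch. SQLite is used here only because it ships with Python; other databases may behave differently. The table names and the replacement column name `principal` are assumptions chosen for illustration, not names taken from the comment above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def try_create(ddl):
    """Return True if the DDL statement is accepted, False on an SQL error."""
    try:
        conn.execute(ddl)
        return True
    except sqlite3.OperationalError:
        return False


# Unquoted reserved word as a column name: rejected by the SQL parser.
unquoted_ok = try_create(
    "CREATE TABLE cols_a (column_name TEXT, primary INTEGER)"
)

# Quoting works, but then every query must quote the name as well.
quoted_ok = try_create(
    'CREATE TABLE cols_b (column_name TEXT, "primary" INTEGER)'
)

# Renaming the column (hypothetical name) avoids the problem entirely.
renamed_ok = try_create(
    "CREATE TABLE cols_c (column_name TEXT, principal INTEGER)"
)

print(unquoted_ok, quoted_ok, renamed_ok)
```

Renaming keeps the concept available to clients without forcing every implementation and every user query to quote the identifier.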

* comment on the single table output by FrancoisBonnarel

I understood from the July discussion that the idea of being able to get more than one table in the output is strongly out of scope for many of the participants of this discussion, including the main designers of the TAP protocol, for the reasons they gave in the discussion (see DAL mailing list / July). But anyway, let me re-ask it with my CDS-oriented view:

If your model has a strong OO structure your dataset/catalog will be structured with several tables of very different sizes/structure.

If you want to query that in TAP, you will have either to do HUGE joins, or to make the query in several steps (one for the "main" table, one to several for additional information related to the main one by a common key). None of these solutions are totally satisfactory.

Examples are: Vizier, where many catalogues have a main table and auxiliary ones. Simbad, where each object has Data sections of very heterogeneous sizes. The future Generic Data Set where primary information would have to be linked some day to AccesData features, Full ObservationDM metadata, Type oriented DAL services, etc

Would it not be possible to define a parameter allowing a multi table output when it is set to TRUE? Standard TAP queries will use by default "FALSE", and everybody could be happy?

* Comment by TomMcGlynn [This is submitted long after the RFC, but it seemed reasonable that this should be included here so that users need not scurry to find them in the DAL mailing lists. TAM - 2009-10-09.]

  • The types of columns in the TAP schema are required to be variable length strings, even those which have boolean or integer values. The strings to be used when representing a boolean are not specified. Discussion on the DAL has suggested that the types of several columns in the TAP Schema will be changed.
  • There is no description of the capabilities record that is to be returned by a TAP service in the VOSI getCapabilities. The DAL discussion seems to indicate that defining that is a significant task but how a service is to implement this required capability is not clear.


Comments from the TCG during the normal RFC period (2009 July 10 -- 2009 August 21)

These comments relate to the following, superseded document: TAP V1.0

Applications (Tom McGlynn, Mark Taylor)

See comments above (TAM)

Data Access Layer (Keith Noddle, Jesus Salgado)

Data Model (Mireille Louys, AnitaRichards)

Grid & Web Services (Matthew Graham, Paul Harrison)

Registry (Ray Plante, Aurelien Stebe)

Semantics (Sebastien Derriere, Norman Gray)

VOEvent (Rob Seaman, Alasdair Allan)

VO Query Language (Pedro Osuna, Yuji Shirasaki)

VOTable (Francois Ochsenbein, 2009-09-25)

My concerns (http://ivoa.net/forum/dal/0907/1406.htm) are included in Pat's summary; the main points related to VOTable are:
  1. possibility of having full VOTable documents as results / for uploads (not limited to a single TABLEDATA table); items 1+2 of Pat's summary, but also the table upload (sec. 2.5). For the VOTable output, Pat's proposal (actual number of tables depends on the query language) looks fine. The table upload is not yet fully defined, clarifications are needed.
  2. the MTIME concerns (item 4 in Pat's summary): the proposal to drop this for V1.0 is quite ok for me;
  3. the xtype definition and scope (item 5 in Pat's summary): needs clarification and agreement with VOTable standard; Pat's proposal looks like a duplication of the utype which I would prefer to avoid. The role of xtype and its relation with the query language has to be clarified.

Standard and Processes (Francoise Genova)

Astro RG (Masatoshi Ohishi)

Data Curation & Preservation (Bob Hanisch)

Theory (Herve Wozniak, Claudio Gheller)

TCG (ChristopheArviset, Severin Gaudet)


-- KeithNoddle


Topic revision: r1 - 2009-11-10 - KeithNoddle
 