Difference: TAPImplementationNotes (18 vs. 19)

Revision 19 (2013-10-16) - MarkusDemleitner

 
META TOPICPARENT name="IvoaDAL"
Changed:
<
<
This page is intended to collect points that should be clarified/fixed in future versions of the UWS/TAP/ADQL combo of standards. MarkusDemleitner suggests we should edit much of this into an IVOA Note ("Implementation notes for a service implementing UWS, TAP, and ADQL") and fix standards documents after that as necessary.
>
>
This page was intended to collect points that should be clarified/fixed in future versions of the UWS/TAP/ADQL combo of standards. By now, this material has been moved into a draft note on Volute; see http://volute.googlecode.com/svn/trunk/projects/dal/TAPNotes/TAPNotes-fmt.html and the containing repository.
 
Changed:
<
<
Some points (which should eventually all be reflected here) are raised in
>
>
Additional material on TAP reform is found at:
 
Deleted:
<
<
Please inline comments to existing points as one-level-deeper enumerations (I guess...)
 

UWS

(see also UWSEnhancement -- can we merge the material?)

From Paul's mail

The following points are discussed in the Mail B36FEF85-E316-436E-AC69-2F92D0E0FC5C@manchester.ac.uk dated 2011-05-23 to the GWS list by PaulHarrison

Changed:
<
<
Paul promised to update the UWS in volute to reflect much of this, but it
>
>
Paul promised to update the UWS in volute to reflect much of this, but it certainly wouldn't hurt to have some of it in an "implementation note"-type document.
Deleted:
<
<
certainly wouldn't hurt to have some of it in an "implementation note"-type document.
 
Changed:
<
<
  • Section 2.1.3 HELD status - whilst this might appear to have little utility in current implementations, in future versions where there might be quotas or priorities in the UWS then HELD is a way of expressing within the UWS that the job is accepted in principle, but will not be run until some action (like freeing up some of the quota) is taken.
  • It is probably not made clear enough that the initial values of the parameters (and certainly the possible parameter names) are all established during the initial POST that creates the job and in most cases this is how the job should be driven - The ability to set an individual parameter after job creation is an additional capability that the UWS may offer - it should not offer the ability to create new parameters nor delete existing parameters - in this way a client that just creates the job with the initial POST does not "miss" out on setting a crucial parameter. We could make this clearer by removing the ability to set the individual parameter, as I believe that it was added as a "would be nice" feature without a strong use case. There is only one guaranteed way to set a parameter that all UWS services must implement - in the initial POST that creates the job.
  • Section 2.2.3.2 & 2.2.3.3, Changing execution duration & destruction time - if a service chooses not to implement these features, then the standard is clear that a value of 0 should be returned for the execution duration, but I agree it is not clear what should be returned for the destruction time - in the job schema the DestructionTime element is nillable, so that would be an appropriate representation in the job XML - however, for the value returned at the resource URL, I agree that there is no description of what should be returned in the case where the UWS never deletes a job - you could return a value far in the future.
  • A job can be deleted at any time - it is up to the UWS server side to clean up appropriately.
  • Although the current wording of the document does not make this clear enough in every case, the intention is that changing the PHASE of the job is a request by the client to the server, and the client sees whether it has been successful by examining the XML returned by the redirect to the URI /{jobs}/(job-id)/. The allowable transitions are shown by the state diagram within the document. TODO: Decide if invalid transitions should be an error
  • Attempt to update a parameter on a job that's not PENDING: a 403 [Forbidden] status should be returned
  • The text needs updating to say that creating a parameter at any stage other than the initial job creation POST is not allowed.
>
>
  • Section 2.1.3 HELD status - whilst this might appear to have little utility in current implementations, in future versions where there might be quotas or priorities in the UWS then HELD is a way of expressing within the UWS that the job is accepted in principle, but will not be run until some action (like freeing up some of the quota) is taken.
  • It is probably not made clear enough that the initial values of the parameters (and certainly the possible parameter names) are all established during the initial POST that creates the job and in most cases this is how the job should be driven - The ability to set an individual parameter after job creation is an additional capability that the UWS may offer - it should not offer the ability to create new parameters nor delete existing parameters - in this way a client that just creates the job with the initial POST does not "miss" out on setting a crucial parameter. We could make this clearer by removing the ability to set the individual parameter, as I believe that it was added as a "would be nice" feature without a strong use case. There is only one guaranteed way to set a parameter that all UWS services must implement - in the initial POST that creates the job.
  • Section 2.2.3.2 & 2.2.3.3, Changing execution duration & destruction time - if a service chooses not to implement these features, then the standard is clear that a value of 0 should be returned for the execution duration, but I agree it is not clear what should be returned for the destruction time - in the job schema the DestructionTime element is nillable, so that would be an appropriate representation in the job XML - however, for the value returned at the resource URL, I agree that there is no description of what should be returned in the case where the UWS never deletes a job - you could return a value far in the future.
  • A job can be deleted at any time - it is up to the UWS server side to clean up appropriately.
  • Although the current wording of the document does not make this clear enough in every case, the intention is that changing the PHASE of the job is a request by the client to the server, and the client sees whether it has been successful by examining the XML returned by the redirect to the URI /{jobs}/(job-id)/. The allowable transitions are shown by the state diagram within the document. TODO: Decide if invalid transitions should be an error
  • Attempt to update a parameter on a job that's not PENDING: a 403 [Forbidden] status should be returned
  • The text needs updating to say that creating a parameter at any stage other than the initial job creation POST is not allowed.
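
The parameter rules in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: `Job` and `update_parameter` are hypothetical names, not part of any UWS library, and the 403 for an attempt to create a new parameter and the 303 for a successful update are assumptions, since the points above only fix the 403 for jobs that are not PENDING.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical in-memory job record (not taken from any UWS library)."""
    parameters: dict          # names are fixed by the initial job-creating POST
    phase: str = "PENDING"


def update_parameter(job, name, value):
    """Return the HTTP status for a parameter update after job creation."""
    if job.phase != "PENDING":
        # From the list above: updates on a non-PENDING job are Forbidden.
        return 403
    if name not in job.parameters:
        # Assumption: creating a new parameter is refused with 403 as well;
        # the list only says that it is not allowed.
        return 403
    job.parameters[name] = value
    # Assumption: success redirects back to the job resource.
    return 303
```

With this model, a client that sets everything in the initial POST never depends on the optional update capability, which is the behaviour the points above argue for.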
 

Other

Changed:
<
<
  • Section 2.2.3.1 defines the HTTP response code for an accepted job as 303, but does not say what should happen for a rejected job. It should do (200 plus error document??) -- MarkTaylor - 08 Jun 2011
  • The content of the /quote resource is an integer number of seconds (sec 2.1.1), but the content of the uws:quote element is xs:dateTime (schema); this mismatch seems unnecessarily confusing unless there's some rationale I'm missing. -- MarkTaylor - 29 Jun 2011
>
>
  • Section 2.2.3.1 defines the HTTP response code for an accepted job as 303, but does not say what should happen for a rejected job. It should do (200 plus error document??) -- MarkTaylor - 08 Jun 2011
  • The content of the /quote resource is an integer number of seconds (sec 2.1.1), but the content of the uws:quote element is xs:dateTime (schema); this mismatch seems unnecessarily confusing unless there's some rationale I'm missing. -- MarkTaylor - 29 Jun 2011
 

TAP

Changed:
<
<
  • Can we come up with a lightweight way of allowing some sort of (insecure) authentication ("don't publish my queries") while keeping TAP results available for uploads to other servers? --MD
  • UPLOAD parameter spec needs some clarifications. --MD
    • Are quoted identifiers allowed as table names? (in DaCHS, they are not)
    • What should happen if a URL or table name contains a comma or semicolon? (in DaCHS, they are effectively forbidden in both table names and in URLs, since there is no way to escape them)
    • When people re-post an UPLOAD parameter, should uploads be added or replaced? (in DaCHS, they are added)
  • xtype=adql:REGION on upload: such columns will usually result in polygons, at least when implementing against pgsphere --MD
  • Require a filename header on inline uploads? (this would make it easy to tell them from "regular" parameters without having to parse all UPLOAD parameters first) --MD
  • One of the columns in the TAP_SCHEMA.columns table is named "size". This is an ADQL reserved word, which is unfortunate. Can be got round by quoting the column name in ADQL, but it's a gotcha which might be worth mentioning. -- MarkTaylor - 08 Jun 2011
  • UWS 2.1.11 discusses how parameters of an existing job can be updated, and says that it's up to the implementation to define what is permitted. As far as I can see this is not really done by TAP, though some examples in the Informative section 5 provide suggestions. It should be clarified. -- MarkTaylor - 24 Jun 2011
  • Should the table metadata (from /tables endpoint and TAP_SCHEMA tables) include metadata about the TAP_SCHEMA tables themselves? Should be made explicit in the TAP standard. -- MarkTaylor - 28 Jun 2011
    • Agreed; since 2.6, second paragraph, says they should be in TAP_SCHEMA, I'd venture it's pretty much implied they should be in /tables, too. -- MarkusDemleitner - 29 Jun 2011
  • Is BOOLEAN a legal TAPType? VODataService sec 3.5.3 says yes, but TAP sec 2.5 says no. Probably the answer is no, but this should be clarified (see this mail) -- MarkTaylor - 13 Jul 2011
  • The wording in TAP section 2.9 is somewhat inconsistent about the format of VOTable error documents. Section 2.9 says "The VOTable must contain a RESOURCE element identified with the attribute type='results', containing a single TABLE element with the results of the query.", and Section 2.9.1 says "The RESOURCE element must contain, before the TABLE element, ...". However, it's clear that this section is discussing both successful and error outputs, and in the case of an error no TABLE element will normally be present, only one or more INFOs. The intention is clear from the fourth example in sec 2.9.1, but it should be reworded. -- MarkTaylor - 21 Jul 2011
  • There should be some language on what to do with oversized uploads; in the inline case, the server probably should send back a 413 status and just close the connection (which, for common client libraries, will just raise a connection reset exception or so, but there's nothing we can do about this as far as I know) --MD
>
>
  • Can we come up with a lightweight way of allowing some sort of (insecure) authentication ("don't publish my queries") while keeping TAP results available for uploads to other servers? --MD
  • UPLOAD parameter spec needs some clarifications. --MD
    • Are quoted identifiers allowed as table names? (in DaCHS, they are not)
    • What should happen if a URL or table name contains a comma or semicolon? (in DaCHS, they are effectively forbidden in both table names and in URLs, since there is no way to escape them)
    • When people re-post an UPLOAD parameter, should uploads be added or replaced? (in DaCHS, they are added)
  • xtype=adql:REGION on upload: such columns will usually result in polygons, at least when implementing against pgsphere --MD
  • Require a filename header on inline uploads? (this would make it easy to tell them from "regular" parameters without having to parse all UPLOAD parameters first) --MD
  • One of the columns in the TAP_SCHEMA.columns table is named "size". This is an ADQL reserved word, which is unfortunate. Can be got round by quoting the column name in ADQL, but it's a gotcha which might be worth mentioning. -- MarkTaylor - 08 Jun 2011
  • UWS 2.1.11 discusses how parameters of an existing job can be updated, and says that it's up to the implementation to define what is permitted. As far as I can see this is not really done by TAP, though some examples in the Informative section 5 provide suggestions. It should be clarified. -- MarkTaylor - 24 Jun 2011
  • Should the table metadata (from /tables endpoint and TAP_SCHEMA tables) include metadata about the TAP_SCHEMA tables themselves? Should be made explicit in the TAP standard. -- MarkTaylor - 28 Jun 2011
    • Agreed; since 2.6, second paragraph, says they should be in TAP_SCHEMA, I'd venture it's pretty much implied they should be in /tables, too. -- MarkusDemleitner - 29 Jun 2011
  • Is BOOLEAN a legal TAPType? VODataService sec 3.5.3 says yes, but TAP sec 2.5 says no. Probably the answer is no, but this should be clarified (see this mail) -- MarkTaylor - 13 Jul 2011
  • The wording in TAP section 2.9 is somewhat inconsistent about the format of VOTable error documents. Section 2.9 says "The VOTable must contain a RESOURCE element identified with the attribute type='results', containing a single TABLE element with the results of the query.", and Section 2.9.1 says "The RESOURCE element must contain, before the TABLE element, ...". However, it's clear that this section is discussing both successful and error outputs, and in the case of an error no TABLE element will normally be present, only one or more INFOs. The intention is clear from the fourth example in sec 2.9.1, but it should be reworded. -- MarkTaylor - 21 Jul 2011
  • There should be some language on what to do with oversized uploads; in the inline case, the server probably should send back a 413 status and just close the connection (which, for common client libraries, will just raise a connection reset exception or so, but there's nothing we can do about this as far as I know) --MD
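
To make the comma/semicolon question on UPLOAD concrete, here is a minimal sketch of a naive parser, assuming the parameter value is a semicolon-separated list of `table_name,URI` pairs. The function name `parse_upload` is hypothetical, and the strict rejection shown is just one possible policy (it resembles the "effectively forbidden" behaviour described for DaCHS above, but is not taken from DaCHS code).

```python
def parse_upload(value):
    """Split an UPLOAD value into (table_name, uri) pairs.

    Assumption: pairs are separated by ';' and name and URI by ','.
    Since there is no escaping mechanism, any extra ',' or ';' inside a
    table name or URI makes a segment ambiguous, so it is rejected here.
    """
    uploads = []
    for segment in value.split(";"):
        parts = segment.split(",")
        if len(parts) != 2 or not parts[0] or not parts[1]:
            raise ValueError(f"cannot parse UPLOAD segment: {segment!r}")
        uploads.append((parts[0], parts[1]))
    return uploads
```

A URI such as `http://example.org/a?x=1,2` cannot be passed through this parser, which is exactly the gap in the spec that the point above raises.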
 
Deleted:
<
<
 

ADQL

Changed:
<
<
  • The spec omits language that says <separator> (and thus comments) is what actually separates tokens. Thus, a naive implementation of the grammar only allows comments between parts of split-up string literals. The spec needs to be improved, but meanwhile saying "<separator> is this grammar's token separator" or so should do. --MD
  • Decaying INTERSECTS with point arguments to CONTAINS is a major implementation effort without much benefit. Can we please just deprecate it? --MD
  • Can we recommend a simple positional crossmatch function like crossmatch(ra1, dec1, ra2, dec2, radius), all in degrees? People use that a lot, and asking them to write that CONTAINS mess all the time is not nice --MD
>
>
  • The spec omits language that says <separator> (and thus comments) is what actually separates tokens. Thus, a naive implementation of the grammar only allows comments between parts of split-up string literals. The spec needs to be improved, but meanwhile saying "<separator> is this grammar's token separator" or so should do. --MD
  • Decaying INTERSECTS with point arguments to CONTAINS is a major implementation effort without much benefit. Can we please just deprecate it? --MD
  • Can we recommend a simple positional crossmatch function like crossmatch(ra1, dec1, ra2, dec2, radius), all in degrees? People use that a lot, and asking them to write that CONTAINS mess all the time is not nice --MD
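
As an illustration of what such a crossmatch shorthand would stand for, the sketch below expands the five arguments into the CONTAINS/POINT/CIRCLE expression that users currently have to write by hand. The Python helper `crossmatch` is hypothetical, and the 'ICRS' coordinate system string is an assumption; the proposal above only says that all arguments are in degrees.

```python
def crossmatch(ra1, dec1, ra2, dec2, radius):
    """Build the ADQL condition a crossmatch(...) call could expand to.

    Arguments are ADQL column references or numeric literals, in degrees.
    Assumption: both positions are given in ICRS.
    """
    return (
        f"1 = CONTAINS(POINT('ICRS', {ra1}, {dec1}), "
        f"CIRCLE('ICRS', {ra2}, {dec2}, {radius}))"
    )


# Example use inside a join condition (table and column names are made up).
query = (
    "SELECT a.id, b.id FROM cat_a AS a JOIN cat_b AS b ON "
    + crossmatch("a.ra", "a.dec", "b.ra", "b.dec", 0.001)
)
```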
 

VOTable issues

Deleted:
<
<
[Started from E-mail by Tom McGlynn] See VOTableIssues
 
Added:
>
>
[Started from E-mail by Tom McGlynn] See VOTableIssues
 

ObsTAP

Changed:
<
<
  • s_region has units "deg" in Table 4, but is unitless in Tables 1, 5 and 6. Unitless is correct, I think. -- MarkTaylor - 28 Nov 2011
  • Some items are listed as "float" and others as "double" in Table 1. They are all "double" or "adql:DOUBLE" in Tables 4, 5 and 6. Is there a difference? -- MarkTaylor - 28 Nov 2011
>
>
  • s_region has units "deg" in Table 4, but is unitless in Tables 1, 5 and 6. Unitless is correct, I think. -- MarkTaylor - 28 Nov 2011
  • Some items are listed as "float" and others as "double" in Table 1. They are all "double" or "adql:DOUBLE" in Tables 4, 5 and 6. Is there a difference? -- MarkTaylor - 28 Nov 2011
 

Table metadata scalability

Changed:
<
<
There is a scalability issue for the table metadata document (/tables endpoint) of large databases. The XML description is currently about 0.4 MB for GAVO, 5 MB for HEASARC, and a predicted 80 MB for VizieR (see pages 5-6 of this presentation by Gilles Landais). An interactive TAP client will typically want to acquire table metadata from the service before offering the user options on which tables are available. An 80 MB download is too much. The other option as it stands is doing a TAP_SCHEMA query (e.g. SELECT table_name FROM TAP_SCHEMA.tables - ~0.5 MB for VizieR?), and acquiring column info in a similar way when the user has chosen a table. That's OK, but since it involves actual TAP queries, services may queue the query and delay before responding (can TAP service implementors comment on whether that's a legitimate concern?), while a flat file access from the tables endpoint can be expected to be served immediately. So, an extension/alternative to the existing tables endpoint format might be a good idea, maybe a practical necessity when VizieR TAP arrives. Gilles' talk quoted above suggests one way to do this, but other variations on the idea of storing metadata for list-of-tables and columns-per-table as separate static documents are possible.
>
>
There is a scalability issue for the table metadata document (/tables endpoint) of large databases. The XML description is currently about 0.4 MB for GAVO, 5 MB for HEASARC, and a predicted 80 MB for VizieR (see pages 5-6 of this presentation by Gilles Landais). An interactive TAP client will typically want to acquire table metadata from the service before offering the user options on which tables are available. An 80 MB download is too much. The other option as it stands is doing a TAP_SCHEMA query (e.g. SELECT table_name FROM TAP_SCHEMA.tables - ~0.5 MB for VizieR?), and acquiring column info in a similar way when the user has chosen a table. That's OK, but since it involves actual TAP queries, services may queue the query and delay before responding (can TAP service implementors comment on whether that's a legitimate concern?), while a flat file access from the tables endpoint can be expected to be served immediately. So, an extension/alternative to the existing tables endpoint format might be a good idea, maybe a practical necessity when VizieR TAP arrives. Gilles' talk quoted above suggests one way to do this, but other variations on the idea of storing metadata for list-of-tables and columns-per-table as separate static documents are possible. -- MarkTaylor - 11 May 2012
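
The two-step TAP_SCHEMA approach described above could look roughly like the sketch below, which only builds the request URLs and does not contact any service. The helper names, the base URL, and the choice of the synchronous endpoint are assumptions made for illustration; the column list is a subset of TAP_SCHEMA.columns, and "size" is quoted because it is an ADQL reserved word, as noted in the TAP section.

```python
from urllib.parse import urlencode

# Step 1: fetch only the table names instead of the full /tables document.
TABLES_QUERY = "SELECT table_name FROM TAP_SCHEMA.tables"


def columns_query(table_name):
    """Step 2: column metadata for the one table the user has chosen."""
    escaped = table_name.replace("'", "''")  # double single quotes for ADQL
    return (
        'SELECT column_name, datatype, "size", unit, description '
        "FROM TAP_SCHEMA.columns "
        f"WHERE table_name = '{escaped}'"
    )


def sync_url(base_url, adql):
    """Build a synchronous query URL (base_url is a hypothetical service)."""
    params = {"REQUEST": "doQuery", "LANG": "ADQL", "QUERY": adql}
    return base_url.rstrip("/") + "/sync?" + urlencode(params)
```

Whether such queries are answered as quickly as a static /tables document depends on how the service schedules them, which is the open question raised above.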
Deleted:
<
<
-- MarkTaylor - 11 May 2012
 