ADQL 2.1 Proposed Recommendation: Request for Comments



ADQL defines an SQL-like grammar adapted for astronomical purpose. It is especially used by the TAP (Table Access Protocol) standard.

Latest version of ADQL can be found at:

Reference Interoperable Implementations

Implementations Validators

Two ADQL parsing validators available in the GitHub repository named Lyonetia:

Both validators work offline and with no assumption on a specific database schema.

A GitHub CI workflow is set up in this repository. It validates all test queries available in the src/adql/ivoa directory thanks to the ADQL validator based on VOLLT.

Two badges are visible on top of the README.md document. The first one (named "Validator test") indicates whether the ADQL validator based on VOLLT compiled successfully or not. The second one (named "ADQL Validation) indicates whether all test queries passed or not. An HTML report is visible by clicking on this second badge (after clicking on the badge, click on the "Master" page to see it). It shows which test file run and which tests failed.



Comments from the IVOA Community during RFC/TCG review period: 25-Apr-2023 - 6-Jun-2023

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document

  • TheresaDower (as an implementer of parts of the ADQL 2.1 standard client and server side on odd architecture), 2023-04-27
    I am so happy to see this moving forward!
    • I see no show-stoppers in new features for the MAST/STScI frequently annoying special case MSSQL server back end.
    • I have already, through the vollt adql library reference implementation and my tweaks to a MAST/STScI fork, added some of the features. This includes COALESCE and ILIKE and some updates to math functionality, mostly for the sake of canned queries PyVO expects for RegTAP 1.1. This has gone well so far. Thank you for the work on this reference implementation!
    • I think deprecating the coordinate system arguments is ultimately good; it removes unforced errors from expectations of servers quietly converting things in the back end, which it seems no one is actually doing (we're ignoring it and defaulting to ICRS)
    • ... that said, given the way a couple PyVO client expectations forcing server support have already caught us by surprise with RegTAP, I am a little nervous about the actual shuffle in operations when clients start dropping the deprecated pieces, and servers are expected to handle it. Hopefully we can improve coordination rolling out changes in the most popular clients and services that speak ADQL; we are already in a better place for global volunteer effort coordination than the past few years.
    • This is all to say... should I be pulling the whole vollt ADQL 2.1 branch for MAST TAP services soon? Is there anything outstanding in it I should wait on? Happy to block some time to work on it.
    • GM Answer:
      • First of all, thank you for all your positive comments.
      • About the optional features and how they are supported by clients and servers, both should know that supported optional features must be declared in the capabilities of the used DAL service (e.g. TAP). If an unsupported feature is used anyway, the server must raise an error in a way depending on the underlying query protocol. It is then up to the client to correctly report this error. It is also up to the client to eventually provide a way to show which features are supported, so that the use can adapt the query expression.
      • [Speaking as the VOLLT developer] The ADQL-2.1 branch contains the full ADQL-library implementation. So, if you want to use only the query parsing and manipulation library, it is all OK. However, the TAP-library is not yet fully-compatible with this new version of ADQL (especially for the features declaration). That's why this branch is still a branch and is not yet merged with master. I am currently working on completing the integration of ADQL-2.1 into TAP-lib. So, in theory, the TAP service in this branch it should work, but things like the declaration of supported optional features is not yet integrated. The best is probably to wait for this branch to be merged with the master one.
      • -- GregoryMantelet 2023-05-03



Comments from TCG member during the RFC/TCG Review Period: 25-Apr-2023 - 20-Jun-2023

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, althought their inputs are not compulsory.

TCG Chair & Vice Chair

Applications Working Group

Data Access Layer Working Group

Data Model Working Group

Grid & Web Services Working Group

Registry Working Group

Semantics Working Group

Data Curation & Preservation Interest Group

Interesting new features - thank Greg and all!

It looks good, more user-friendly, even if the migration needs important work for data-center (eg: WITH, CAST, geometrical functions, units)

Details :

  • (4.2.7) "coord_value" is not defined.
    • GM Answer: This part is an excerpt from the full BNF. So, obviously not all terms are not defined here, and should not be, otherwise it would too long inside the text. However, you're right, this paragraph is about a new function signature with new parameters, so it makes sense here to also includes the definition for <coord_value>. I just added it (see GitHub/ADQL#88). -- GregoryMantelet 2023-04-27
  • (4.2.13) is it possible to use boolean instead of values {0,1} for CONTAINS ?
    • GM Answer: Making boolean functions really returning a boolean value instead of an integer value (0 or 1) was a work initiated with this version of ADQL, but considering the important and complex grammar refactoring work it requires, it has been decided to postpone this for a next version (see GitHub/ADQL#32 for more details). I will speak about that at the May 2023 IVOA Interoperability Meeting. -- GregoryMantelet 2023-04-27
  • (4.3) can you add link to UDF functions validated by IVOA (ivo_..) ?
    • GM Answer: This link is already available in the document. It is referenced in sections 4.2.5 (COORSYS) and 4.3.1 (UDF Overview) (last words of the last paragraph ; but since it is not the first reference to this document the LaTeX template unfortunately hides the link); but I agree the PDF and HTML rendering does not make that kind of references really obvious in the text (except for the first reference occurrence) (or I am using the LaTeX template in the wrong way, which is really possible). It is also the 3rd reference in the References appendix. -- GregoryMantelet 2023-04-27
  • (4.6.1) remove item "the columns in the operands SHOULD have the same metadata, e.g. units, UCD, etc." -
    Tests should be limited to datatype - "same UCD" is clearly wrong (eg.: JOIN on primary-foregin key use meta.id;id.main and meta.id) , furthermore VOTable have not always UCD.
    • GM Answer: For exactly the reason you mention, this item is a SHOULD instead of a MUST. It is just a recommendation and not a requirement to encourage data providers to provide set operation result with as rich and consistent metadata as possible. In this item, the UCD is just an example of metadata on which a data provider can choose to add a strong constraint while performing a set operation. So, it is up to the implementation to decide whether or not this constraint should be set, and on which metadata. Although I agree this can be tricky for UCD, this item could be perfectly valid for units and possibly for any other metadata a data providers can offer. So, since this item is just a recommendation (not a requirement) and is really up to implementation, I suggest to keep this item. -- GregoryMantelet 2023-04-27
Coord syst: Coordinate system parameter deprecation has impacts.
For instance, coordinate system used in VizieR tables depends of each catalogues.
The coord syst is important in case of crossmatch (from an other table or from an UPLOAD)

If the parameter is removed, it requires;

  • TO FIND A WAY TO REPORT THE NATIVE COORDINATE SYSTEM TO USER - in TAP_SCHEMA for instance
    • GM Answer: You are perfectly right. This is however out of the scope of the ADQL standard. This information must be provided by the TAP standard. I know that a new version of TAP is in preparation and that a talk will happen on this topic at the May 2023 IVOA Interoperability Meeting. It seems to be a very good timing to suggest this addition to TAP. Personally, I propose coordinate system information on columns in the VOLLT/TAP library. This can be used as inspiration for next version of TAP. -- GregoryMantelet 2023-04-27
  • to use UDF functions to make the conversion (is it possible to add with explicit example using UDF)
    • GM Answer: It may be possible, though such functions should be defined in the Catalogue of UDF Endorsed Note. Currently, only one conversion function is defined in there: gavo_transform(from_sys, to_sys, geo) (see sec. A.1.5). Note the function prefix: gavo_. So, it is not yet an official IVOA User Defined Function ; it is just a function defined in a single service implementation. I know that other TAP implementations have implemented a such function. If at least 2 TAP services have the same function (except for the function prefix), I think we can make this function official and give it an ivoa_ prefix ; from what I know there are a GAVO and VizieR implementation for a such function. Thus, maybe it is the good moment to propose a new version of this Endorsed Note. Anyway, to come back on giving examples in the ADQL standard, I don't think it is wise to do so now ; it is too early. I prefer to wait until an ivoa_ function is officially endorsed in this note rather than suggesting to use an unstable and soon-deprecated function. Thank you Gilles for this suggestion, it will probably help updating the UDF catalogue, and I keep this idea also for the next version of ADQL. -- GregoryMantelet 2023-04-27
Notre : (4.2.16) COORDSYS function looks strange if system is removed from geometrical function - it asks a background ADQL processing

GM Answer: This function is also deprecated. Once the coord. sys. argument definitely removed from all geometrical functions, the COORDSYS function will also be removed. -- GregoryMantelet 2023-04-27

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

This is generally well-written. Special commendation for the careful change list in Appendix C.1. I am happy to endorse it, but have a few suggestions that should be considered by the authors.

  • I made a few uncontroversial fixes in PR#89 (now merged).
  • Sec 2.1.3 SQL reserved keywords list: The entry END-EXEC doesn't really make sense here, since it is not syntactically capable of use as an Identifier. I suggest its removal.
  • Sec 2.2.3 Search Condition could contain a reference to the IN construct alongside the others listed.
  • Sec 4.2.8 (written by me, blame attributable accordingly): this section promotes use of the DISTANCE()<r idiom for crossmatching. It does not explicitly promote its use in cone-search-like WHERE clauses as well, but I now think it should do. I suggest appending the following paragraph to the end of this section: "Similar remarks apply to cone-search-like conditions in WHERE clauses: clients are encouraged to use, and services to provide efficient implementations of, conditions of the form WHERE DISTANCE(...) < r_max_deg."
  • Sec 4.2.13, 4.2.17, 4.2.18: the sections on CONTAINS, DISTANCE and INTERSECTS all contain a paragraph at the end along the lines: "If the geometric arguments are expressed in different coordinate systems, the XXX function is responsible for converting one, or both, of the arguments into a different coordinate system. If the XXX function cannot perform the required conversion then it SHOULD throw an error. Details of the mechanism for reporting the error condition are implementation dependent." This consideration only makes sense when the coordinates are specified using their optional, and now deprecated, coordsys arguments, so they are a bit confusing for a reader not familiar with the previous version of the standard. I suggest that these paragraphs are either removed, or that a comment is added to the effect that they only apply where the deprecated coordsys argument has been specified.
  • Sec 4.2.20 POLYGON. This section does not explicltly say what counts as inside or outside the specified polygon. It does say that it "corresponds semantically to the STC polygon," which probably answers the question, but the relevant STC definition discusses some non-sky geometries and is quite hard to understand (at least by me). I'm not sure what's best to do here, but possibly a reference to DALI which distills the STC advice into a more readable form might be helpful, if that is the intended semantics here (STC has some additional constraints about the maximum length of a polygon side). In practice it might be best to find out what the de facto standard PgSphere implementation does and make sure that advice is provided here which matches that. Or maybe I'm overthinking this and it's fine as it is.
    • Commenting on my own comment, following AlbertoMicol's polygon presentation at the Bologna Interop - my remarks here may be misguided or wrong. It could be worth considering whether the existing text is optimal in the light of Alberto's comments, but I no longer claim to have a good idea about what the right thing to do is. -- MarkTaylor - 2023-05-10
  • Sec 4.3.1: "The ivo prefix is reserved for functions that have been defined in an IVOA specification" - this isn't quite true (assuming that "specification" refers to a Recommendation-track document) since the Catalogue of User Defined Functions referenced in the next paragraph is an Endorsed Note rather than an IVOA Specification. Could be reworded as "... have been defined in an IVOA specification or Endorsed Note". It might also be a good idea to use examples from the Catalogue of UDFs rather than RegTAP in the examples here, since that will be the more typical case.
  • Sec 4.3.2: This section discusses declaration of UDFs as TAPRegExt Language Features. It quotes BNF for UDF signature syntax from TAPRegExt, including the terminal <type_name>. TAPRegExt 1.0 [or 1.1] sec 2.3 says "The type_name nonterminal is not defined by the ADQL grammar [in version 2.0]. For the purposes of TAPRegExt, it is sufficient to assume it expands to some sort of SQL type specifier." Since ADQL 2.1 now has introduced a type system, it would be appropriate to say something about type_name in this ADQL section beyond implicitly deferring to TAPRegExt (which refers to an earlier version of ADQL). I suggest something like "The <type_name> SHOULD be one of the terms defined in Section 3."
  • Sec 4.6.1, 4.6.2, 4.6.3: These sections are quite repetitious.
    • They each contain a list of MUST/SHOULD criteria which are almost, but not quite, the same as each other (I think by oversight rather than design). But in fact I don't think the lists quite make sense in any of these cases. The first two items (number/type equivalence of columns) are indeed MUST criteria for the operations, but the other two are not. I suggest retaining the first two as MUST criteria as at present, and replace the metadata advice with "In addition the operands SHOULD have the same metadata e.g. units, UCD etc. The metadata for the results SHOULD (or MUST?) in any case be generated from the left-hand operand."
    • They each end with a warning that no clever unit manipulation will be done, ending with the displayed formula "2m = 2km." In my opinion the whole paragraph is kind of obvious and could be omitted; but at least I suggest to remove the (obviously untrue!) sentence "2m = 2km", which doesn't really add anything.
    • It might be worth pulling out both these items and saying them once near the start of sec 4.6 rather than repeating them in each of the subsections 4.6.1, 4.6.2 and 4.6.3.
  • I find the presentation style for ADQL examples in which each parameter is written on a different line, leading in most cases to examples extending over many lines when they could be fit on one, a bit annoying. But that's just my personal preference, if the authors prefer it as it stands that's fine.

-- MarkTaylor - 2023-05-07

Radio Astronomy Interest Group

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 6-Jun-2023 - 20-Jun-2023

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group Yes No Abstain Comments
TCG        
Apps        
DAL        
DM        
GWS        
Registry        
Semantics        
DCP        
Edu        
KDIG        
Ops *      
Radio        
SSIG        
Theory        
TD        
<nop>StdProc        



<!--

Set ALLOWTOPICRENAME =<span class="WYSIWYG_PROTECTED"> TWikiAdminGroup </span>

-->

Edit | Attach | Watch | Print version | History: r24 < r23 < r22 < r21 < r20 | Backlinks | Raw View | Raw edit | More topic actions
Topic revision: r24 - 2023-05-10 - MarkTaylor
 
This site is powered by the TWiki collaboration platform Powered by Perl This site is powered by the TWiki collaboration platformCopyright © 2008-2023 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback