Recommended Edits to ADQL V2.0 PR: Geometric Function Semantics

This document provides suggested edits that address Ray Plante's RFC Comment, which called for more explicit enumeration of the types and meanings associated with geometrical functions.

(VOQL chair Editing: VOQL Chair has added the late comments from Ray at the end of this page, rather than at the -already closed- RFC pages to avoid confusion):

In general, the aim of these suggestions is to provide a crisp definition of each function at the beginning of its section. In many of the current function descriptions, the type and meaning are given by example, which is not a good practice and can lead to ambiguities. In contrast, unambiguous statements up front will make the document a better reference for implementers, particularly when "looking up" a specific function.

Note that bold within the suggested text below indicates new or changed words.

2.4.2 AREA

Change the first sentence to:

This function computes the area, in square degrees, of the region given by the function's only argument.

Prepend the 2nd paragraph with the line:

The argument can be represented with one of the region functions, BOX, CIRCLE, POLYGON, or REGION.
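
As an illustration of the square-degree convention, the sketch below computes the area of a circular region. It assumes the area is measured on the unit sphere, which the suggested text does not state, and the name `circle_area_sq_deg` is hypothetical and not part of ADQL.

```python
import math

def circle_area_sq_deg(radius_deg: float) -> float:
    """Area, in square degrees, of a spherical cap with the given angular radius.

    Assumption: the region lies on the unit sphere, so the cap covers
    2*pi*(1 - cos(r)) steradians.
    """
    if not 0.0 <= radius_deg <= 180.0:
        raise ValueError("radius must be between 0 and 180 degrees")
    steradians = 2.0 * math.pi * (1.0 - math.cos(math.radians(radius_deg)))
    # One steradian equals (180/pi)^2 square degrees.
    return steradians * (180.0 / math.pi) ** 2
```

Under this assumption a radius of 180 degrees covers the whole sky (about 41253 square degrees), and for small radii the result approaches the flat value pi*r^2.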

2.4.3 BOX

Change the 2nd sentence to:

A box is a special case of Polygon, defined purely for convenience, and it corresponds in meaning to the STC-S "Box" subphrase [4].

The second paragraph sufficiently describes the arguments.

2.4.4 CENTROID

Change the first paragraph to:

This function computes the centroid of the region given by the function's only argument and returns a POINT (See 2.4.12).

Prepend the 2nd paragraph with the line:

The argument can be represented with one of the region functions, BOX, CIRCLE, POLYGON, or REGION.

2.4.5 CIRCLE

Change the first paragraph's first sentence to:

This function expresses a circular region on the sky (a cone in space) and corresponds in meaning to the "Circle" STC-S subphrase [4].

The rest of the paragraph sufficiently describes the arguments.

2.4.6 CONTAINS

The return type is given in the 2nd paragraph after the first example, and arguments are defined in the last paragraph after the examples. It would be better to put a crisper definition up front.

Append to the first paragraph:

The first argument is a point or region value representing the contained geometry, and the second argument is a region value representing the containing region. The function returns 1 (meaning true) if the contained geometry is entirely within the boundary of the containing region and 0 (meaning false) otherwise. When the first argument is a point, it is considered inside the containing region if it lies on the containing region's border.

Using the following text, move the contents of the last paragraph to a new paragraph after the first one:

Either argument can be given by the appropriate functions (the region functions--BOX, CIRCLE, POLYGON, or REGION--for the second argument, and the region functions or POINT for the first argument) or by a single column name or alias. When a column name or alias is provided, the value in the column or alias must be interpreted as the appropriate value type. Since the two argument geometries may be expressed in different coordinate systems, the function is responsible for converting one (or both). If either argument cannot be converted to the proper geometry in a usable coordinate system, the function should throw an error message (as defined by the service making use of ADQL).

Drop the last paragraph.
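
The CONTAINS semantics proposed above can be sketched for the simplest case, a point tested against a circle. This is only an illustration under stated assumptions: both geometries are taken to be in the same coordinate system already (so no conversion is shown), positions are (longitude, latitude) pairs in degrees, and the function names are hypothetical.

```python
import math

def angular_separation_deg(lon1, lat1, lon2, lat2):
    """Great-circle separation in degrees (haversine formula)."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2)
    # Clamp to guard against rounding slightly above 1 for antipodal points.
    return math.degrees(2.0 * math.asin(math.sqrt(min(1.0, a))))

def contains_point_in_circle(point, circle):
    """Return 1 if the point is inside or on the border of the circle, else 0.

    point  = (lon_deg, lat_deg)
    circle = (center_lon_deg, center_lat_deg, radius_deg)
    Assumption: both are already in the same coordinate system.
    """
    sep = angular_separation_deg(point[0], point[1], circle[0], circle[1])
    # "<=" makes a point on the border count as contained.
    return 1 if sep <= circle[2] else 0
```

The inclusive comparison reflects the border rule in the suggested text. An actual implementation would also need the coordinate conversion and error handling described above, and probably a numerical tolerance for points very close to the border.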

2.4.7 COORD1

This function returns the first coordinate value, in degrees, of a position given by the first argument. The argument may be given using the POINT function (See 2.4.12) or a column reference.

2.4.8 COORD2

This function returns the second coordinate value, in degrees, of a position given by the first argument. The argument may be given using the POINT function (See 2.4.12) or a column reference.

2.4.9 COORDSYS

This function returns the coordinate system string value from the geometry given by the first argument. The argument value may be given as a geometry data type function (POINT, BOX, CIRCLE, POLYGON, or REGION) or as a column reference that can be interpreted as a geometry.
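
The three accessor definitions above (COORD1, COORD2, COORDSYS) can be pictured with a small sketch. The `Point` type here is a hypothetical stand-in for an ADQL POINT value and only covers the POINT case of COORDSYS; it is not how any particular service represents geometries.

```python
from typing import NamedTuple

class Point(NamedTuple):
    """Hypothetical stand-in for an ADQL POINT value."""
    coordsys: str   # coordinate system string, e.g. "ICRS"
    coord1: float   # first coordinate, in degrees
    coord2: float   # second coordinate, in degrees

def COORD1(p: Point) -> float:
    """First coordinate value of the position, in degrees."""
    return p.coord1

def COORD2(p: Point) -> float:
    """Second coordinate value of the position, in degrees."""
    return p.coord2

def COORDSYS(p: Point) -> str:
    """Coordinate system string of the geometry."""
    return p.coordsys
```

For example, with `p = Point("ICRS", 25.0, -19.5)`, `COORD1(p)` gives 25.0, `COORD2(p)` gives -19.5, and `COORDSYS(p)` gives "ICRS".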

2.4.10 DISTANCE

This is sufficiently described.

2.4.11 INTERSECTS

See the suggestions for CONTAINS.

2.4.12 POINT

This is sufficiently explicit.

2.4.13 POLYGON

Insert the following into the second paragraph as the second sentence:

This function corresponds in meaning to the "Polygon" STC-S subphrase [4].

The explanation of the arguments is sufficient.

2.4.14 REGION

This is sufficiently described.

VOQL chair Editing: VOQL Chair has added the late comments from Ray at this page, rather than at the -already closed- RFC pages to avoid confusion:

Late comments by RayPlante - 21 Oct 2008

(Moved from RFC page)

I recognize that these comments come after the official RFC, so I don't expect them to be answered by the authors. I hope some benefit could be gotten from at least the simpler items.

Section 1

  • Grammar: 3rd paragraph, 3rd sentence: I think you want to say "Similar to SQL, ..."

  • I'm disturbed by the statement, "...this specification defines syntactical correctness only." For one, this is not true, since much of the semantics of the functions are described (e.g. Table 1). Second, I think we want to be a little stronger about the semantics. Do we really want "SELECT" to mean completely different things in different applications? See comments below about how to tighten this up. For here, I would prefer a statement that says something like:
This document provides the general semantics for the language elements; where these semantics are ambiguous, the specification of the service or application using ADQL should clarify how the elements should be applied.

Section 2.1.2

  • "We can extend the list..." is not really appropriate for a spec. I suggest "The list includes SQL92 reserved keywords, keywords useful for astronomical purposes, and a subset of vendor-specific languages (e.g. TOP):"

Section 2.2

  • I really don't like seeing TOP included in the syntax for 2 reasons:
    • it is not SQL92; therefore, it forces some parsing of the query by front-ends to otherwise SQL92-compliant databases that do not support TOP,
    • it is ill-defined: there's nothing that defines what the "first n-rows" are. It guarantees no repeatability and no ability to page beyond the TOP rows.

Once upon a time (back in Moscow), I recommended that TOP should be relegated to a separate argument of the service or application (e.g. TAP). In this way, ADQL remains closer to SQL92, it would be easier for services (like TAP) to refine its meaning, and implementations could determine whether to fold it into the native SQL call or handle it outside (after an appropriate sorting of the results).
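
To make the "handle it outside" option concrete, here is a minimal sketch of a service applying a row limit after sorting the results itself. It assumes the limit arrives as a separate service-level parameter rather than as part of the query; the function name `apply_top` and the row representation (a list of dicts) are hypothetical.

```python
from operator import itemgetter

def apply_top(rows, n, sort_columns):
    """Return the first n rows after sorting on sort_columns.

    rows         -- list of dicts, one per result row
    n            -- maximum number of rows to return
    sort_columns -- column names that define a total, repeatable order
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    # Sorting first is what makes "the first n rows" well defined.
    ordered = sorted(rows, key=itemgetter(*sort_columns))
    return ordered[:n]
```

Because the order is fixed before truncation, repeated calls return the same rows, and paging beyond the first n rows becomes a matter of slicing a later range of the same ordering.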

  • Following up on my comment for section 1, I feel we need to be a bit more definitive regarding the semantics. I recommend inserting the following line into the paragraph after the syntax summary (beginning "The SELECT statement defines a query..."):
The query should be interpreted much like an SQL92 query: where ADQL and SQL92 keywords are identical, the ADQL keywords and their operands should be interpreted in the same way as defined in SQL92.

Section 2.4

  • All functions should explain the types and meanings of the arguments and the returned value, just as is done in Table 1. Some of the function definitions do this sufficiently well, but others seem to provide this via examples, which is not a good practice and can lead to ambiguities. For simple changes to rectify this, consult my recommendations listed here.


Topic revision: r4 - 2008-10-22 - RayPlante
 