A proposal for vector math in ADQL

With tables containing massive amounts of vectors becoming common (e.g., the collections of low-resolution spectra within Gaia DR3 or the Digitised Byurakan Surveys), giving TAP users a toolset to do server-side work with arrays becomes highly desirable and will significantly enhance the power of ADQL to do server-side analyses. This is an attempt to provide a baseline feature set for that.

TAP servers supporting this should declare that by defining a language feature. While no IVOA specification exists for array operations, use the VECTORMATH key from GAVO's ADQL extensions standards record, like this:

<languageFeatures type="ivo://org.gavo.dc/std/exts#extra-adql-keywords">
  <feature>
    <form>VECTORMATH</form>
    <description>You can compute with vectors here. See https://wiki.ivoa.net/twiki/bin/view/IVOA/ADQLVectorMath for an overview of the functions and operators available.</description>
  </feature>
</languageFeatures>

Element Access

To access an element of a vector, write [element-index], where element-index is an integer-valued expression. In keeping with common SQL practices (and regrettably working against most programming languages), indexes in ADQL are 1-based (rather than 0-based). That is, the first element of an array with N elements has the index 1 and the last element has the index N.
Again in keeping with common SQL practices, accessing elements outside of that range gives NULL.
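The indexing rule can be modelled in a few lines of Python. This is only an illustrative sketch of the semantics described above, not server code: the function name element_at is hypothetical, and Python's None stands in for SQL NULL.

```python
from typing import Optional, Sequence


def element_at(vec: Sequence[float], index: int) -> Optional[float]:
    """Model of vec[index] with 1-based indexing; None stands in for NULL."""
    if 1 <= index <= len(vec):
        # Shift by one because Python sequences are 0-based.
        return vec[index - 1]
    # Any index outside 1..N yields NULL rather than an error.
    return None


flux = [0.5, 1.5, 2.5]
print(element_at(flux, 1))  # first element
print(element_at(flux, 3))  # last element (N = 3)
print(element_at(flux, 0))  # outside 1..N
print(element_at(flux, 4))  # outside 1..N
```

Note that index 0, which would be the first element in most programming languages, falls outside the valid range in this model.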
Basic Math
Vector computations

* arr_scalprod(vec1,vec2) is the scalar product of two vectors. Where vec1 and vec2 have unequal length, the shorter vector is padded with NaNs to the length of the longer vector. That is, the scalar product of vectors of unequal length is NaN.
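The padding rule for arr_scalprod can be illustrated with a small Python model. This is a sketch of the behaviour described above under the assumption that padding happens before the products are summed; the name scalprod_model is hypothetical and this is not the DaCHS implementation.

```python
import math
from itertools import zip_longest
from typing import Sequence


def scalprod_model(vec1: Sequence[float], vec2: Sequence[float]) -> float:
    """Scalar product where the shorter vector is padded with NaN."""
    total = 0.0
    # zip_longest fills the missing tail of the shorter vector with NaN,
    # so any length mismatch propagates NaN into the sum.
    for a, b in zip_longest(vec1, vec2, fillvalue=math.nan):
        total += a * b
    return total


print(scalprod_model([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # equal lengths
print(scalprod_model([1.0, 2.0], [1.0, 2.0, 3.0]))       # unequal lengths
```

Because NaN propagates through addition and multiplication, no explicit length check is needed in this model to obtain NaN for vectors of unequal length.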
Array Aggregation

These are functions that work like SQL aggregate functions, just on the elements of arrays. These ought to return the types of the elements of the argument (real, double precision, integers).
Aggregate Functions for Arrays

The following standard ADQL aggregate functions, applied to arrays, work component-wise:
When aggregates are computed over arrays of different lengths, the result is undefined for now. [Options would be erroring out, extending with NaN (i.e., extended items are NaN), or extending with NULL (i.e., extended items are ignored). Postgres chooses the third option for its MIN and MAX, and it is the most straightforward to implement, so it is also what DaCHS does. But it is not necessarily a good idea.]
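To make the third option concrete, here is a Python sketch of a component-wise MAX in which shorter arrays are extended with NULL and the extended items are ignored. It only models the behaviour the note attributes to Postgres and DaCHS; the proposal itself leaves the result undefined, the function name max_componentwise is hypothetical, and None stands in for NULL.

```python
from itertools import zip_longest
from typing import List, Optional, Sequence


def max_componentwise(arrays: Sequence[Sequence[float]]) -> List[Optional[float]]:
    """Component-wise MAX; shorter arrays are extended with None (NULL)."""
    result: List[Optional[float]] = []
    # Each 'column' holds the i-th component of every input array,
    # with None wherever an array is too short to have that component.
    for column in zip_longest(*arrays, fillvalue=None):
        present = [x for x in column if x is not None]
        # Ignore the NULL extensions; stay NULL if nothing is left.
        result.append(max(present) if present else None)
    return result


rows = [[1.0, 5.0], [3.0, 2.0, 7.0], [2.0]]
print(max_componentwise(rows))
```

In this model the result is as long as the longest input array, and the trailing components are computed from fewer rows than the leading ones, which is one reason this option may not be a good idea.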
Implementation Status

The SQL part of an implementation of this in PostgreSQL is in the DaCHS //adql RD, in the create_array_operator script. The functionality can be tried out at the TAP service at http://dc.g-vo.org/tap. Suitable tables (i.e., with vector-like data) include sdssdr16.main, gaia.dr2epochflux, onebigb.ssa, or dfbsspec.spectra.