DAL Future - discussion page
This page is meant to gather opinions, feedback and proposals for the future of DAL (see the Trieste 2016 Fall Interop presentation). The underlying question is what the main evolution of the DAL protocols (SIA, SODA, TAP, ADQL, DataLink, etc.) will be in the near future. The main drivers are:
Contributions can take the usual forms: pages linked from the topics listed below (or from added ones), and mailing list follow-ups on the topics. (Please, when creating a page to link from a topic here, use a DALFutureTopic TWiki name for the new page.) -- FrancoisBonnarel - 2020-05-08
After a while, the discussions here have been specialized in various pages:
* DataLink -> here
* SIA2 -> here
* SODA -> here
* SCS -> here
* TimeSeries: discovery and access -> here
Data Type
Add Time Domain (HIGHEST scientific priority: see CSP Time Series use cases)
Experience from SVO, high-energy groups (XMM archive, SVOM project), CDS VizieR, planetary science ESAC, CTU Prague, GAVO, etc.
Metadata needed for discovery:
* Spatial coordinate system
* Time coordinate system: scale, reference position, representation
* Time, spectral, space and polarisation characterisation and statistics: raw or mean position, raw bounding limits, standard deviation
* Time sampling characterisation and statistics: mean sampling step, sampling step limits, sampling step standard deviation
* Total exposure time; exposure time characterisation and statistics: mean total exposure time, mean exposure time per step, min, max and standard deviation of exposure time per step
* Characterisation on the time frequency axis:
   * Periodograms are another representation of the data.
   * We can have period(s) for periodic data or variability.
   * We can proceed to a frequency analysis and provide coefficients and frequencies.
   * Phase representation
* What are the dependent and independent quantities? Nature of the dependent quantities.
* Which mode are the data in, transient or periodic? This can be seen on the periodogram or from the target class.
* Target name, class and subclass are needed, e.g. SN, eclipsing binary, spectroscopic binary, ... This also gives a hint of the variability type. Reuse of a standard vocabulary is suggested.
* It would be nice to answer questions such as: "Do we have more observations on Wednesdays, or every day between one and two o'clock?" This is useful to track artefacts.
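As an illustration of the time sampling statistics listed above, here is a minimal Python sketch that derives them from a list of time stamps. The function name and the dictionary keys are assumptions made for this example; they are not taken from any IVOA data model.

```python
from statistics import mean, pstdev


def sampling_stats(times_mjd):
    """Summarise the time sampling of a series from its time stamps (in days).

    The key names below are hypothetical, chosen for this sketch only.
    """
    t = sorted(times_mjd)
    if len(t) < 2:
        raise ValueError("need at least two time stamps")
    # Sampling steps are the differences between consecutive time stamps.
    steps = [b - a for a, b in zip(t, t[1:])]
    return {
        "t_min": t[0],                 # raw bounding limits on the time axis
        "t_max": t[-1],
        "step_mean": mean(steps),      # mean sampling step
        "step_min": min(steps),        # sampling step limits
        "step_max": max(steps),
        "step_stddev": pstdev(steps),  # population standard deviation of the step
    }
```

A discovery service could precompute such values once per dataset and expose them as searchable columns, so that clients can select series by cadence without downloading them.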
What do the data called «TimeSeries» encompass? It is a temporal sequence of «measurement points», each containing a time coordinate and either:
* one or several flux(es), with errors, resolution, etc., or
* a derivative (mag, mag, diff etc.),
* a radial velocity (double stars, exoplanets),
* a position (solar activity).
But also: spectra, images. In the latter two cases, is it better represented as a regular cube with only one sparse axis, or as an event list?
Should we recommend a time representation for standard output (probably MJD)? Relative time for theoretical data? -- FrancoisBonnarel - 2017-05-30
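To make the notion of a «measurement point» concrete, here is a small Python sketch of one possible in-memory representation, together with a conversion to MJD and to relative time. The class and field names are hypothetical; only the relation MJD = JD - 2400000.5 is standard.

```python
from dataclasses import dataclass
from typing import List, Optional

JD_MINUS_MJD = 2400000.5  # standard offset: MJD = JD - 2400000.5


@dataclass
class MeasurementPoint:
    """One point of a time series (hypothetical structure for this sketch)."""
    time_mjd: float                # time coordinate, in MJD
    value: float                   # flux, magnitude, radial velocity, ...
    error: Optional[float] = None  # uncertainty on the value, if known
    quantity: str = "flux"         # nature of the dependent quantity


def jd_to_mjd(jd: float) -> float:
    """Convert a Julian Date to a Modified Julian Date."""
    return jd - JD_MINUS_MJD


def to_relative_time(points: List[MeasurementPoint]) -> List[float]:
    """Return the time stamps relative to the earliest point, in days."""
    if not points:
        return []
    t0 = min(p.time_mjd for p in points)
    return [p.time_mjd - t0 for p in points]
```

Spectra or images attached to each time stamp would not fit this flat structure, which is exactly the cube versus event list question raised above.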
POSSIBLE view for TS discovery and access (ALSO SEE BELOW * Driving extended functionalities)
* Extend ObsCore with a new TimeSeriesCore table. ObsTAP-TS can query both tables together.
* Extend the «SIAV2» query interface with new TimeSeries-specific query parameters, and rename «SIAV2» to DataSetDiscovery (DsDisc).
   * --> Archived time series retrieval or DataLink
   * Virtual data discovery (= TimeSeries produced on the fly) in SIAV2 = DsDisc: access.reference is a SODA URL.
* SODA extensions to TS. Besides «cutout» or time selection, add:
   * Selection on time frequencies
   * Selection in exposure times
   * Time binning
   * Frequency or phase output
* Retrieval of full metadata consistent with extended data models (VO-DML)
-- FrancoisBonnarel - 2017-05-30
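The "time binning" operation proposed above as a SODA extension could look like the following Python sketch. It is only one possible definition (fixed-width bins, mean value per bin); the function name and the choice of the bin centre as output time are assumptions for the example, not part of SODA.

```python
from collections import defaultdict
from math import floor


def bin_time_series(times, values, bin_width, t0=None):
    """Average a time series in fixed-width time bins.

    Returns a list of (bin_centre, mean_value, n_points), sorted by time.
    Empty bins are omitted. `t0` is the start of the first bin and
    defaults to the earliest time stamp.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    if not times:
        return []
    if t0 is None:
        t0 = min(times)
    bins = defaultdict(list)
    for t, v in zip(times, values):
        # Index of the bin that contains time t.
        bins[floor((t - t0) / bin_width)].append(v)
    return [
        (t0 + (k + 0.5) * bin_width, sum(vs) / len(vs), len(vs))
        for k, vs in sorted(bins.items())
    ]
```

A real service would also have to decide how to propagate errors and exposure times per bin, which this sketch leaves out.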
Software Design
Interface design
On the other side, the DataLink VOTable response doesn't contain any signature that would allow a client to recognize a posteriori that it has DataLink content. That's a pity: it would have been harmless and could simplify everything. This has a direct consequence for Aladin v10 (or any other client): although it has the code necessary to perform specific actions for DataLink, it cannot benefit from all these standardisation efforts and is limited to displaying the DataLink result as a simple VOTable without coordinates, or, still worse, in a Web browser.
The good point is that we are now close to real usage and no longer prototyping. This kind of usage is probably wider than what was initially imagined for DataLink. Aladin is close to making something of it, and there are now a couple of servers which deliver these things. It is just sad that everybody is using it in their own way, with the consequence that it remains unusable, while nearly nothing is missing to fix that.
So, I would recommend two changes:
* A simple method to identify the links and to discover what they are supposed to return
   * solution 1: a specific utype for the DataLink URL (ex: utype="Access.Reference.Datalink.1.xxx")
   * solution 2: an appropriate LINK in the FIELD definition (ex: LINK content-type="application-stream/votable;datalink" ...)
* A signature in the VOTable, for example using an INFO tag or an attribute of RESOURCE (RESOURCE type="result;datalink" ...)
These "solutions" are just examples to illustrate the idea. We have to check their validity with regard to the evolution of the standard. -- PierreFernique - 2017-03-07
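To show how little a client would need if such a signature existed, here is a Python sketch of the check. It assumes the illustrative RESOURCE type="result;datalink" marker from the comment above, plus a hypothetical INFO name="datalink" variant; neither is defined by the VOTable or DataLink standards as discussed here.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix of the form '{uri}' from a tag name."""
    return tag.rsplit("}", 1)[-1]


def looks_like_datalink(votable_xml):
    """Return True if the VOTable carries one of the hypothetical signatures.

    Checked markers (both are assumptions for this sketch):
      - a RESOURCE whose 'type' attribute lists 'datalink', e.g. "result;datalink"
      - an INFO element with name="datalink"
    """
    root = ET.fromstring(votable_xml)
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "RESOURCE":
            kinds = elem.get("type", "").lower().split(";")
            if "datalink" in kinds:
                return True
        elif name == "INFO" and elem.get("name", "").lower() == "datalink":
            return True
    return False
```

With such a test, a client could route the response to its DataLink handling code instead of displaying it as an ordinary table.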
We have a core of 4 protocols: SIAV2, ObsTAP, DataLink and SODA.
* ObsTAP is controlled via ADQL and its response is an ObsCore table.
* SIAV2 is a parameter-driven service and doesn't require a TAP infrastructure. In some respects it is a PQL ObsTAP (actually, the parameter language allows more evolution towards virtual data). Datasets can be searched by any of the spatial, time, band and polarization criteria.
* Access is managed by SODA, which currently only does cutouts on ND cubes.
* DataLink provides a gluing facility between all these protocol responses and with other services.
Older protocols: "lost" functionalities
* SSA allows discovering spectra by spatial and band positions; the response in discovery mode is standardized with the SSA response (some kind of pre-ObsCore).
* SIAV1.0 allows discovering anything with a 2D signature on space, but only the spatial axes are standardized.
* SSA and mostly SIAV1 have a "virtual data discovery mode". In that case the retrieval performs "server-side operations for data access".
Possible evolution to extend the functionalities of the new protocols and tackle the TimeSeries CSP priority
* The SIAV2 interface could allow discovery of TimeSeries and spectra with small extensions (time frequency characterisation, for example).
* Some functionalities available in SSA/SIAV1.0 have to be added to SIAV2.
   * Virtual data discovery: basically, the service arbitrates the discovery query parameters and proposes a matching SODA URL. This is especially useful for TimeSeries, where the TimeSeries is often built from the data content.
   * We could extend the parameters in ObsCore to tackle TimeSeries and spectra, add input parameters to constrain them, and add the virtual data functionality.
   * This will be both an extension of SIAV2.0 and a new overall "DataSet discovery" protocol.
* SODA: add spectra and TimeSeries functionalities.
* Provide extended metadata consistent with the NDCube DM: is this a job for SIAV2 or for SODA 1.1?
Extended metadata retrieval (any kind of full serialization of data models for a given dataproduct type) is very close to retrieval of the datasets themselves (or of excerpts/transformations of datasets). So it seems that this functionality belongs more to SODA than to SIAV2... -- FrancoisBonnarel - 2017-05-15
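The "virtual data discovery" idea above, where the service turns the discovery constraints into a matching SODA URL, can be sketched in Python as follows. ID, CIRCLE, BAND and TIME are existing SODA parameter names; the base URL, the function name and the dataset identifier in the usage note are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical SODA endpoint, used for illustration only.
SODA_BASE = "http://example.org/soda/sync"


def soda_url_for_query(dataset_id, circle=None, band=None, time=None):
    """Build a SODA access URL mirroring the discovery constraints.

    circle: (ra_deg, dec_deg, radius_deg)
    band:   (lambda_min_m, lambda_max_m), wavelengths in metres
    time:   (mjd_start, mjd_stop)
    Constraints left as None are not forwarded to SODA.
    """
    params = [("ID", dataset_id)]
    if circle is not None:
        params.append(("CIRCLE", " ".join(str(x) for x in circle)))
    if band is not None:
        params.append(("BAND", f"{band[0]} {band[1]}"))
    if time is not None:
        params.append(("TIME", f"{time[0]} {time[1]}"))
    # urlencode percent-encodes the values and writes spaces as '+'.
    return SODA_BASE + "?" + urlencode(params)
```

For example, soda_url_for_query("ds1", circle=(10.0, 20.0, 0.5), time=(57000.0, 57010.0)) yields a URL that the discovery response could return as its access reference. TimeSeries-specific parameters (binning, frequency selection) would need new parameter names that do not exist yet.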
They may not know all the details of the service parameters. DataLink service operators may not know the dataset metadata in detail. It could be useful to add to DataLink the feature that services describe themselves. -- FrancoisBonnarel - 2017-05-15
Pushing code to the data
Formats and Languages
TAP evolution
TAP & Healpix
[This part is a summary of the talk "Bringing Healpix and MOC in TAP" by G. Mantelet, presented at the IVOA Interoperability meeting in May 2017 in Shanghai.] In TAP, with a few extensions, it could be possible to get Healpix information and/or to add constraints on Healpix information. Proposed new features:
SELECT ivo_healpix_index(7, POINT('', ra, dec)) AS hpx_index, COUNT(*) AS density FROM tycho2 GROUP BY hpx_index
SELECT moc_agg(7, POINT('', ra, dec)) AS mymoc FROM tycho2 ...
SELECT * FROM tycho2 WHERE ivo_healpix_index(7, POINT('', ra, dec)) IN (12,23,68,69,70)
SELECT * FROM tycho2 WHERE 1=CONTAINS(POINT('', ra, dec), REGION('2/12-20 5/60'))
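To illustrate what the REGION constraint above implies on the server side, here is a Python sketch that parses a MOC string such as '2/12-20 5/60' and tests whether a HEALPix cell (NESTED numbering, as used by MOC) falls inside it. The function names are hypothetical, and computing the cell index from (ra, dec) requires a HEALPix library that is not shown here.

```python
def parse_moc(moc):
    """Parse an ASCII MOC such as '2/12-20 5/60'.

    Returns {order: [(first, last), ...]} with inclusive cell ranges.
    Cells may be separated by whitespace or commas.
    """
    cells = {}
    order = None
    for token in moc.replace(",", " ").split():
        if "/" in token:
            order_str, token = token.split("/", 1)
            order = int(order_str)
            cells.setdefault(order, [])
        if order is None:
            raise ValueError("cell given before any 'order/' prefix")
        if token:
            first, _, last = token.partition("-")
            cells[order].append((int(first), int(last or first)))
    return cells


def moc_contains(cells, order, index):
    """True if the NESTED cell `index` at `order` lies inside the MOC.

    In the NESTED scheme, the ancestor of a cell at a coarser order is
    obtained by dropping two bits per order level. MOC cells finer than
    `order` are ignored in this sketch.
    """
    for moc_order, ranges in cells.items():
        if moc_order > order:
            continue
        ancestor = index >> (2 * (order - moc_order))
        if any(first <= ancestor <= last for first, last in ranges):
            return True
    return False
```

With an order 7 index per row (as returned by the proposed ivo_healpix_index), the containment test reduces to a few integer shifts and comparisons, which is what makes this kind of constraint cheap to evaluate.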
SELECT t.* FROM tycho2 AS t JOIN TAP_UPLOAD.mymoc AS m ON 1=CONTAINS(POINT('', t.ra, t.dec), m.moc1)
TAP_UPLOAD.mymoc is a normal uploaded VOTable table with a column named moc1 of type 'VARCHAR' and xtype 'MOC'.
SELECT t.* FROM tycho2 AS t JOIN TAP_UPLOAD.mymoc AS m ON 1=CONTAINS(POINT('', t.ra, t.dec), m.moc)
Instead of uploading a VOTable, a FITS file would be uploaded (the TAP implementation has to allow that). The uploaded FITS file has special headers specifying that it represents neither an image nor a table, but a MOC. It should then be treated as such when used in the ADQL query.
But TAP allows only the upload of tables. So, in order to use the uploaded MOC, the TAP service has to create a table with a single cell containing the uploaded MOC (so, one cell for the entire FITS file). As for a "normal" upload, the name of the table is provided in the HTTP parameter UPLOAD, but there is no name for the column that contains the single MOC and that we need to refer to in the ADQL query. To solve this issue, we could agree on a standard name for this column: let's say "moc". So, in the above example we had UPLOAD="mymoc,param:moc.fits", which has been uploaded as the table TAP_UPLOAD.mymoc with only one row and one column named "moc". -- GregoryMantelet - 2017-05-29
Position in IVOA landscape
Merging DAL protocols, HiPS and MOC
Data Models
Updating SCS
SCS still requires VOTable 1.1 and has some other quirks that make it needlessly incompatible with the rest of the DAL landscape (in particular, it's totally DALI-incompatible). We should figure out how we can evolve it to be less odd with minimal disruption to existing services and clients (e.g., relaxing VOTable requirements, support for DALI MAXREC, RESPONSEFORMAT, metadata discovery). -- MarkusDemleitner - 2017-03-01
In SCS there are three things which are causing problems for me: 1) The UCDs that are required by the spec are rather outdated. 2) VOTable is required to be 1.0 or 1.1, which is far behind the current 1.3. 3) It requires a column with ucd="ID_MAIN". In UCD1+, this would be meta.id. We do not always have a column with that UCD, but we do have one with ucd=meta.record. So I would propose that the table must have one of meta.id or meta.record. -- WalterLandry - 2017-04-15
Please continue the SCS future discussion at the SCS-1_03-Next page, where the discussion for SCS-1.1 will take place. -- MarcoMolinaro - 2017-07-11
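As a side note on the meta.id / meta.record proposal above, the relaxed requirement is easy to check on the client or validator side. The Python sketch below takes the list of column UCDs of a response table; the function name is hypothetical and the rule it encodes is the proposal from this discussion, not the current SCS specification.

```python
def has_identifier_column(ucds):
    """True if at least one column UCD contains meta.id or meta.record.

    Each UCD1+ string may hold several words separated by ';',
    e.g. 'meta.id;meta.main'.
    """
    accepted = {"meta.id", "meta.record"}
    for ucd in ucds:
        words = {word.strip().lower() for word in ucd.split(";")}
        if words & accepted:
            return True
    return False
```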