IVOA DAL WG Running Meeting #11
Wednesday 30 June 2021 - 10:00 UTC - vconf

Participants: (13) James Dempsey, Laurent Michel, Mark Cresitello-Dittmar, Dave Morris, François Bonnarel, Marco Molinaro, Mark Taylor, Mireille Louys, Renaud Savalle, Brent Miszalski, Grégory Mantelet, Jesus Salgado, Petr Skoda

Agenda:
* SSA Minor update
* Wrapping TAP client as JDBC connector
* DataLink PR status

**SSA Minor update**

Suggestions from Vandana during the Interop are available at https://wiki.ivoa.net/internal/IVOA/InterOpMay2021DAL/Desai_IVOA_SpectralModel_2021.pdf (slide 16)

Mark CD: Updating the Spectrum DM completely would take too much time, but adding new utypes would be ok.

François Bonnarel: The utypes suggested during the Interop are meant to tag data more than metadata. Ok with Mark's proposal to update the data model. Eventually the new utypes could be in a Note.

Marco Molinaro: Suggests a simple update of the DM.

FB: Agree with Marco's suggestion.

BM: Will a GitHub repo be set up to allow issues to be raised for SSA? On this general topic, while implementing an SSA service it was not always clear which data model to use, Spectrum or ObsCore. I can't remember the specifics at the moment (and maybe in some cases some fields may not have relevant info in the models), but I could dig up some notes and add them to a repo issue.

JD: Will request a repo for SSA.

MM&FB: Longer term, expand SIA to cover spectra, cubes, timeseries etc.

BM: +1

FB: An idea was to have a protocol like Data Set SAP that would cover spectra, images, cubes, ...

BM: Use dimensionality (spectral, timeseries) to describe data, so treat all data sets as slices of an n-dimensional cube. Will formulate further and discuss in the Time Domain IG.

JS: Spectral 2 DM has work to cover upper and lower limits. Work with the DM group on this.

JD: Laurent, should this be something to work on in the DM working group?

Laurent Michel: Yes.

ML: @Brent +1 for exploring cube dimensions, and trace.
We had an attempt many years ago, with Alberto Micol. It needs a combination of dimensions and axis features, such as tuples of the shape (t, em: 1,4, flux) to characterize a time series of a 4-band light curve, for instance. I am interested to reconsider the topic if you are interested.

BM: Glad to hear you have worked on that in the past Mireille! Indeed, it is rather complex and approaching it from a generic tuple perspective sounds promising. I had thought about NAXIS1, NAXIS2, NAXIS3, …, NAXISN as in FITS, but that quickly becomes redundant for very complex datasets. So yes, very interested to develop this further, especially to try and describe time-domain datasets.

ML: Great, let's try to explore it further, in a joint DM / Time Domain discussion?

FB: SDM 2 was stopped due to the need to comply with VODML - is this decision being changed?

LM: Worried about the time scale if we have to make SDM 2 with VODML.

PS: Order is more complex for Echelle spectra - absolute order and relative order - both useful for discovery and plotting, so would like to see two utypes added. Section 4.3 - normalised vs rectified spectra - it would be good to have this defined, as they aren't the same thing. Normalisation of spectra is useful in spectral analysis of stellar and QSO spectra for a number of problems (e.g. estimation of precise RV for exoplanets, estimation of redshift) but also for machine learning. The so-called rectified spectrum is just divided by a low-order polynomial. Suggestion to update the DM in that direction. If the fitting function is encoded in SDM, the client can easily apply it on-line and plot both versions quickly on request. In machine learning, normalisation to zero mean and unit variance is used.

BM: We have some normalised GALAH DR3 spectra with our Data Central SSA service.
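To make the distinction raised by PS concrete, here is a minimal Python sketch of the two operations as described above: rectification as division by a fitted low-order polynomial, and the machine-learning style scaling to zero mean and unit variance. The function names and the toy spectrum are assumptions made for illustration. A straight line stands in for the low-order polynomial, whereas a real pipeline would typically fit only continuum points and use a higher order.

```python
from statistics import fmean, pstdev


def fit_line(x, y):
    """Closed-form least-squares fit of y = a + b*x (a degree-1 polynomial)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def rectify(wavelength, flux):
    """Divide the flux by the fitted low-order polynomial (here a line)."""
    a, b = fit_line(wavelength, flux)
    return [f / (a + b * w) for w, f in zip(wavelength, flux)]


def standardise(flux):
    """Shift and scale the flux to zero mean and unit variance."""
    mu = fmean(flux)
    sigma = pstdev(flux)
    return [(f - mu) / sigma for f in flux]


# Toy spectrum (hypothetical values): a sloped continuum with no lines.
wavelength = [4000.0, 4100.0, 4200.0, 4300.0, 4400.0]
flux = [10.0, 11.0, 12.0, 13.0, 14.0]

rectified = rectify(wavelength, flux)   # continuum slope removed
standardised = standardise(flux)        # slope kept, only shifted and scaled
```

On this toy input the rectified flux is flat, while the standardised flux still carries the continuum slope, which is why the two should not share one definition. If the fit coefficients were published with the spectrum, a client could apply `rectify` on request without refitting.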
**Wrapping TAP client as JDBC connector**

Use case from Pierre Chanial at the European Gravitational Observatory (EGO): connect the Presto SQL engine to TAP.
https://en.wikipedia.org/wiki/Presto_(SQL_query_engine)

The easiest way seems to be to wrap a TAP client as JDBC. Would this be useful for other projects too?

BM: We are keen to use a Presto JDBC for the TAP server at Data Central. Perhaps we could compare notes at some stage. We have been looking at upgrading our older TAP service (that works with Presto) to use CASDA VOTools to serve it all up, but we have not finished this (our next step is to try and enable TLS for our Presto server, and then we hope it will all just work). Email is brent.miszalski@mq.edu.au

DM: Does anybody know of such an implementation?

MT: Looked at the JDBC interface. It is quite a big one. So, not a bad idea, but it would require time and work.

DM: Implementing the full interface, yes, would be some work. The idea is more to do just what we need and then extend it when needed.

BM: Dave, maybe we can share ideas and notes about that.

**DataLink PR status**

FB: Last few things to solve before moving to an official 1.1 working draft. Decided to add a new field in the response - producttype, dataproduct_type or content qualifier. Use case: start in a non-obscore table (e.g. source catalogue) but want to link to data products; this field could be used to describe the products. The alternatives of using the existing fields, semantics or content type, would be poor fits.

Option 1: PD: Extend dataproduct_type with extra terms over the current vocabulary.

Option 2: MD: Have multiple vocabs in the one field - clients would have to determine which vocabulary a term comes from.

BM: Might we clarify the dataproduct_type issue if we have a more powerful data model to handle multi-dimensional data (as Mireille has commented)? Perhaps several pre-determined types could be concatenated together, one per dimension, e.g. time-spectrum.
Thinking out loud here - it could get quite messy, as François mentions.

http://ivoa.net/rdf/product-type is the current proposal.

DM: Is it the first time that we would have multiple vocabularies in one field? If not, it would probably be worth defining some of these vocabularies better, so that clients can distinguish which kind of vocabulary is applied. That's why XML has this notion of schema/prefix.

FB: Would it be better to add a new field specifying the type of info inside dataproduct_type, or a new field for each additional piece of metadata?

MM: Isn't EPN-TAP already using hashlists?

MT: Yes. Multiple terms can be listed in that way, possibly coming from different vocabularies.

FB: See Issue #42 (https://github.com/ivoa-std/DataLink/issues/42) in the DataLink GitHub repository.

FB: Issue 2: Templating - a request was made for the CDS team to provide a prototype, but there hasn't been time to do it - should we wait for this in 1.1 or defer it to 1.2?

JD: If there is enough in v1.1 then holding templating for 1.2 sounds good.
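As an illustration of the hashlist idea raised by MM and MT in the dataproduct_type discussion, the following Python sketch splits a `#`-separated value and reports which vocabulary each term belongs to, which is the client-side work Option 2 implies. The vocabulary names and term sets are assumptions for illustration only: the first set lists ObsCore-style dataproduct_type terms, the second is entirely hypothetical, and neither is taken from the proposed http://ivoa.net/rdf/product-type vocabulary.

```python
# Illustrative term sets only; authoritative vocabularies live under ivoa.net/rdf.
VOCABULARIES = {
    "obscore-dataproduct_type": {
        "image", "cube", "spectrum", "sed",
        "timeseries", "visibility", "event", "measurements",
    },
    # Hypothetical second vocabulary, invented for this example.
    "hypothetical-extra": {"source-catalogue", "preview"},
}


def split_hashlist(value):
    """Split a '#'-separated list into trimmed, non-empty terms."""
    return [term.strip() for term in value.split("#") if term.strip()]


def classify(value):
    """Pair each term with the sorted names of the vocabularies containing it."""
    result = []
    for term in split_hashlist(value):
        matches = sorted(
            name for name, terms in VOCABULARIES.items() if term in terms
        )
        result.append((term, matches))
    return result


for term, vocabularies in classify("spectrum#timeseries#preview"):
    print(term, vocabularies)
```

A term that matches no vocabulary comes back with an empty list, and a term present in several would come back with more than one name. That ambiguity is the case DM's point about schema/prefix is meant to avoid.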