+ Discussion of additional topics, (mostly) regarding non-mandatory fields (by JuanDeDiosSantanderVela)
  • target_class is defined as being a member of a special vocabulary, either from Simbad, NED, or defined in other IVOA vocabulary. I would either remove the ambiguity by applying a prefix, or better yet mandate that said vocabularies are to be published in SKOS form (as per the IVOA Vocabularies Note), and a reference to them added in a field target_class_vocabulary_url, or similar.
  • Shouldn't obs_creator_name be creator_id, to maintain a relationship similar to that between obs_publisher_did and publisher_id? If there is a creator other than the publisher, it should have an entry in a VO Registry.
  • obs_creator_did is not clearly defined. SSA contains a discussion of it, which should either be referenced or brought into the ObsTAP document (where I think it makes more sense). It should also be made clear that for publishers that are the same entity that created the data, obs_creator_did and obs_publisher_did can (should?) be the same.
  • bib_reference is set with UCD meta.bib, as if it only could contain an ADS bibcode, but some publications might make use of the DOI, more generic than the bibcode. Should DOIs be allowed? Then we would need a new UCD (see list of proposed UCDs to go with ObsTAP).
  • obs_release_date: if obs_release_date is NULL, data_rights (if present), must be "proprietary".
  • s_ucd is proposed to be either pos, or uv, but no UCD for uv exists. I think it should not be pos, but pos.eq (as we are mandating ICRS coordinates). Instead of UV, I propose to have one of the following:
    • pos.uv (with sub UCDs pos.uv.u and pos.uv.v)
    • pos.eq; arith.fft (adding an arith.fft UCD)
  • s_calib_status is said to be one of NOT CALIBRATED, FINE, or COARSE, while the strings for the other [te]_calib_status columns are lowercase. Shouldn't these be lowercase too?
  • t_resolution can be made NULL. A recommendation was made to set t_resolution to t_exptime for non time-resolved frames. Wouldn't it be better that NULL mean non-time resolved frames, and time-resolved ones had the corresponding t_resolution value?
  • t_calib_status has no attached enumeration. I see arguments both to support the same one as in s_calib_status (i.e. not calibrated, fine, or coarse), or to go with the spectral ones (calibrated, uncalibrated, relative, normalized). We should decide, and specify which one.
  • em_ucd takes its values from the SSA proposal, but one UCD fewer is recommended than the SSA Recommendation specifies. I think having arith.log qualifiers for em_ucd would help in the logarithmic cases, and spect.dopplerVeloc; arith.ratio can be used for c-based redshift.
  • em_res_power should be specified as freq/delta_freq when em_ucd is em.freq, and energy/delta_energy when em_ucd is em.energy.
  • o_ucd should not be allowed to be NULL, if present. If it is, for me it means the dataset corresponds to a non-well known observable, and little software could work with it, so why put it in the VO?
  • o_units should follow the units proposal (including scaling) proposed by Pedro Osuna et al. in their note on accessing Spectral data using SIAP: http://ivoa.net/Documents/Notes/SADimEq/SpectrumAccess_DimensionalEquation-20040521.pdf
  • o_units must be present if either o_detection_limit or o_stat_error is provided.
  • facility should be the same as in the VOResource metadata for registering facilities, and should also be registered with the ADS: http://vo.ads.harvard.edu/dv/facilities.txt
  • instrument should also have a place in the Registry, but possibly this discussion falls outside of ObsTAP.
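
Several of the points above amount to consistency rules on a single ObsTAP row. A minimal sketch of how such checks could look, assuming a row is held as a plain dict keyed by column name with None standing for NULL (or for an absent column). The function name check_row and the messages are hypothetical, and the rules are the ones proposed in this discussion, not adopted requirements:

```python
def check_row(row):
    """Return a list of violations of the rules proposed in this discussion.

    `row` is a dict keyed by ObsTAP column name; None stands for NULL.
    """
    problems = []

    # Proposed: a NULL obs_release_date implies data_rights == "proprietary".
    # A missing or NULL data_rights is treated here as "not present".
    if row.get("obs_release_date") is None:
        rights = row.get("data_rights")
        if rights is not None and rights != "proprietary":
            problems.append("data_rights must be 'proprietary' when obs_release_date is NULL")

    # Proposed: o_units is required once o_detection_limit or o_stat_error is given.
    needs_units = any(
        row.get(col) is not None for col in ("o_detection_limit", "o_stat_error")
    )
    if needs_units and row.get("o_units") is None:
        problems.append("o_units must be present with o_detection_limit or o_stat_error")

    # Proposed: calibration status strings are lowercase on every axis.
    for col in ("s_calib_status", "em_calib_status", "t_calib_status"):
        value = row.get(col)
        if value is not None and value != value.lower():
            problems.append(f"{col} should be lowercase, got {value!r}")

    return problems
```

Calling check_row on a row with obs_release_date NULL and data_rights set to "public" would report one problem; the same row with "proprietary" would report none.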

+ Material to help for implementation (proposed by Igor Chilingarian)

  • provide online files of Appendix C (TAP_SCHEMA tables, sql code example for tables initialisation)
  • provide online version of some queries exposed in the use-cases and obtained results

+ Some proposals for missing, or alternate UCDs (by JuanDeDiosSantanderVela)

  • access.estsize: could be phys.size; meta.file
  • s_region: could need a new one to support STC-based footprints, analogous to instr.fov; then, the UCD would be pos; instr.footprint.
  • obs_creator_name: could be a new one, meta.id.creator, similar to meta.id.PI, or meta.id.parent. Also, a primary/secondary (Q) meta.creation might be added, and meta.id; meta.creation be used.
  • o_ucd: could be meta.ucd; obs.
  • s_ucd: could be meta.ucd; pos.
  • em_ucd: could be meta.ucd; em.
  • data_rights: could be meta.curation; meta.code, or meta.code; meta.curation; however, both are Primary UCDs, which does not allow for the combination.
  • em_stat_error: could be stat.error; em.wl instead, as we will always use wavelength.
  • obs_title: could be meta.title; obs.
  • s_resolution_min: should be pos.angResolution; stat.min.
  • s_resolution_max: should be pos.angResolution; stat.max.
  • t_calib_status: could be meta.code.status; instr.calib; time.
  • facility: could be meta.id; instr.tel.
  • instrument: could be meta.id; instr.

+ Data model declaration

The way the TAPRegExt document currently is written, the text on p. 25 should be rewritten. I'd suggest the text starting with "Services that implement the ObsCore model..." should become

Services supporting the ObsCore model should advertize this fact using

    <dataModel ivo-id="ivo://ivoa.net/std/ObsCore/v1.0">ObsCore 1.0</dataModel>

as explained in the TAP registry extension document (Demleitner et al, 2011).
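
For clients, the practical use of this declaration is to test whether a service claims ObsCore support. A rough sketch of such a test, assuming the capability record has already been fetched as a string; the SAMPLE fragment is a hypothetical, heavily trimmed stand-in for a real TAPRegExt record, and declares_obscore is an illustrative name:

```python
import xml.etree.ElementTree as ET

OBSCORE_ID = "ivo://ivoa.net/std/ObsCore/v1.0"

# Hypothetical, trimmed capability fragment; a real record carries
# namespaces and many more elements.
SAMPLE = """
<capability>
  <dataModel ivo-id="ivo://ivoa.net/std/ObsCore/v1.0">ObsCore 1.0</dataModel>
</capability>
"""


def declares_obscore(xml_text):
    """Return True if any dataModel element carries the ObsCore identifier."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Strip a possible "{namespace}" prefix before comparing the tag name.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "dataModel" and elem.get("ivo-id") == OBSCORE_ID:
            return True
    return False
```

The check looks only at the ivo-id attribute and ignores the element content, in line with the content being a freetext label.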

I don't feel strongly about either standardID or ivo-id. If you have strong feelings either way, please vent them over at TAPRegExt. The same goes for the element content (which is supposed to be a freetext label possibly for human consumption).

-- MarkusDemleitner - 11 Mar 2011

+ Some proposals of changes in the document (by Igor Chilingarian 2011/03/08)

Page 14. Section 3.2, paragraph 2 (or table caption -- unclear?) add "in some cases" after "though it could be nillable"

Page 14. Table 1 Among all the listed columns, there is only one which must be unique, it is "obs_publisher_did". So, I would recommend to highlight it somehow in the table and mention that it can serve as a primary key.

Personally, I'm not happy with some column names, in particular with the fact that "spatial" starts with s_, the same letter as "spectral". I know that this has been discussed for years and some of us are already used to these column names, but still I would suggest to use the top-level UCD in column names instead of a single character, like it is done for the spectral axis ("em"). I mean: "pos_ra" "pos_dec", ...; "time_min", "time_max".

All Section 3.3. I think we should explicitly mention whether it is mandatory to fill every given DM element (i.e. if it is "nillable" in the SQL-serialization). Although this information is provided below, I think it would be important to mention this in every 3.3.x section.

Page 17. Section 3.3.3, par.2 We should again mention that only obs_publisher_did is unique.

Page 18. Section 3.3.4, par.1 "A spectrum could be represented in the VO-compliant Spectrum format or in some instrument-specific FITS binary table format". "FITS binary table" should be dropped, because as written we reject all the spectra in the "standard" IRAF-reduced format, i.e. 1D-image; we possibly also reject Euro3D and other instrument-specific file types (ESO FLAMES).

Page 21. Section 4.5. We should highlight that DID is unique and say explicitly that it should be used as a primary key.
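
To illustrate what using the DID as a primary key buys an implementer, here is a small sketch with an in-memory SQLite database. The table name obscore, the reduced column set and the example identifiers are assumptions for illustration only; a real service would use its own DBMS and the full ObsCore column list:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE obscore (
        obs_publisher_did TEXT PRIMARY KEY,  -- the only column required to be unique
        obs_collection    TEXT,
        obs_id            TEXT
    )
    """
)


def try_insert(did, collection, obs_id):
    """Insert one row; return False if the DID is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO obscore VALUES (?, ?, ?)", (did, collection, obs_id)
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical identifiers, for illustration only.
first = try_insert("ivo://example.org/data?obs-1", "EXAMPLE", "obs-1")
# The same obs_id may appear again as long as the DID differs.
second = try_insert("ivo://example.org/data?obs-1-preview", "EXAMPLE", "obs-1")
# Re-using a DID is rejected by the primary-key constraint.
duplicate = try_insert("ivo://example.org/data?obs-1", "EXAMPLE", "obs-2")
```

With the key declared, the database itself refuses a second row with the same obs_publisher_did, while repeated obs_id values remain legal.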

Page 21. Section 4.6. The last paragraph (Access URLs...) is too vague. It doesn't bring any useful information, only frightens developers. Has to be clarified.

Page 21. Section 4.8. How to put "unknown" access_estsize? As NULL? Has to be specified

Page 23. Section 4.15 The statement that "In all cases, t_exptime is generally used as an indicator of the relative sensitivity (depth) within a single data collection" is not true. In the case of multiple targeted observations (e.g. a survey of stars of certain types), the exposure time is usually selected on a case-by-case basis in order to achieve more or less the same signal-to-noise ratio for every target of interest.

Page 23. Section 4.16 "For products with no sampling along the time axis, the t_resolution could be set to the exposure time." This is inconsistent with the Characterisation DM. I recall a long discussion during the CharDM (v.1) development a few years ago regarding the "resolution" value in the case when a given "axis" contains only one point. I tried to argue for putting in some resolution RefVal (i.e. lambda/dlambda for a broadband filter), but this idea met strong objections, in particular from Alberto, so finally we decided to put "N/A" or "NULL". I am not sure whether it was the right decision, but now we must comply with what has already been done. In this case, the query to discover the time-resolved observations will be: WHERE t_resolution IS NOT NULL
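
A quick way to see the effect of the NULL convention is to run that WHERE clause against a toy table. The sketch below uses an in-memory SQLite database; the table name obscore, the reduced column set and the two sample rows are made up for illustration:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE obscore (obs_id TEXT, t_exptime REAL, t_resolution REAL)")
db.executemany(
    "INSERT INTO obscore VALUES (?, ?, ?)",
    [
        ("image-1", 600.0, None),        # single exposure, no time sampling
        ("lightcurve-1", 3600.0, 10.0),  # time-resolved product
    ],
)

# With NULL meaning "not time-resolved", discovery is a plain NULL test.
time_resolved = [
    row[0]
    for row in db.execute(
        "SELECT obs_id FROM obscore WHERE t_resolution IS NOT NULL ORDER BY obs_id"
    )
]
```

Had t_resolution been set to t_exptime for the first row, the same selection would need an extra condition such as t_resolution < t_exptime, which is harder to get right.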

Page 29. Use case 1.6 I have serious doubts that this query will work. "|" should be replaced with "abs", but "abs" does not exist for the "timestamp" datatype (although MJD can be treated as "double precision"). This is an example to discuss and check.
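
As a starting point for that check, the sketch below shows that abs() behaves as expected once times are stored as MJD in a floating-point column. It does not reproduce the query of use case 1.6; the table name, the sample rows, the target epoch and the tolerance are all assumptions, and SQLite stands in for whatever database backs the TAP service:

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Times stored as MJD in floating-point columns, not as timestamps.
db.execute("CREATE TABLE obscore (obs_id TEXT, t_min REAL, t_max REAL)")
db.executemany(
    "INSERT INTO obscore VALUES (?, ?, ?)",
    [
        ("a", 55630.0, 55630.5),
        ("b", 55640.0, 55640.25),
    ],
)

target_mjd = 55630.2   # hypothetical epoch of interest
tolerance_days = 1.0   # hypothetical matching window

# abs() on a difference of doubles replaces the "|...|" notation.
matches = [
    row[0]
    for row in db.execute(
        "SELECT obs_id FROM obscore WHERE abs(t_min - ?) < ?",
        (target_mjd, tolerance_days),
    )
]
```

Only row "a" starts within one day of the target epoch, so it is the only match.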

Pages 37-38. Section B1.2. Calibration level. The first paragraph speaks of "80%" of data collections, while the last one is more optimistic, saying "simple enough to cover all regimes". The two statements should be reconciled.

Page 39. Section B3.3. "HI cube" or "CO cube" is not a title, it's more a data type. The title would be something like "NGC224 HI cube 01"

Page 41. Section B5.2 Some examples are quite bad, in particular "VOTable". I think we should be more explicit about this. MIME types should be indicated as preferred; then VOTable will become text/xml or something similar.
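
To make the suggestion concrete, here is a sketch of how a publisher might normalise informal format labels to MIME types before filling the column. The mapping and the helper name to_mime are illustrative assumptions, not a list taken from the ObsTAP draft; application/x-votable+xml is the VOTable-specific alternative to text/xml:

```python
# Illustrative mapping from informal labels to MIME types (an assumption,
# not a list taken from the ObsTAP draft).
PREFERRED_MIME = {
    "votable": "application/x-votable+xml",
    "fits": "application/fits",
    "jpeg": "image/jpeg",
}


def to_mime(label):
    """Map an informal format label to a MIME type.

    Values that already look like MIME types are passed through unchanged;
    unknown labels yield None so they can be reviewed by hand.
    """
    if "/" in label:
        return label
    return PREFERRED_MIME.get(label.strip().lower())
```

Returning None for unknown labels, rather than guessing, keeps ambiguous entries visible to the curator.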

Page 45. Section B7.1 Example is needed. What about "Telescope" or "Facility"? To me it looks quite strange if we give the instrument name, but do not identify the telescope it is operated at.

Page 47. Table 6. Identify obs_publisher_did as a primary key, highlight all mandatory elements (by colour??)

Page 51. Section C.2 My suggestion would be to provide a tested, working set of SQL scripts to create and fill the database schema. This will make life easier for people implementing the service. I believe Mireille has already raised this point somewhere on the Twiki.


+ Points discussed by A.Richards 2011/03/09

Here are some comments - mostly minor:

4.4 Unique identifier - does this mean unique within the Collection Name?

I am not sure that a combination of real observations is a 'software' observation - I would reserve that term for simulated or model data. Back to real combinations, surely it is up to the provider whether it gets a new ID, but if the purpose of the ID is to allow tracing the observational history this seems counter-productive. Omit this para?

4.6 Does 'data product' include metadata-only responses, e.g. when the product is very large and the user should be warned or the data need staging or user registration at a web form or whatever? (the Introduction does mention 'retrieving or otherwise accessing')

4.12 ICRS only - see my first comment about the need to allow for Solar System sources.

Thanks very much


-- DougTody

> I am not sure that a combination of real observations is a 'software' observation - I would reserve that term for simulated or model data. Back to real combinations, surely it is up to the provider whether it gets a new ID, but if the purpose of the ID is to allow tracing the observational history this seems counter-productive. Omit this para?

The point was that we talk about "observation", but a composite of multiple actual observations is a more complex derived product, and we want to describe that here too. Perhaps the wording could be improved; we will revisit it in light of your comments.

> 4.6 Does 'data product' include metadata-only responses, e.g. when the product is very large and the user should be warned or the data need staging or user registration at a web form or whatever? (the Introduction does mention 'retrieving or otherwise accessing')

ObsTAP does not address these more complex access use cases. More generally, ObsTAP does not do virtual data. This is reserved to the typed interfaces such as SIAV2. ObsTAP describes only "archival" (actual) data products in an archive. These can be directly retrieved as files, but to do anything more complex the typed DAL interfaces must be used.

- Doug

Topic revision: r8 - 2011-03-15 - FrancoisBonnarel