EPN-TAP Proposed Recommendation: Request for Comments

EPN-TAP is a protocol used to describe and access data related to the study of the Solar System.

This document defines the EPN-TAP framework, which uses TAP with the EPNcore metadata dictionary. The EPNcore metadata dictionary defines the core components that are necessary to perform data discovery in Solar System related science fields. It includes keywords to describe data product coverage (temporal, spectral, spatial, photometric), origin (instrument, facility), content (target, physical parameters), access, references, etc. Its implementation with TAP (Table Access Protocol) is presented, including service registration guidelines. Topical extension metadata dictionaries are also presented.

Latest version of EPN-TAP can be found at:

Reference Interoperable Implementations

(Indicate here the links to at least two Reference Interoperable Implementations)

  • All EPN-Core tables on the VOPARIS-TAP-MASER TAP server are compliant.
http://voparis-tap-maser.obspm.fr/__system__/adql/query/form

=> select * from cassini_rpws.epn_core

=> select * from expres.epn_core

=> select * from voyager_pra.epn_core

=> select * from wind_waves.epn_core

...

  • The SPICAM service at LATMOS is compliant:
http://vo.projet.latmos.ipsl.fr/__system__/adql/query/form

=> select * from spicam.epn_core

  • The MPC service at Heidelberg is compliant:
http://dc.zah.uni-heidelberg.de/__system__/adql/query/form

=> select * from mpc.epn_core

Implementations Validators

taplint from STILTS versions >=3.4-2 checks all column metadata and content constraints:

   java -jar stilts.jar taplint tapurl=http://dc.g-vo.org/tap stages='tme epn'


Comments from the IVOA Community during RFC/TCG review period: 2021-09-30 - 2021-11-14

The comments from the TCG members during the RFC/TCG review should be included in the next section.

In order to add a comment to the document, please edit this page and add your comment to the list below in the format used for the example (include your Wiki Name so that authors can contact you for further information). When the author(s) of the document have considered the comment, they will provide a response after the comment.

Additional discussion about any of the comments or responses can be conducted on the WG mailing list. However, please be sure to enter your initial comments here for full consideration in any future revisions of this document.


Section: Role within the VO Architecture (1.3)

  • the TAP reference is to TAP-1.0; maybe update that to TAP-1.1?
  • include in diagram and reference to DALI (latest) since a lot of the input/output syntax for TAP comes from there and you'll need to refer to DALI directly
  • less necessary: VOSI could be added to the diagram/ref, although the transitive dependency via DALI and TAP is probably OK
Section: EPNcore Table (3)

The Type column in the table should be TAP data types: anything "Text" should be datatype "char" arraysize "*" (take care to allow finite arraysize without implementors and validators taking it literally), "Double" or "Float" should be "double" or "float" (exact TAP/VOTable datatype).

  • No, I disagree here. We had these exact types in previous specs, and they only caused all kinds of validation trouble for no operational gain at all. No client will bother whether a value is float or double, and individual operators may have good reasons for choosing one or the other. Similarly, there is absolutely no operational problem in choosing char(40) over text in a specific table, and the spec has no reason to outlaw this. Even more importantly, we should not close the door to pushing out unicodeChars or something equivalent on non-VOTable output. By the way, this minimal specification style is also used by RegTAP, ObsLocTAP, and hopefully anything else that does this kind of thing. -- MarkusDemleitner - 2021-11-09
  • I'm with Markus. However I have suggested an edit (EPNTAP PR#18) to clarify the meaning. (Parenthetically: this issue comes up in many documents, it would be nice to have some standard language to say "it's a string" e.g. endorsed by DALI) -- MarkTaylor - 2021-11-10
For s_region, you should use "polygon" from DALI-1.1 (not spoly) if that is specifically what you intend; there is also "circle" in DALI-1.1. Also, note that WD-DALI-1.2 defines "shape" which is polymorphic so you might want to specify s_region to be "any spatial extent described in DALI" to allow implementations to use "shape" in future. I think that's what "Note 3" is trying to say, but it's not clear. I would caution, however, that polymorphic types like "shape" in the model/API (tap_schema) are harder to implement so I would only do that if there are real use cases that require something other than "polygon". If you find the definition of polygon in DALI too restrictive (e.g. with respect to required reference frames) this is an excellent time to bring it up in DALI so we can find a solution to support non-ICRS frames.

Also in s_region, the description mentions "celestial of body-related frames". Is that in reference to the spatial_frame_type column a little farther down? If so, I would cross-reference in both descriptions and put those two next to each other so the relationship is as clear as possible. (I purposely have not read the body text yet so this could be explained there, but so many people will skip to this descriptive table that it needs to be as robust as possible on its own).

For the "shape" column that seems to be a prototype st-moc, could you use a different column name there? Maybe: space_time_coverage? space_time_cov? st_coverage? With DALI-1.2 introducing "shape" as an xtype this has the potential to be confusing. Also, WD-DALI-1.2 introduces xtype="moc" for the MOC-1.4 ascii spatial moc so it will also specify an xtype for ST-MOC when that becomes a standard. It would be good to specify this column so that could be adopted later. Spatial moc in VOTable are datatype="char" arraysize="*" xtype="moc" and I would expect ST-MOCs to be datatype="char" arraysize="*" xtype="stmoc", so having people use datatype="char" arraysize="*" now (which is what you mean by "Text" there) and add the xtype later (just as metadata in tap_schema) looks like it will work out fine.

-- PatrickDowler - 2021-11-01


Comments by Markus Demleitner

(a) Now that Tom has pointed it out, I agree the use of word "parameter" in this standard is, well, nonstandard. Since it's probably too late to change from "parameter" to "column" throughout without almost certainly getting things wrong, I guess the best we can do is an info box (in an admonition environment) explaining the language. Poke me and I'll try a PR, but then I'd be grateful for input on just how "parameter" came to mean "column" in this standard.

(b) p. 7 I'd drop "As a general rule, a parameter can appear only once in a service table." I think what you're saying here is "Within a SQL table, no column name can occur twice", and that's indeed a general rule, but as a reader I'm wondering why it is stated here (and not, say, the unfortunate case-folding rules of SQL, which probably will lead to quite a bit more confusion). If it's supposed to mean something else, then that thing needs to be stated more clearly.

(c) p. 11, measurement_type -- I'd prefer if the whole paragraph "The UCD1+ list from IVOA ... Request for Modifications" were dropped -- I don't think it's particularly helpful (I'm sure nobody will put in ancient UCD1 here), and it looks a bit odd in a normative document.

(d) p. 11, processing_level -- "this one should be used to meet the expectations of historical users" -- I'm not sure what "historical users" is intended to mean here, and I can't quite make out what the whole sentence says. Could you clarify?

(e) In the processing level table on p. 12, "Partially Calibrated" is translated as "2?" as the EPN code. I think that's an artefact of editing, since processing_level is an integer and hence there's no way to represent "2?". Could you clarify this?

(f) p. 13 "Data providers must be aware that services which do not use the IAU designations might not be accessible by the clients." -- this is supposed to say "that services which do not use IAU designations will not return the rows for that target to users querying by IAU target name", right? I'd rather see something like this rather than talking about "accessible", which in this case would, I claim, rather be understood as "users will be locked out" rather than "users won't get what they ought to get". Similarly, it shouldn't be "services containing... might not be visible", but in this case, just dropping the "services containing" would be essentially enough: "...that data of interest might not be retrieved if they do not use the recommended...".

(g) p. 14, the sample queries: Since you're (rightly) advising against using LIKE on target_class, I'd say you shouldn't show it in an example; just demonstrate the correct usage and be done with it (even more so since this isn't even supposed to demonstrate target_class). While we're talking about style: I'm always in favour of weaning people off "SELECT *". So, in case there's a plausible case here that would suggest only retrieving a few columns, writing these out would, I'd argue, improve the educational value of the example.

(h) p. 16, the time-constraining queries: Please remove the quotes. While SQL is weakly-typed enough that time_min > '2455197.5' will work as expected, there's a lot of luck involved, and we should educate our users to think of types and their literals. Hence, please make it time_min > 2455197.5.
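
The Julian Date used in this comment can be checked, and turned into an unquoted numeric literal, with a short script. This is only a minimal sketch and not part of the specification: the helper names to_julian_date and time_min_constraint are hypothetical, the timestamp is assumed to be in UTC, and leap seconds are ignored.

```python
from datetime import datetime, timezone

# Julian Date of the Unix epoch, 1970-01-01T00:00:00 UTC.
UNIX_EPOCH_JD = 2440587.5
SECONDS_PER_DAY = 86400.0


def to_julian_date(moment: datetime) -> float:
    """Convert a timezone-aware datetime to a Julian Date (leap seconds ignored)."""
    return UNIX_EPOCH_JD + moment.timestamp() / SECONDS_PER_DAY


def time_min_constraint(moment: datetime) -> str:
    """Build an ADQL condition comparing time_min with a numeric literal."""
    # The float is formatted without quotes, so the database sees a number.
    return f"time_min > {to_julian_date(moment)!r}"


start = datetime(2010, 1, 1, tzinfo=timezone.utc)
print(time_min_constraint(start))  # prints: time_min > 2455197.5
```

Generating the literal from a typed value, rather than pasting a quoted string, keeps the comparison numeric on every back end.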

(i) p. 17, the "Conversions" block in spectral_range: This seems like a leftover of some recipe to me. Can we drop it? If what you really want to say is "the max of the spectral range in wavelength is the min of the spectral range in frequency", I'd suggest to say that in prose; it's odd to see a spectral_range_max_micron come out of nowhere.

(j) For spectral_resolution, I'd again like to drop the "Conversions" part. See remark (i).

(k) p. 20 "use ``none'' if not applicable, although ``body'' is found in older services and may be OK" -- nah, please be clear on what you want. I'd say "body" with all-NULL ci columns wouldn't hurt, but perhaps a magic value for "no coordinates here" will make a few queries faster. Whatever it is, please make up your mind, or we'll end up with a combination of the disadvantages of both approaches. And if we want a NULL value, let's make it NULL -- this will then need to be fixed in the DaCHS mixin, sure, but that's a minor problem.

(l) p. 20 "Current IAU frame is assumed" -- if this is supposed to mean: "assume whatever IAU frame is current at your time", you should explicitly state that (and I think that's a good idea even though that will retroactively change the semantics of existing tables as the "current IAU frame" evolves -- it's not unlikely that such changes will tend to improve existing services overall). If, on the other hand, you mean "IAU frame at the time of writing", you should say that and give a reference to where that is defined. Disclaimer: I admit that I only have a vague idea what "IAU frame" might refer to here.

(m) p. 20, the cartesian frame: I always wonder if the world agrees whether y is up and z is perpendicular to the plane or it's the other way round. If the world, as it seems to me, does not agree, it might be a good idea to state some preference here -- or state that we don't care.

(n) p. 22, the Implementation notes to s_region: these are actually talking about pgsphere, and hence I don't think this should be in EPN-TAP. Let's drop it here and rather expand tutorial coverage on how to pull in the various geometries and how to prep the database for the various possibilities.

(o) p. 22 "An observatory list is being developed in Europlanet-2024/VESPA and will be implemented in a resolver by merging existing lists (draft here: Observatory Facility Database)" -- I'm not a big fan of promising things in REC-s; most of the time things come out differently, and then these things are like sore thumbs in the standards. In this case, that reference doesn't actually help anyone to pick names that they may be able to keep. Can we drop that sentence and write something like "Regrettably, there is no consensus nomenclature for instruments and their hosts yet. EPN-TAP data providers are advised to pick their instrument identifiers from one of the following lists for the time being:" I'd suggest similar text in instrument_name on p.23.

(p) p. 23 "care must therefore be taken to use a title not already ascribed to another EPN-TAP service" -- well, if we want this, we have to give people a chance to figure out whether a name they want is already taken. There are a few options for that, among others:

  • Prescribe the schema name as the service title and then tell people to select schema_name from rr.res_schema natural join rr.res_table where table_utype=(the epn-core id).
  • Use the table/title element in VODataService and tell people to select table_title from rr.res_table where table_utype=(the epn-core id).
  • Keep things as they are and give a sample pyvo programme that illustrates how to get a list of existing service titles (that's a four-liner, so it's not that bad).

I don't have major preferences either way, but I insist that if we have uniqueness requirements we give a plausible way for people to somewhat fulfill them.
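
The pyvo option mentioned above could look roughly like the sketch below. It is an illustration under several assumptions, not a recommended implementation: it relies on the RegTAP tables rr.resource and rr.res_table, the registry endpoint http://reg.g-vo.org/tap is just one possible choice, both function names are hypothetical, and the EPN-core table utype is deliberately left as an argument because the comment does not spell it out. The network call needs the third-party pyvo package and is kept in a separate function.

```python
def build_title_query(epncore_utype: str) -> str:
    """Return an ADQL query listing titles of resources that have an EPN-core table."""
    # Double any single quote so the value is a valid ADQL string literal.
    literal = epncore_utype.replace("'", "''")
    return (
        "SELECT DISTINCT res_title "
        "FROM rr.resource NATURAL JOIN rr.res_table "
        f"WHERE table_utype = '{literal}'"
    )


def list_service_titles(epncore_utype: str,
                        registry_url: str = "http://reg.g-vo.org/tap") -> list:
    """Run the query on a RegTAP service (needs pyvo and network access)."""
    import pyvo  # imported lazily so the query builder works without pyvo

    service = pyvo.dal.TAPService(registry_url)
    results = service.run_sync(build_title_query(epncore_utype))
    return sorted(str(row["res_title"]) for row in results)


# The utype below is a placeholder, not the real EPN-core identifier.
print(build_title_query("ivo://example/epn-core-id"))
```

A provider would call list_service_titles with the actual EPN-core utype and check that the title they intend to use is not in the returned list.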

(q) p. 24 "and must provide an actual and unique link" -- I'd suggest to drop this. The "unique" part could be misunderstood as a requirement that the same URI can't be in a table multiple times (where I'm not sure why we should forbid that, and I certainly don't want to have to validate against such a requirement) or even that the same URI can't be in two different EPN-core tables. As to the "actual": what is that supposed to mean? Retrievable as in "gives a 200 HTTP response"? Do we really want to bind a service's validity to whether all of the data products it lists are actually retrievable? I have similar concerns on thumbnail_url on p. 25 ("and must provide an actual link").

(r) p. 24 "The link may be a call to a web service (e.g., PlanetServer _CRISM service) or the output of a script (e.g., Titan atmospheric profiles service); in both cases the link must include the adequate arguments." Another sentence I'd drop: you couldn't keep people from doing this if you wanted to, and I don't think there's any need to remind them they can do that.

(s) p. 24 "Datalinks [...] can be used for this later purpose, and must be provided via the datalink_url parameter." -- what's the rationale of outlawing datalinks as products? You see, if I have datasets that are a few gigabytes big, I usually point people to the datalink services, and I expect clients to do the right thing with them at some point. Sending a few gigabytes down the pipe to some unsuspecting client to me looks like a less desirable alternative to the occasional "I can't read this".

(t) p. 25 "Other parameters may be used..." -- I'm not terribly happy about pseudo-document structure half-implied through boldfaced normal paragraphs. The "other parameters" probably do not belong to "Data Access Reference", as its introduction makes "these 3 parameters" mandatory in some sense. So, can't this "Other parameters" thing be replaced by a \subsubsection{Miscellaneous File Metadata} or something like that?

(u) p. 25 "accref" -- I'd drop the entire paragraph; this is an implementation detail of the specific way DaCHS handles products and has no impact on other implementations of EPN-TAP, let alone the clients. In the unlikely case a DaCHS operator accidentally uses accref, it's up to DaCHS to do something sensible (which at this point it probably won't do, but we can't fix that from the EPN-TAP spec).

(v) p. 26 datalink_url -- this mentions dlmeta and dlget; these are again implementation details of DaCHS and should not be here. The text I think works reasonably well if you just take out the parentheses. I'd also take out the material starting with "Links can be parameterized" -- how people build the links into their datalink service is up to them (and, really, if it's a datalink service, there's going to be a parameter, ID).

(w) p. 26 datalink_url -- I think this should also briefly mention what the relationship of this column to the normal Datalink techniques for creating datalink is, perhaps: "This column allows operators to include pre-produced links to a Datalink service that are visible to clients not evaluating datalink service blocks \citep{2015ivoa.spec.0617D}. There are no requirements as to the relationship between URLs in datalink_url and links clients can generate from Datalink service RESOURCE-s." Or so.

(x) p. 26 -- I'd drop everything in the bib_reference paragraph after "A generic regular expression" on grounds that implementors of the standard probably have not much use for that, and ADS may find reason to change that pattern sooner or later -- we don't want to have to issue an erratum just for that. Instead, I'd say "References are best provided as bibcode\footnote{\url{http://adsabs.github.io/help/actions/bibcode}}, DOI, or arXiv identifier\footnote{\url{https://arxiv.org/help/arxiv_identifier}}, although other forms are acceptable."

(y) p. 26 To avoid a semi-recursive definition, I'd say publisher should be defined as "Specifies the operator of the data service, which may be different from the origin of the data."

(z) p. 28 "The same sources are used for the declaration file in the registry." -- what do you want to convey there? Is it something like "Values given here should also be given in the subject keywords of the service's registry record"?

(aa) p. 30 target_time says "Values in ISO 8601 format.", which you can't, because you're giving a SQL table schema. You could say "target_time is given as a TIMESTAMP", but before you do that: is there really a strong case why this can't use JD as well? You see, ADQL even in 2.1 will have virtually no functions for doing arithmetic with timestamps, so that there isn't even a standard way to say "target_time within a day of some event I get from somewhere else in the database" (you can say "target_time between '2021-10-20' and '2021-10-31'" or so, but that only works because you're writing the literals yourself). So, I claim you'd be doing everyone a favour if you used JD here, too.

(ab) p. 32 "In this case, the target_distance_min/max parameters can provide distances..." -- there's a TBC left over here, but more importantly: I'm really against making the meaning of some column dependent on the presence of other columns. That's really asking for trouble (full disclosure: it's also a nightmare in implementation -- how should I come up with the description of the target_distance columns with such a regulation?)

(ac) p. 33 the dynamical_class, dynamical_type, and taxonomy_code columns seem a bit unfinished and/or underspecified. Perhaps you ought to take them out of the current spec and let them mature outside of the REC? This would also save a Wiki reference, which in my view would be a big win. Also in the solar system objects extension, there's the reference to the orbital parameters Wiki page with a simple "see"; I'd be a lot more relaxed if the spec said just what exactly implementors should see this page for: normative definitions (yieech!)? friendly advice? a discussion of possible problems?

(ad) p. 34 "map_height, map_width" -- the description here seems overly terse, and I cannot say I could confidently interpret the "/WCS" there. Is it supposed to mean "in world coordinates"? Since that would make the metadata a lot more complicated (and there's pixelscale, too), I'd say this should simply be "These parameters give the number of pixels along the two axes of a raster map. This specification does not constrain the orientation of such maps, and hence 'width' and 'height' do not correspond to physical directions."

(ae) p. 33 Section 2.3.4 "Contributive Works" again seems to be in a shape not yet ready for REC (starting with a missing explanation of PVOL). Perhaps this could mature outside of this REC, too?

(af) p. 34 "For dimensionless data (e.g., reflectance), units="" (VOunit standard)" -- I'd leave that out, as it is quite a bit beyond the purview of EPN-TAP. Leave this kind of thing to the SDM.

(ag) p. 34 "producer_name, producer_institute provide reference to who measured the sample (from Contributive Works extension)". Well: Contributive works declares the two columns as "from other extensions", so it's a bit of an empty reference. But since I'd drop CW for now, why not just say "Name(s) and Affiliation(s) of the person(s) that created the data as free text" (or be specific and require a single name and a controlled affiliation if you want to make that searchable).

(ah) p. 34 "this is in practice the only way to ensure that we get all results" -- for me, that is a bit inappropriate for a spec (who is the "we"?). Moreover, if you make this hash-separated, you should probably recommend the use of ivo_hashlist_has rather than trying hand-crafted patterns, as these are not easy to get right.
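
Why hand-crafted patterns are hard to get right can be illustrated outside the database. The sketch below is a plain-Python approximation of the matching that RegTAP describes for ivo_hashlist_has (a case-insensitive comparison against the hash-separated items). It is not the server-side function itself, and the helper names and sample values are hypothetical.

```python
def hashlist_has(hashlist: str, item: str) -> int:
    """Return 1 if item is one of the '#'-separated words in hashlist, else 0.

    The comparison is case-insensitive, mirroring the behaviour RegTAP
    describes for ivo_hashlist_has.
    """
    words = [word.lower() for word in hashlist.split("#")]
    return 1 if item.lower() in words else 0


def naive_like(hashlist: str, item: str) -> bool:
    """Mimic a hand-written LIKE '%item%' pattern (plain substring match)."""
    return item in hashlist


sample = "olivine#orthopyroxene"
# The substring pattern reports a false positive; the item-wise match does not.
print(hashlist_has(sample, "pyroxene"), naive_like(sample, "pyroxene"))  # prints: 0 True
```

The substring match finds "pyroxene" inside "orthopyroxene", which is exactly the kind of false positive an item-wise comparison avoids.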

(ai) p. 34 Is there a reference for "Dana or Struntz classification tags"?

(aj) p. 34 "Minor/trace components are not welcome here (would multiply false alarms)." -- again, that's not really spec language. It'll look much better if it says "To reduce the likelihood of false positives, minor and trace components should not be included in sample_classification". Oh, and an example for how values in that column should look like would be really welcome.

(ak) p. 34 "see if we define a code for", "see if negative values of angles can have a" -- please remove these clearly internal annotations; you could perhaps get away with "negative values are reserved for now". Incidentally, floating-point values have NaN, which mostly works fine for NULL or N/A.

(al) p. 35 "free string or hash-separated list" in several places -- that phrase, which appears here multiple times, does not make a lot of sense. On the one hand, hash-separated lists are a subset of free strings, so from a provider perspective you're not granting additional rights. But you don't buy anything for consumers, either, as they cannot query under the assumption that there are items separated by hashes in there. I'd say: Just make it "free text" and don't pretend people can do more than free text searches in there.

(am) p. 36, "species is also available, see if usage is desirable (absence of a species..." -- again, this sort of stage directions should go from the REC, as should, I think, everything after "Special cases to handle" in the section.

(an) p. 37 APIS extension: Either pull the material from the Wiki page into the spec or take out the APIS section from 2.0 to let it mature outside of the REC; we don't want normative material on IVOA-external wiki pages.

(ao) p. 37 "The following parameters are proposed to handle VOevents (will comply to a future standard when ready)" is again not really appropriate language for a standard. If you want to keep the extension (rather than letting it mature outside of the REC), then this should probably be something like "VOEvents \citep{2006ivoa.spec.1101S} can be archived in EPN-TAP tables. The following extra parameters are available to complete their metadata:" or so.

(ap) p. 38 "instrument_host_name, instrument_name are expected to be “simulation”... " -- this will be less shocking if you start the sentence with "For VOEvents that are actually predictions, use...".

(ar) p. 45: "c1: only body and celestial are angular;" ummm: why not spherical?

(as) In terms of style, I'd prefer if there were commas throughout after "e.g.", and in the few places where that's still an issue, "The data provider must select the type most adapted to his data." would be re-written as "Data providers must select the type most adapted to their data." or so.

-- MarkusDemleitner - 2021-11-11


Comments from TCG member during the RFC/TCG Review Period: 2021-09-30 - 2021-11-14

WG chairs or vice chairs must read the Document, provide comments if any (including on topics not directly linked to the Group matters) or indicate that they have no comment.

IG chairs or vice chairs are also encouraged to do the same, although their inputs are not compulsory.

TCG Chair & Vice Chair

Applications Working Group

In general the document seems clear, and it seems like it would be straightforward both to implement services and to query the services and consume results. I trust the interest group experts on the domain-specific details of the parameters/columns so I gave those only a cursory review. I'll echo the comment from Operations that the implementations need to pass validation.

I have the following fairly minor editorial comments and questions. Page numbers refer to those in the pdf:

  • Since this document is explicitly defining TAP services, I found the use of the word "parameter" confusing. Maybe add a sentence mentioning how "parameter" relates to the TAP notion of "column", either that they are effectively equivalent, or in what ways they differ?
    • To a lesser extent, the notion of keyword also seems essentially equivalent but also has just an implicit relationship to the "parameter" and "column", although "keyword" is used in a different way on p. 42 in the description of target_description.
    • For the EPNcore Table, the header for these parameters/columns/keywords is just "Name". Maybe a header of "Parameter Name" or "Column Name" would tie things together just a little more explicitly.

  • p. 5, paragraph 2 (in section 1.1) was unclear to me.
    • Should "consists in" be "consists of"?
    • The first sentence seems to say there will be one table per service, and the second seems to say there can be multiple. Can this be clarified?

  • p. 5, paragraph 3 (in section 1.1): Because of the spacing in the formatting, I might replace "/" with "and" in "A unique query can then be sent to / answered".

  • p. 5, paragraph 4 (in section 1.1): I don't know what this means: "Any optional parameter can be used whenever required."

  • p. 6 (in section 1.2), regarding "Some parameters can be multivalued...", it's implicitly clear due to the VOTable spec that this refers only to string-type parameters, but it might be worth making that explicit to avoid confusion if numeric arrays are ever included.

  • For the Example TAP queries on pp. 11, 14, 16, 18, 47:
    • The single quotes as rendered in the pdf (but not the html) are the pretty kind that the ADQL parsers don't like. If there is a way to force the simple ASCII single quotes there, it would make cutting and pasting a little more useful.
    • I realize that it would be difficult to maintain a live service that responds nicely to the example queries for the life of the document, but even now it was not obvious to me where to find a service where they would run. I might have missed it, but I didn't see the "pipo.epn_core" table in any of the reference implementations. Would it be reasonable to say something like, "The example queries are for illustration purposes only. The exact table names and contents will vary depending on the service. At the time of writing, these queries will return results on the EPN TAP service at <one of the reference implementations>." Having runnable queries now would help validate that there aren't typos in the document's examples and could possibly help other early implementers. Including similar examples as service-provided examples on a reference implementation could also be helpful.

  • Appendix A says "No previous version yet." Since the Introduction mentions a previous version that was limited to test purposes, would it make sense to say something similar in Appendix A?
-- TomDonaldson - 2021-11-02
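
The typographic-quote problem raised above can be worked around on the reader's side before submitting a query copied from the PDF. A minimal sketch, assuming only the common curly single and double quotes need replacing; the function name asciify_quotes is hypothetical:

```python
# Map typographic quotes to the ASCII characters ADQL parsers expect.
QUOTE_MAP = str.maketrans({
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
})


def asciify_quotes(query: str) -> str:
    """Replace curly quotes in a pasted query with plain ASCII quotes."""
    return query.translate(QUOTE_MAP)


pasted = "SELECT * FROM pipo.epn_core WHERE dataproduct_type LIKE \u2019%im%\u2019"
print(asciify_quotes(pasted))
```

This only helps individual users; emitting plain ASCII quotes in the PDF itself remains the cleaner fix.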

Data Access Layer Working Group

The EPNTAP standard is comprehensive and well presented. Its value to the solar system community is obvious given its strong uptake already. However sections 2.3.2 and 2.3.5 are incomplete and not suitable for a REC at this stage. These sections need to be completed or postponed to a later version before we could approve it.

1.1

  • Granule and parameter should be bolded, or preferably dot points when first introduced as they are key concepts. I share others' concerns that column or field would be more precise and more common usage than parameter
  • 5 The two levels described are unclear - these would be better as dot points or enumerated
  • PDS needs to be defined
  • 7 Section 4 should be mentioned
  • ObsCore standard should be cited
2
  • Sub-sections of 2 have inconsistent capitalisation
  • When parameter names are referred to in the text it would be useful to have them italicised, as is done in ObsCore.
  • Is the comment about this section being generated relevant anymore? What about the "To ignore the following section" that appears before many sections?
  • In general, in earlier sections multiple parameters which are described as one (e.g. c1, c2, c3) are listed in separate paragraphs with the earlier ones missing descriptions, but in later sections they are provided as a comma separated list in the one paragraph (e.g. map_height, map_width) - it would be better to have a single convention. I find the comma separated list less confusing
2.1.1
  • Granule_uid - in 2.2.2 internal_reference, there is a statement that granule_uids can never contain the # character. That statement should be reflected in this section too.
2.1.2
  • Should the dataproduct_type values/IDs be a vocabulary?
  • The processing_level table should have a table number and be referred to from the processing_level paragraphs
  • PSA and PDS4 are not defined
2.1.4
  • c3 (min/max) Example - should c1max be c1_max etc? The _min and _max suffixes are introduced in section 1.2 and should be consistently used throughout the standard. If this cannot be done then this section should mention this exception.
  • spatial_frame_type - cartesian - "This includes spatial coordinates given in km" Includes implies that there are other values accepted. If so then they should be listed, if not then reword as "Spatial coordinates given in km"
  • incidence (min/max) - does ellipsoid model need to be defined?
  • phase (min/max) - phase may range from -180 to 180 but the formula given shows a lower bound of |incidence - emergence| which would imply it must be positive.
2.1.5
  • instrument_name - The example is given in all upper case - does this parameter need to have a statement about case usage?
2.2

  • It might be helpful to have a statement about what optional means in this sense - can publishers not include the fields in their table, or must they be present but may be empty?
2.2.1
  • file_name - does file_name need to be unique for the FGI to work?
  • access_md5 - The (link to a real file) seems redundant
2.2.4
  • solar_longitude - is northern winter solstice -90 or 270 degrees? It is currently shown as +90
2.3.2
  • This section has a lot of editorial comments, TBCs etc - it is not in a REC-suitable state - it needs to be either removed for a later revision or completed
2.3.5
  • This section has a lot of editorial comments, TBCs etc - it is not in a REC-suitable state - it needs to be either removed for a later revision or completed
  • spectrum_type - the wording is awkward. I read "this page" to be referring to the document until I got to the end of the sentence. Suggest something like "A list of possible values is provided on the Lab spectroscopy extension[1] page, together with corresponding UCDs (which appear both under measurement_type and in the data files)."
2.3.7
  • What does "will comply with a future standard when ready" mean? Is there anything not supported by VOEvent v2.0?
  • event_type: Where will the list of values be maintained?
2.3.8
  • Where will these other extensions be maintained? How will they be published? How would a provider know they exist?
3
  • The UCD pos.projection doesn't appear to be in UCDList 1.1 - is it intended to be a new one?
  • The order of parameters later in the table doesn't match the order they are presented in the text
4.1

  • Is the text "(from center)?" in the c3min and c3max rows of the body column intended?
-- JamesDempsey - 2021-12-16

Data Model Working Group

The document tries to integrate solar system data into the VO context so it is an important one for completeness of data. There are some questions and editorial comments that I would like to share.

Some comments below:

  • The use of the term “parameter” in the document is confusing (at least to me); this has already been raised by others, but I have to agree. In general, we use terms such as "column" or "field" for the metadata and reserve "parameter" for the query interface. I do not know how complex a substitution would be, but a more common term would help.
  • The "role within the architecture" image is not in the preamble, as it is in most (all?) IVOA documents, and EPNTAP does not appear in it
  • About the terms dataset and data product: I do not see a problem with the more agnostic term "granule", but I miss more connections to the PDS data model, as much of the planetary data I know is in PDS3/PDS4 format. It is nice to see that a table was added to compare processing levels, but I miss more translations (whenever possible) to other metadata fields. That would be very useful for data providers mapping their metadata to the EPN-TAP one
  • Related to the previous point, it would be nice to see the level of integration with NASA services and planetary data analysis tools. What is the state at this point?

  • Section 1.3 -> the UCD reference has not been properly generated
  • In the example SELECT * FROM pipo.epn_core WHERE dataproduct_type LIKE ’%im%’: why is LIKE needed rather than an exact =? It is not clear to me whether the value is exactly “im” or “image”. In the first case, equality could be better for performance. In the second, queries with LIKE `%sp%` are ambiguous (I imagine it is the first case)

  • Processing level table is overflowing the page and without a title/reference
  • In this table, I would remove the question mark from "2?". If this is the decision, the standard must be assertive
  • In section target_name, the freedom to use different formats for the target name (e.g. "(1) Ceres" or "Ceres") makes the system more complex (I am not sure whether the LIKE operator with ‘%name%’ could produce false positives due to substring matches). If the client is going to use a name resolver like quaero that provides the main IAU name, it could be simpler to ask servers to carry this main identifier. (I imagine there is a lot of historical background behind allowing a more flexible approach, but standardization is a good opportunity to remove this ambiguity.)
  • Resolving on the server could be more complex, but it is feasible with an ADQL name resolver function (e.g. target_name(string), returning the main identifier) that relies on another service behind it (this was tested in astronomy to allow ADQL queries by target name that call SIMBAD behind the scenes). In summary, I think that allowing only a main IAU id in queries could make things simpler (and it is not so complex to create this synthetic column with the main identifier on the server side)
  • Section target_class: not clear to me what to put for the mentioned planetary_rings. Null?
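
The substring concern raised for target_name can be illustrated with a small sketch. The list of names is a made-up sample (Io and Iocaste are both moons of Jupiter), and the helper functions are hypothetical stand-ins for a case-sensitive LIKE '%name%' filter and an exact = comparison; no real service is queried.

```python
# Made-up sample of target_name values.
names = ["Io", "Iocaste", "(1) Ceres", "Ceres"]

def like_contains(values, fragment):
    """Mimic a case-sensitive LIKE '%fragment%' filter."""
    return [v for v in values if fragment in v]

def exact(values, name):
    """Mimic target_name = 'name'."""
    return [v for v in values if v == name]

print(like_contains(names, "Io"))     # ['Io', 'Iocaste']
print(exact(names, "Io"))             # ['Io']
print(like_contains(names, "Ceres"))  # ['(1) Ceres', 'Ceres']
print(exact(names, "Ceres"))          # ['Ceres']
```

The substring filter returns an unrelated body for "Io", while the exact comparison misses the "(1) Ceres" spelling; a single main identifier per target would avoid both effects.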

  • Time(min/max): Just an opinion on this but in the example
  • select * from pipo.epn_core where time_min > ’2455197.5’ and time_max < ’2455927.5’
  • the example is a "contains" although, in my view, for discovery it could be better to write an overlaps:
  • select * from pipo.epn_core where time_min < ’2455927.5’ and time_max > ’2455197.5’
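
The difference between the two queries can be checked with a small sketch. It is only an illustration: the helper names and the sample granule are hypothetical, and the bounds are the Julian dates quoted in the comment.

```python
# Query window: the Julian dates quoted in the comment above.
Q_MIN, Q_MAX = 2455197.5, 2455927.5

def contained(time_min, time_max, q_min=Q_MIN, q_max=Q_MAX):
    """time_min > q_min AND time_max < q_max: the granule lies entirely inside the window."""
    return time_min > q_min and time_max < q_max

def overlaps(time_min, time_max, q_min=Q_MIN, q_max=Q_MAX):
    """time_min < q_max AND time_max > q_min: the granule shares some time with the window."""
    return time_min < q_max and time_max > q_min

# Made-up granule that starts before the window and ends inside it.
granule = (2455100.0, 2455300.0)
print(contained(*granule))  # False
print(overlaps(*granule))   # True
```

A granule that only partly covers the requested period is dropped by the first form and found by the second, which is why the overlap form tends to suit discovery.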

  • spectral_range conversions: I would replace c*E6 with c*1E6, as the current notation looks misleading to me

  • spectral_sampling_step conversions
  • If operator * is used for multiplication I would substitute:
  • c*1E6 dlam by c*1E6 * dlam (Same for the other conversions)
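
To make the explicit-multiplication notation concrete, here is a minimal sketch assuming wavelengths in µm, frequencies in Hz and c in m/s. The function names are hypothetical, and the formulas are the standard relations ν = c/λ and dν = c·dλ/λ², not a copy of the conversion table in the document.

```python
C = 299792458.0  # speed of light in m/s

def frequency_hz(lam_um):
    """Frequency in Hz for a wavelength in micrometres: c*1E6 / lam."""
    return C * 1e6 / lam_um

def frequency_step_hz(lam_um, dlam_um):
    """Frequency step in Hz for a wavelength step in micrometres: c*1E6 * dlam / lam**2."""
    return C * 1e6 * dlam_um / lam_um**2

print(frequency_hz(1.0))             # 1 µm expressed in Hz
print(frequency_step_hz(1.0, 0.01))  # step for dlam = 0.01 µm at 1 µm
```

Every product is written with an explicit *, and the factor is spelled 1E6 rather than E6, so the expressions can be pasted into most languages unchanged.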

  • c1, c2, c3 Something that puzzles me is why for these parameters the min and max are not appearing on the field name with underscores (c1_min and not c1min). It is always like this in all the standard (so it is consistent) but it is not consistent with the rest of fields
  • spatial_frame_type: I would modify the sentence on “none” or “body” being the default. If the default value is “none”, the “body” default of existing services could be just a note (e.g. old services have “body” as a default value, but that could be deprecated at some point); it should not be in the text as a valid value, because that could perpetuate the discrepancy
  • s_region: I understand that different geometries are allowed for this value. However, it is not clear to me which system does the conversion: server or client? An explanation of how to handle conversions may be needed to facilitate implementations. Also, I would like to know how this service could be affected by ADQL 2.1, where the reference frame is disappearing from the geometrical operators. I think a comment on this would be quite useful for future implementors
  • 2.3.2 First paragraph. Remove TBC for the standard version. If this is the decision (although it may be accepted that it is not a perfect one), the standard should be assertive. That also applies to other TBCs in the text
  • map_height, map_width. Although it is in the table, I would put the pix units here for these fields
  • grain_size: Assign the -1 code if needed but I would remove the non-assertive comment
  • 2.3.6 APIS extension. The acronym is used here for the first time, but it is neither in the acronym section nor given any background here
  • EPNcore Table types: as commented by others, I would use a more standard way to assign data types (e.g. as in ObsLocTAP, where the same discussion arose and an agreement was reached)
  • Some formatting problems in the table for some fields like particle_spectral_sampling_step_Flmoaint
  • In the table I would try to prevent the blank empty page 44
  • Table 2: It looks strange that the spherical c1min/max is in meters when the rest of coordinates express the distances in km. I imagine there are historical reasons behind
-- JesusSalgado - 2021-12-10

Grid & Web Services Working Group

I agree that the document is well written and clear. I am not an expert in Solar System studies, so I cannot judge the details of the implementation. I agree that the term “parameter” is a bit confusing in this context. I see that the document still contains some of the typos/suggestions identified by the other reviewers; as soon as all of them are implemented/corrected, I will approve it.

-- IVOA.GiulianoTaffoni- 2022-01-22

Registry Working Group

Firstly, the registry-related sections look good to me. Thank you for both the example registry record AND the standard record in GitHub at https://github.com/ivoa-std/EPNTAP/blob/master/epntap.vor , which can be updated when this goes to REC and included in the RofR.

I share several commenters' confusion around the use of 'parameter' referring to column information. I can see how changing it throughout the text might be a hardship at this point, but more clarification within the document would be appreciated.

Pursuant to that, one of the Rules for parameters notes spaces are allowed internally in string parameters. Does this mean column content (I think it does), in which case I am fine with it, or column names, in which case we should note use of underscore as separator? This seems to be one of the areas that could use a little more clarification.

-- TheresaDower - 2021-11-12

Semantics Working Group

(a) p. 19, the planetary reference frames, and p. 28, spatial_coordinate_description: First, we would suggest to concentrate the discussion of frame and origin in one place (presumably 2.2.3) and then refer to that master place from the other place.

Then, for spatial_coordinate_description, we'd like to suggest that you at least mention http://www.ivoa.net/rdf/refframe (which is where ICRS would come from, I'd say). I give you that we probably should not include all the myriad reference systems on the Earth there (I seem to remember there are about 8000 of those). But frankly, I think exactly because of the large number of systems it would be valuable to have the "recommended" ones in refframe, and I'd be happy to work with you to include the ones already in use in EPN-TAP services. That way, I'd argue there is at least a bit of hope that clients have a chance to know a refframe in use in an EPN-TAP table, and that users know what they can compare against in their positional queries.

It would also be preferable if the SSIG didn't develop an altogether independent practice here, although it seems that your IAU20xx:49900 naming scheme (that I haven't researched) is already established, and that we certainly don't want to try and replace. Still, using refframe as a way to endorse certain frames from there for use in EPN-TAP where possible would, I think, make the whole thing quite a bit less unwieldy.

Anyway, both refframe and refposition with their hierarchies have plenty of space for solar system frames/refpos-es without clobbering the celestial frames.

I'm also fairly unhappy with having a wiki page referenced in a REC in a relatively central (and probably operationally relevant: I suppose spatial_coordinate_description will be constrained in essentially every positional query?) place. Can we pull that material somewhere more IVOA-controlled?

(b) p. 36 spectrum_type, "A list of possible values is provided on this page..." If you want this to be a normative word list, please turn it into a proper vocabulary. Tell us and we'll do it.

(c) p. 36 "measurement_type provides the type of measurement/scale as a UCD (REFF, I_over_F, etc...)" -- ummm... these are not UCDs, are they? Does this refer to UCDs that should still be created? If so, it would be good to have at least some RFMs in before this goes to REC, and the strings here should correspond to what you ask for there.

(d) In general, from a semantics perspective we would like to discourage the practice of overloading columns such that they have different UCDs in different "instances" of a model, as it happens here with the ci_* columns. I am fairly sure this will result in a lot of trouble down the road. A UCD is an expression that a column contains things of a certain sort ("elements of the concept"). Now, if you have relations that look as if you could build a union between them (because column names and types match) but really have different interpretations for what's behind the names -- which is what the differing UCDs say in the end --, you are inviting the creation of relations that are not well-defined (because for some of their elements, c1_min, say, means one thing and for others, some other thing).

For EPN-TAP, that ship has sailed a long time ago, so we'll have to see how things work out. For future standards, I'd suggest we try to avoid this. At least in the realm of database tables, I would hope that we are approaching a level of maturity where we can avoid this sort of thing by having multi-table models, and you'd have tables pos_sphere, pos_cyl, etc people can join when they want to do positional searches. But as I said: for EPN-TAP, I'm fine with trying it this way.

-- MarkusDemleitner - 2021-11-11

Data Curation & Preservation Interest Group

Education Interest Group

Knowledge Discovery Interest Group

Operations Interest Group

As one of the document authors, I am basically happy with the text. But before recommending acceptance, I would like to see one EPN-TAP service that passes all validation checks. -- MarkTaylor - 2021-10-04

Radio Astronomy Interest Group

  • Late comments from our Interest Group. Not required, but just in case you find them interesting, probably for a next version.
  • We have seen a typo in section 2.1.4 column time (min/max) -> "... the location where time is measured must be provided though the time_refposition parameter. " --> through
  • Figure 1 doesn't show EPN-TAP itself (or is it my rendering ?)
  • In section 2.1.2, the Notes of the processing_level table claim that the table is a compilation from PSA, PDS4 and ObsCore. It could be useful to say which is which.
  • in section 2.1 for "measurement_type" the "obs.image" UCD value seems inconsistent with the statement "only UCDs related to physical quantities" can be used. The image type is given by dataproduct_type and measurement_type should be a photometric quantity or "counts" or whatever ?
  • In radio astronomy, for radar planetary observations one telescope acts as the transmitter and another as the receiver, but only one instrument_host_name seems to be available. Is there a solution for that?
  • We imagine that for solar system radio astronomers everything makes sense, but for extra-SS astronomers the large number of columns may be difficult to grasp.
    • There are observations interesting for both SS and extra-SS astronomers (it's the same "sky" we share). That's why some common understanding could be interesting, especially in the radio domain.
    • The structure of section 2.1 (mandatory parameters) is clear enough. Section 2.2 is less so. We wonder whether parameters there should refer to the appropriate 2.1 subsection when it exists. Alternatively, section 2.2 could be organized in parallel to 2.1
    • At the beginning of 2.3 (extensions), an overall introduction to all these extensions and their rationale could be useful (maybe with a few examples).
    • A UML/VO-DML diagram for the implicit data model (one class per subsection in 2.1) could help clarity in a future version

  • It seems that some column names ("parameters") express concepts very similar to ObsCore ones, but the names are generally different. I imagine renaming the parameters would be destructive for SS applications. But in the future we could imagine using utypes, which are currently unused in EPNcore, to create more interoperability between ObsTAP/SIA and EPNTAP services. For example:
    • EPNcore "granule_id" is very close to ObsCore "obs_publisher_did"
    • " "measurement_type" " "o_ucd"
    • " "processing_level" " "calib_level"
    • " "instrument_host_name" " "facility_name"
    • " "creation_date" " "obs_creation_date"
    • " "release_date" " "obs_release_date"
    • " "publisher" " "publisher_id"
    • " "ra" " "s_ra"
    • " "dec" " "s_dec"
  • In the future dataproduct_type values should be made consistent with the list in ObsCore. Both should rely on the new definition of the dataproduct_type vocabulary adopted by the Semantics Working Group. For backward compatibility, both the two-letter symbols and the ObsCore names should be valid nomenclature
-- FrancoisBonnarel - 2021-12-06 -- MarkLacy - 2021-12-10
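
The name correspondences listed in the Radio Astronomy comment could be captured in a simple lookup, sketched below. This is only an illustration: the dictionary and function names are hypothetical, the last pair is read as dec / s_dec (an assumption), and the sketch renames columns only; it does not convert values between the two models.

```python
# Pairs listed in the comment above (EPNcore name -> ObsCore name).
# The last pair is read as dec -> s_dec, which is an assumption.
EPNCORE_TO_OBSCORE = {
    "granule_id": "obs_publisher_did",
    "measurement_type": "o_ucd",
    "processing_level": "calib_level",
    "instrument_host_name": "facility_name",
    "creation_date": "obs_creation_date",
    "release_date": "obs_release_date",
    "publisher": "publisher_id",
    "ra": "s_ra",
    "dec": "s_dec",
}

def rename_columns(row):
    """Return a copy of an EPNcore row (a dict) with ObsCore-like keys where a pair exists."""
    return {EPNCORE_TO_OBSCORE.get(key, key): value for key, value in row.items()}

# Made-up row; target_name has no listed counterpart and is kept unchanged.
row = {"granule_id": "g1", "ra": 10.5, "target_name": "Mars"}
print(rename_columns(row))
# {'obs_publisher_did': 'g1', 's_ra': 10.5, 'target_name': 'Mars'}
```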

Solar System Interest Group

Theory Interest Group

Time Domain Interest Group

Standards and Processes Committee


TCG Vote : 2021-09-30 - 2021-11-14

If you have minor comments (typos) on the last version of the document please indicate it in the Comments column of the table and post them in the TCG comments section above with the date.

Group     Yes  No   Abstain  Comments
TCG
Apps
DAL
DM
GWS       *
Registry
Semantics
DCP
Edu
KDIG
Ops       *
Radio
SSIG
Theory
TD
StdProc


Topic revision: r25 - 2022-01-22 - GiulianoTaffoni
 